[PATCH v4] powerpc/uprobes: Validation for prefixed instruction

2021-03-05 Thread Ravi Bangoria
As per ISA 3.1, a prefixed instruction should not cross a 64-byte
boundary, so don't allow a Uprobe on such a prefixed instruction.

There are two ways a probed instruction is changed in mapped pages.
First, when a Uprobe is activated, it searches for all the relevant
pages and replaces the instruction in them. In this case, if the probe
is on a prefixed instruction that crosses a 64-byte boundary, error
out directly. Second, when a Uprobe is already active and a user maps
a relevant page via mmap(), the instruction is replaced via the mmap()
code path. But because the Uprobe is invalid, the entire mmap()
operation cannot be stopped. In this case, just print an error and
continue.

Signed-off-by: Ravi Bangoria 
Acked-by: Naveen N. Rao 
---
v3: https://lore.kernel.org/r/20210304050529.59391-1-ravi.bango...@linux.ibm.com
v3->v4:
  - CONFIG_PPC64 check was not required, remove it.
  - Use SZ_ macros instead of hardcoded numbers.

 arch/powerpc/kernel/uprobes.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
index e8a63713e655..4cbfff6e94a3 100644
--- a/arch/powerpc/kernel/uprobes.c
+++ b/arch/powerpc/kernel/uprobes.c
@@ -41,6 +41,13 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe,
if (addr & 0x03)
return -EINVAL;
 
+   if (cpu_has_feature(CPU_FTR_ARCH_31) &&
+   ppc_inst_prefixed(auprobe->insn) &&
+   (addr & (SZ_64 - 4)) == SZ_64 - 4) {
+   pr_info_ratelimited("Cannot register a uprobe on 64 byte unaligned prefixed instruction\n");
+   return -EINVAL;
+   }
+
return 0;
 }
 
-- 
2.27.0
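A minimal illustration of the arithmetic behind the new check (a
sketch, not part of the patch, assuming only the 4-byte alignment
already enforced by the earlier (addr & 0x03) test):

	/* SZ_64 is 64, as in include/linux/sizes.h. */
	#define SZ_64 64

	/*
	 * True iff addr sits in the last word of a 64-byte block,
	 * i.e. addr % 64 == 0x3c, so the 4-byte suffix of an 8-byte
	 * prefixed instruction would spill into the next block.
	 */
	static bool crosses_64b_boundary(unsigned long addr)
	{
		return (addr & (SZ_64 - 4)) == SZ_64 - 4;
	}

For example, crosses_64b_boundary(0x1000003c) is true, while
crosses_64b_boundary(0x10000038) is false.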



Re: [PATCH v3] powerpc/uprobes: Validation for prefixed instruction

2021-03-04 Thread Ravi Bangoria




On 3/4/21 4:21 PM, Christophe Leroy wrote:



On 04/03/2021 at 11:13, Ravi Bangoria wrote:



On 3/4/21 1:02 PM, Christophe Leroy wrote:



On 04/03/2021 at 06:05, Ravi Bangoria wrote:

As per ISA 3.1, a prefixed instruction should not cross a 64-byte
boundary, so don't allow a Uprobe on such a prefixed instruction.

There are two ways a probed instruction is changed in mapped pages.
First, when a Uprobe is activated, it searches for all the relevant
pages and replaces the instruction in them. In this case, if the probe
is on a prefixed instruction that crosses a 64-byte boundary, error
out directly. Second, when a Uprobe is already active and a user maps
a relevant page via mmap(), the instruction is replaced via the mmap()
code path. But because the Uprobe is invalid, the entire mmap()
operation cannot be stopped. In this case, just print an error and
continue.

Signed-off-by: Ravi Bangoria 
---
v2: https://lore.kernel.org/r/20210204104703.273429-1-ravi.bango...@linux.ibm.com
v2->v3:
   - Drop the restriction for a Uprobe on the suffix of a prefixed
 instruction. It needs a lot of code change, including generic
 code, but what we get in return is not worth it.

  arch/powerpc/kernel/uprobes.c | 8 ++++++++
  1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
index e8a63713e655..c400971ebe70 100644
--- a/arch/powerpc/kernel/uprobes.c
+++ b/arch/powerpc/kernel/uprobes.c
@@ -41,6 +41,14 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe,
  if (addr & 0x03)
  return -EINVAL;
+    if (!IS_ENABLED(CONFIG_PPC64) || !cpu_has_feature(CPU_FTR_ARCH_31))


cpu_has_feature(CPU_FTR_ARCH_31) should return 'false' when CONFIG_PPC64 is not 
enabled, no need to double check.


Ok.

I'm going to drop CONFIG_PPC64 check because it's not really
required as I replied to Naveen. So, I'll keep CPU_FTR_ARCH_31
check as is.




+    return 0;
+
+    if (ppc_inst_prefixed(auprobe->insn) && (addr & 0x3F) == 0x3C) {


Maybe 3C instead of 3F ? : (addr & 0x3C) == 0x3C


Didn't follow. It's not (addr & 0x3C), it's (addr & 0x3F).


Sorry I meant 3c instead of 3f (And usually we don't use capital letters for 
that).
The last two bits are supposed to always be 0, so it doesn't really matter, I just 
thought it would look better having the same value both sides of the test, ie (addr 
& 0x3c) == 0x3c.


Ok yeah makes sense. Thanks.







What about

(addr & (SZ_64 - 4)) == SZ_64 - 4 to make it more explicit ?


Yes, this is a bit better. Though, it should be:

 (addr & (SZ_64 - 1)) == SZ_64 - 4


-1 or -4 should give the same results though, as instructions are
always 32-bit aligned.


Got it.

Ravi
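
To spell out the mask arithmetic the thread converges on: for a
4-byte-aligned addr whose prefix sits in the last word of a 64-byte
block, e.g. addr = 0x1000003c,

	addr & 0x3f        == 0x3c
	addr & 0x3c        == 0x3c
	addr & (SZ_64 - 1) == SZ_64 - 4		/* 0x3f mask */
	addr & (SZ_64 - 4) == SZ_64 - 4		/* 0x3c mask */

all hold; the variants agree only because the earlier (addr & 0x03)
check guarantees that the low two bits are zero.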


Re: [PATCH v3] powerpc/uprobes: Validation for prefixed instruction

2021-03-04 Thread Ravi Bangoria




On 3/4/21 1:02 PM, Christophe Leroy wrote:



On 04/03/2021 at 06:05, Ravi Bangoria wrote:

As per ISA 3.1, a prefixed instruction should not cross a 64-byte
boundary, so don't allow a Uprobe on such a prefixed instruction.

There are two ways a probed instruction is changed in mapped pages.
First, when a Uprobe is activated, it searches for all the relevant
pages and replaces the instruction in them. In this case, if the probe
is on a prefixed instruction that crosses a 64-byte boundary, error
out directly. Second, when a Uprobe is already active and a user maps
a relevant page via mmap(), the instruction is replaced via the mmap()
code path. But because the Uprobe is invalid, the entire mmap()
operation cannot be stopped. In this case, just print an error and
continue.

Signed-off-by: Ravi Bangoria 
---
v2: https://lore.kernel.org/r/20210204104703.273429-1-ravi.bango...@linux.ibm.com
v2->v3:
   - Drop the restriction for a Uprobe on the suffix of a prefixed
 instruction. It needs a lot of code change, including generic
 code, but what we get in return is not worth it.

  arch/powerpc/kernel/uprobes.c | 8 ++++++++
  1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
index e8a63713e655..c400971ebe70 100644
--- a/arch/powerpc/kernel/uprobes.c
+++ b/arch/powerpc/kernel/uprobes.c
@@ -41,6 +41,14 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe,
  if (addr & 0x03)
  return -EINVAL;
+    if (!IS_ENABLED(CONFIG_PPC64) || !cpu_has_feature(CPU_FTR_ARCH_31))


cpu_has_feature(CPU_FTR_ARCH_31) should return 'false' when CONFIG_PPC64 is not 
enabled, no need to double check.


Ok.

I'm going to drop CONFIG_PPC64 check because it's not really
required as I replied to Naveen. So, I'll keep CPU_FTR_ARCH_31
check as is.




+    return 0;
+
+    if (ppc_inst_prefixed(auprobe->insn) && (addr & 0x3F) == 0x3C) {


Maybe 3C instead of 3F ? : (addr & 0x3C) == 0x3C


Didn't follow. It's not (addr & 0x3C), it's (addr & 0x3F).



What about

(addr & (SZ_64 - 4)) == SZ_64 - 4 to make it more explicit ?


Yes, this is a bit better. Though, it should be:

(addr & (SZ_64 - 1)) == SZ_64 - 4

Ravi


Re: [PATCH v3] powerpc/uprobes: Validation for prefixed instruction

2021-03-03 Thread Ravi Bangoria




@@ -41,6 +41,14 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe,
if (addr & 0x03)
return -EINVAL;
  
+	if (!IS_ENABLED(CONFIG_PPC64) || !cpu_has_feature(CPU_FTR_ARCH_31))

+   return 0;


Sorry, I missed this last time, but I think we can drop the above
checks. ppc_inst_prefixed() already factors in the dependency on
CONFIG_PPC64,


Yeah, makes sense. I initially added the CONFIG_PPC64 check because
I was using the ppc_inst_prefix(x, y) macro, which is not available
for !CONFIG_PPC64.


and I don't think we need to confirm if we're running on a
ISA V3.1 for the below check.


Prefixed instructions are supported only when ARCH_31 is set
(ignoring the insane scenario of a user probing a prefixed
instruction on P10 predecessors). So I guess I still need the
ARCH_31 check?

Ravi


[PATCH v3] powerpc/uprobes: Validation for prefixed instruction

2021-03-03 Thread Ravi Bangoria
As per ISA 3.1, a prefixed instruction should not cross a 64-byte
boundary, so don't allow a Uprobe on such a prefixed instruction.

There are two ways a probed instruction is changed in mapped pages.
First, when a Uprobe is activated, it searches for all the relevant
pages and replaces the instruction in them. In this case, if the probe
is on a prefixed instruction that crosses a 64-byte boundary, error
out directly. Second, when a Uprobe is already active and a user maps
a relevant page via mmap(), the instruction is replaced via the mmap()
code path. But because the Uprobe is invalid, the entire mmap()
operation cannot be stopped. In this case, just print an error and
continue.

Signed-off-by: Ravi Bangoria 
---
v2: https://lore.kernel.org/r/20210204104703.273429-1-ravi.bango...@linux.ibm.com
v2->v3:
  - Drop the restriction for a Uprobe on the suffix of a prefixed
instruction. It needs a lot of code change, including generic
code, but what we get in return is not worth it.

 arch/powerpc/kernel/uprobes.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
index e8a63713e655..c400971ebe70 100644
--- a/arch/powerpc/kernel/uprobes.c
+++ b/arch/powerpc/kernel/uprobes.c
@@ -41,6 +41,14 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe,
if (addr & 0x03)
return -EINVAL;
 
+   if (!IS_ENABLED(CONFIG_PPC64) || !cpu_has_feature(CPU_FTR_ARCH_31))
+   return 0;
+
+   if (ppc_inst_prefixed(auprobe->insn) && (addr & 0x3F) == 0x3C) {
+   pr_info_ratelimited("Cannot register a uprobe on 64 byte unaligned prefixed instruction\n");
+   return -EINVAL;
+   }
+
return 0;
 }
 
-- 
2.27.0



Re: [PATCH] powerpc/sstep: Fix VSX instruction emulation

2021-02-26 Thread Ravi Bangoria




On 2/25/21 8:49 AM, Jordan Niethe wrote:

Commit af99da74333b ("powerpc/sstep: Support VSX vector paired storage
access instructions") added loading and storing 32 word long data into
adjacent VSRs. However the calculation used to determine if two VSRs
needed to be loaded/stored inadvertently prevented the load/storing
taking place for instructions with a data length less than 16 words.

This causes the emulation to not function correctly, which can be seen
by the alignment_handler selftest:

$ ./alignment_handler
[snip]
test: test_alignment_handler_vsx_207
tags: git_version:powerpc-5.12-1-0-g82d2c16b350f
VSX: 2.07B
 Doing lxsspx:   PASSED
 Doing lxsiwax:  FAILED: Wrong Data
 Doing lxsiwzx:  PASSED
 Doing stxsspx:  PASSED
 Doing stxsiwx:  PASSED
failure: test_alignment_handler_vsx_207
test: test_alignment_handler_vsx_300
tags: git_version:powerpc-5.12-1-0-g82d2c16b350f
VSX: 3.00B
 Doing lxsd: PASSED
 Doing lxsibzx:  PASSED
 Doing lxsihzx:  PASSED
 Doing lxssp:FAILED: Wrong Data
 Doing lxv:  PASSED
 Doing lxvb16x:  PASSED
 Doing lxvh8x:   PASSED
 Doing lxvx: PASSED
 Doing lxvwsx:   FAILED: Wrong Data
 Doing lxvl: PASSED
 Doing lxvll:PASSED
 Doing stxsd:PASSED
 Doing stxsibx:  PASSED
 Doing stxsihx:  PASSED
 Doing stxssp:   PASSED
 Doing stxv: PASSED
 Doing stxvb16x: PASSED
 Doing stxvh8x:  PASSED
 Doing stxvx:PASSED
 Doing stxvl:PASSED
 Doing stxvll:   PASSED
failure: test_alignment_handler_vsx_300
[snip]

Fix this by making sure all VSX instruction emulation correctly
loads/stores from the VSRs.

Fixes: af99da74333b ("powerpc/sstep: Support VSX vector paired storage access instructions")
Signed-off-by: Jordan Niethe 


Yikes!

Reviewed-by: Ravi Bangoria 


Re: [PATCH v2] powerpc/uprobes: Validation for prefixed instruction

2021-02-08 Thread Ravi Bangoria




On 2/4/21 6:38 PM, Naveen N. Rao wrote:

On 2021/02/04 04:17PM, Ravi Bangoria wrote:

Don't allow a Uprobe on the 2nd word of a prefixed instruction. As per
ISA 3.1, a prefixed instruction should not cross a 64-byte boundary,
so don't allow a Uprobe on such a prefixed instruction either.

There are two ways a probed instruction is changed in mapped pages.
First, when a Uprobe is activated, it searches for all the relevant
pages and replaces the instruction in them. In this case, if we notice
that the probe is on the 2nd word of a prefixed instruction, error out
directly. Second, when a Uprobe is already active and a user maps a
relevant page via mmap(), the instruction is replaced via the mmap()
code path. But because the Uprobe is invalid, the entire mmap()
operation cannot be stopped. In this case, just print an error and
continue.

Signed-off-by: Ravi Bangoria 
---
v1: http://lore.kernel.org/r/20210119091234.76317-1-ravi.bango...@linux.ibm.com
v1->v2:
   - Instead of introducing new arch hook from verify_opcode(), use
 existing hook arch_uprobe_analyze_insn().
   - Add explicit check for prefixed instruction crossing 64-byte
 boundary. If probe is on such instruction, throw an error.

  arch/powerpc/kernel/uprobes.c | 66 ++-
  1 file changed, 65 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
index e8a63713e655..485d19a2a31f 100644
--- a/arch/powerpc/kernel/uprobes.c
+++ b/arch/powerpc/kernel/uprobes.c
@@ -7,6 +7,7 @@
   * Adapted from the x86 port by Ananth N Mavinakayanahalli 
   */
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -28,6 +29,69 @@ bool is_trap_insn(uprobe_opcode_t *insn)
return (is_trap(*insn));
  }
  
+#ifdef CONFIG_PPC64

+static int get_instr(struct mm_struct *mm, unsigned long addr, u32 *instr)
+{
+   struct page *page;
+   struct vm_area_struct *vma;
+   void *kaddr;
+   unsigned int gup_flags = FOLL_FORCE | FOLL_SPLIT_PMD;
+
+   if (get_user_pages_remote(mm, addr, 1, gup_flags, &page, &vma, NULL) <= 0)
+   return -EINVAL;
+
+   kaddr = kmap_atomic(page);
+   *instr = *((u32 *)(kaddr + (addr & ~PAGE_MASK)));
+   kunmap_atomic(kaddr);
+   put_page(page);
+   return 0;
+}
+
+static int validate_prefixed_instr(struct mm_struct *mm, unsigned long addr)
+{
+   struct ppc_inst inst;
+   u32 prefix, suffix;
+
+   /*
+* No need to check if addr is pointing to beginning of the
+* page. Even if probe is on a suffix of page-unaligned
+* prefixed instruction, hw will raise exception and kernel
+* will send SIGBUS.
+*/
+   if (!(addr & ~PAGE_MASK))
+   return 0;
+
+   if (get_instr(mm, addr, &prefix) < 0)
+   return -EINVAL;
+   if (get_instr(mm, addr + 4, &suffix) < 0)
+   return -EINVAL;
+
+   inst = ppc_inst_prefix(prefix, suffix);
+   if (ppc_inst_prefixed(inst) && (addr & 0x3F) == 0x3C) {
+   printk_ratelimited("Cannot register a uprobe on 64 byte "

^^ pr_info_ratelimited()

It should be sufficient to check the primary opcode to determine if it
is a prefixed instruction. You don't have to read the suffix. I see that
we don't have a helper to do this currently, so you could do:

if (ppc_inst_primary_opcode(ppc_inst(prefix)) == 1)


Ok.



In the future, it might be worthwhile to add IS_PREFIX() as a macro
similar to IS_MTMSRD() if there are more such uses.

Along with this, if you also add the below to the start of this
function, you can get rid of the #ifdef:

if (!IS_ENABLED(CONFIG_PPC64))
return 0;


Yeah this is better.

Thanks for the review, Naveen!
Ravi
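
A minimal sketch of the suggestion above (IS_PREFIX() is a
hypothetical name, modeled on IS_MTMSRD(); ppc_inst_primary_opcode()
returns the top 6 bits of the instruction word, and primary opcode 1
denotes a prefix on ISA 3.1):

	/* Hypothetical helper, not in the tree at this point. */
	#define IS_PREFIX(instr)	(ppc_inst_primary_opcode(instr) == 1)

	if (get_instr(mm, addr, &prefix) < 0)
		return -EINVAL;
	if (IS_PREFIX(ppc_inst(prefix)) && (addr & 0x3f) == 0x3c)
		return -EINVAL;

With this, the suffix word never needs to be read at all.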


Re: [PATCH v2] powerpc/uprobes: Validation for prefixed instruction

2021-02-08 Thread Ravi Bangoria




On 2/4/21 9:42 PM, Naveen N. Rao wrote:

On 2021/02/04 06:38PM, Naveen N. Rao wrote:

On 2021/02/04 04:17PM, Ravi Bangoria wrote:

Don't allow a Uprobe on the 2nd word of a prefixed instruction. As per
ISA 3.1, a prefixed instruction should not cross a 64-byte boundary,
so don't allow a Uprobe on such a prefixed instruction either.

There are two ways a probed instruction is changed in mapped pages.
First, when a Uprobe is activated, it searches for all the relevant
pages and replaces the instruction in them. In this case, if we notice
that the probe is on the 2nd word of a prefixed instruction, error out
directly. Second, when a Uprobe is already active and a user maps a
relevant page via mmap(), the instruction is replaced via the mmap()
code path. But because the Uprobe is invalid, the entire mmap()
operation cannot be stopped. In this case, just print an error and
continue.

Signed-off-by: Ravi Bangoria 
---
v1: http://lore.kernel.org/r/20210119091234.76317-1-ravi.bango...@linux.ibm.com
v1->v2:
   - Instead of introducing new arch hook from verify_opcode(), use
 existing hook arch_uprobe_analyze_insn().
   - Add explicit check for prefixed instruction crossing 64-byte
 boundary. If probe is on such instruction, throw an error.

  arch/powerpc/kernel/uprobes.c | 66 ++-
  1 file changed, 65 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
index e8a63713e655..485d19a2a31f 100644
--- a/arch/powerpc/kernel/uprobes.c
+++ b/arch/powerpc/kernel/uprobes.c
@@ -7,6 +7,7 @@
   * Adapted from the x86 port by Ananth N Mavinakayanahalli 
   */
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -28,6 +29,69 @@ bool is_trap_insn(uprobe_opcode_t *insn)
return (is_trap(*insn));
  }
  
+#ifdef CONFIG_PPC64

+static int get_instr(struct mm_struct *mm, unsigned long addr, u32 *instr)
+{
+   struct page *page;
+   struct vm_area_struct *vma;
+   void *kaddr;
+   unsigned int gup_flags = FOLL_FORCE | FOLL_SPLIT_PMD;
+
+   if (get_user_pages_remote(mm, addr, 1, gup_flags, &page, &vma, NULL) <= 0)
+   return -EINVAL;
+
+   kaddr = kmap_atomic(page);
+   *instr = *((u32 *)(kaddr + (addr & ~PAGE_MASK)));
+   kunmap_atomic(kaddr);
+   put_page(page);
+   return 0;
+}
+
+static int validate_prefixed_instr(struct mm_struct *mm, unsigned long addr)
+{
+   struct ppc_inst inst;
+   u32 prefix, suffix;
+
+   /*
+* No need to check if addr is pointing to beginning of the
+* page. Even if probe is on a suffix of page-unaligned
+* prefixed instruction, hw will raise exception and kernel
+* will send SIGBUS.
+*/
+   if (!(addr & ~PAGE_MASK))
+   return 0;
+
+   if (get_instr(mm, addr, &prefix) < 0)
+   return -EINVAL;
+   if (get_instr(mm, addr + 4, &suffix) < 0)
+   return -EINVAL;
+
+   inst = ppc_inst_prefix(prefix, suffix);
+   if (ppc_inst_prefixed(inst) && (addr & 0x3F) == 0x3C) {
+   printk_ratelimited("Cannot register a uprobe on 64 byte "

^^ pr_info_ratelimited()

It should be sufficient to check the primary opcode to determine if it
is a prefixed instruction. You don't have to read the suffix. I see that
we don't have a helper to do this currently, so you could do:

if (ppc_inst_primary_opcode(ppc_inst(prefix)) == 1)


Seeing the kprobes code, I realized that we have to check for another
scenario (Thanks, Jordan!). If this is the suffix of a prefix
instruction for which a uprobe has already been installed, then the
previous word will be a 'trap' instruction. You need to check if there
is a uprobe at the previous word, and if the original instruction there
was a prefix instruction.


Yes, this patch will fail to detect such a scenario. I think I should
read the instruction directly from the file, like copy_insn() does.
With that, I'll get the original instruction rather than the 'trap'.

I'll think more along this line.

Ravi
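
A sketch of the scenario described above, assuming a prefixed
instruction at addr - 4 that already carries a uprobe
(UPROBE_SWBP_INSN is the powerpc 'trap' instruction):

	addr - 4:  trap      <- breakpoint that replaced the prefix
	addr    :  suffix    <- new probe requested here

Reading the in-memory page at addr - 4 returns the trap, not the
original prefix, so a prefix check on live memory can never fire; only
the file-backed copy that copy_insn() reads still contains the prefix.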


Re: [PATCH v2] powerpc/uprobes: Validation for prefixed instruction

2021-02-08 Thread Ravi Bangoria




On 2/4/21 6:45 PM, Naveen N. Rao wrote:

On 2021/02/04 04:19PM, Ravi Bangoria wrote:



On 2/4/21 4:17 PM, Ravi Bangoria wrote:

Don't allow a Uprobe on the 2nd word of a prefixed instruction. As per
ISA 3.1, a prefixed instruction should not cross a 64-byte boundary,
so don't allow a Uprobe on such a prefixed instruction either.

There are two ways a probed instruction is changed in mapped pages.
First, when a Uprobe is activated, it searches for all the relevant
pages and replaces the instruction in them. In this case, if we notice
that the probe is on the 2nd word of a prefixed instruction, error out
directly. Second, when a Uprobe is already active and a user maps a
relevant page via mmap(), the instruction is replaced via the mmap()
code path. But because the Uprobe is invalid, the entire mmap()
operation cannot be stopped. In this case, just print an error and
continue.


@mpe,

arch_uprobe_analyze_insn() can return early if
cpu_has_feature(CPU_FTR_ARCH_31) is not set. But that will miss out
on the rare scenario of a user running a binary with prefixed
instructions on P10 predecessors. Please let me know if I should add
the cpu_has_feature(CPU_FTR_ARCH_31) check or not.


The check you are adding is very specific to prefixed instructions, so
it makes sense to add a cpu feature check for v3.1.

On older processors, those are invalid instructions like any other. The
instruction emulation infrastructure will refuse to emulate it and the
instruction will be single stepped.


Sure will add it.

Ravi


Re: [PATCH v2] powerpc/uprobes: Validation for prefixed instruction

2021-02-08 Thread Ravi Bangoria




On 2/6/21 11:36 PM, Oleg Nesterov wrote:

On 02/04, Ravi Bangoria wrote:


+static int get_instr(struct mm_struct *mm, unsigned long addr, u32 *instr)
+{
+   struct page *page;
+   struct vm_area_struct *vma;
+   void *kaddr;
+   unsigned int gup_flags = FOLL_FORCE | FOLL_SPLIT_PMD;
+
+   if (get_user_pages_remote(mm, addr, 1, gup_flags, &page, &vma, NULL) <= 0)
+   return -EINVAL;


"vma" is not used,


Ok.


and I don't think you need FOLL_SPLIT_PMD.

Isn't it needed if the target page is a hugepage?


Otherwise I can't really comment this ppc-specific change.

To be honest, I don't even understand why we need this fix. Sure, a
breakpoint in the middle of a 64-bit insn won't work, but why do we
care? The user should know what he is doing.

That's a valid point. This patch is to protect the user from doing
an invalid thing.

Though, there is one minor scenario where this patch will help. If
the original prefixed instruction crosses a 64-byte boundary and,
say, a user probes it, the Uprobe infrastructure will emulate such
an instruction transparently without notifying the user that the
instruction is improperly aligned.


Not to mention we can't really trust get_user_pages() in that this page
can be modified by mm owner or debugger...


As Naveen pointed out, there might be an existing uprobe on the
prefix, and this patch will fail to detect such a scenario. So I'm
thinking of reading the instruction directly from the file-backed
page (like copy_insn() does), in which case I won't use
get_user_pages().

Thanks Oleg for the review!

Ravi


Re: [PATCH v2] powerpc/uprobes: Validation for prefixed instruction

2021-02-04 Thread Ravi Bangoria




On 2/4/21 4:17 PM, Ravi Bangoria wrote:

Don't allow a Uprobe on the 2nd word of a prefixed instruction. As per
ISA 3.1, a prefixed instruction should not cross a 64-byte boundary,
so don't allow a Uprobe on such a prefixed instruction either.

There are two ways a probed instruction is changed in mapped pages.
First, when a Uprobe is activated, it searches for all the relevant
pages and replaces the instruction in them. In this case, if we notice
that the probe is on the 2nd word of a prefixed instruction, error out
directly. Second, when a Uprobe is already active and a user maps a
relevant page via mmap(), the instruction is replaced via the mmap()
code path. But because the Uprobe is invalid, the entire mmap()
operation cannot be stopped. In this case, just print an error and
continue.


@mpe,

arch_uprobe_analyze_insn() can return early if
cpu_has_feature(CPU_FTR_ARCH_31) is not set. But that will miss out
on the rare scenario of a user running a binary with prefixed
instructions on P10 predecessors. Please let me know if I should add
the cpu_has_feature(CPU_FTR_ARCH_31) check or not.

- Ravi


[PATCH v2] powerpc/uprobes: Validation for prefixed instruction

2021-02-04 Thread Ravi Bangoria
Don't allow a Uprobe on the 2nd word of a prefixed instruction. As per
ISA 3.1, a prefixed instruction should not cross a 64-byte boundary,
so don't allow a Uprobe on such a prefixed instruction either.

There are two ways a probed instruction is changed in mapped pages.
First, when a Uprobe is activated, it searches for all the relevant
pages and replaces the instruction in them. In this case, if we notice
that the probe is on the 2nd word of a prefixed instruction, error out
directly. Second, when a Uprobe is already active and a user maps a
relevant page via mmap(), the instruction is replaced via the mmap()
code path. But because the Uprobe is invalid, the entire mmap()
operation cannot be stopped. In this case, just print an error and
continue.

Signed-off-by: Ravi Bangoria 
---
v1: http://lore.kernel.org/r/20210119091234.76317-1-ravi.bango...@linux.ibm.com
v1->v2:
  - Instead of introducing new arch hook from verify_opcode(), use
existing hook arch_uprobe_analyze_insn().
  - Add explicit check for prefixed instruction crossing 64-byte
boundary. If probe is on such instruction, throw an error.

 arch/powerpc/kernel/uprobes.c | 66 ++-
 1 file changed, 65 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
index e8a63713e655..485d19a2a31f 100644
--- a/arch/powerpc/kernel/uprobes.c
+++ b/arch/powerpc/kernel/uprobes.c
@@ -7,6 +7,7 @@
  * Adapted from the x86 port by Ananth N Mavinakayanahalli 
  */
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -28,6 +29,69 @@ bool is_trap_insn(uprobe_opcode_t *insn)
return (is_trap(*insn));
 }
 
+#ifdef CONFIG_PPC64
+static int get_instr(struct mm_struct *mm, unsigned long addr, u32 *instr)
+{
+   struct page *page;
+   struct vm_area_struct *vma;
+   void *kaddr;
+   unsigned int gup_flags = FOLL_FORCE | FOLL_SPLIT_PMD;
+
+   if (get_user_pages_remote(mm, addr, 1, gup_flags, &page, &vma, NULL) <= 0)
+   return -EINVAL;
+
+   kaddr = kmap_atomic(page);
+   *instr = *((u32 *)(kaddr + (addr & ~PAGE_MASK)));
+   kunmap_atomic(kaddr);
+   put_page(page);
+   return 0;
+}
+
+static int validate_prefixed_instr(struct mm_struct *mm, unsigned long addr)
+{
+   struct ppc_inst inst;
+   u32 prefix, suffix;
+
+   /*
+* No need to check if addr is pointing to beginning of the
+* page. Even if probe is on a suffix of page-unaligned
+* prefixed instruction, hw will raise exception and kernel
+* will send SIGBUS.
+*/
+   if (!(addr & ~PAGE_MASK))
+   return 0;
+
+   if (get_instr(mm, addr, &prefix) < 0)
+   return -EINVAL;
+   if (get_instr(mm, addr + 4, &suffix) < 0)
+   return -EINVAL;
+
+   inst = ppc_inst_prefix(prefix, suffix);
+   if (ppc_inst_prefixed(inst) && (addr & 0x3F) == 0x3C) {
+   printk_ratelimited("Cannot register a uprobe on 64 byte "
+  "unaligned prefixed instruction\n");
+   return -EINVAL;
+   }
+
+   suffix = prefix;
+   if (get_instr(mm, addr - 4, &prefix) < 0)
+   return -EINVAL;
+
+   inst = ppc_inst_prefix(prefix, suffix);
+   if (ppc_inst_prefixed(inst)) {
+   printk_ratelimited("Cannot register a uprobe on the second "
+  "word of prefixed instruction\n");
+   return -EINVAL;
+   }
+   return 0;
+}
+#else
+static int validate_prefixed_instr(struct mm_struct *mm, unsigned long addr)
+{
+   return 0;
+}
+#endif
+
 /**
  * arch_uprobe_analyze_insn
  * @mm: the probed address space.
@@ -41,7 +105,7 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe,
if (addr & 0x03)
return -EINVAL;
 
-   return 0;
+   return validate_prefixed_instr(mm, addr);
 }
 
 /*
-- 
2.26.2



Re: [PATCH] powerpc/sstep: Fix array out of bound warning

2021-01-28 Thread Ravi Bangoria




On 1/28/21 10:50 PM, Naveen N. Rao wrote:

On 2021/01/15 11:46AM, Ravi Bangoria wrote:

Compiling kernel with -Warray-bounds throws below warning:

   In function 'emulate_vsx_store':
   warning: array subscript is above array bounds [-Warray-bounds]
   buf.d[2] = byterev_8(reg->d[1]);
   ~^~~
   buf.d[3] = byterev_8(reg->d[0]);
   ~^~~

Fix it by converting the local variable 'union vsx_reg buf' into an
array. Also treat the function argument 'union vsx_reg *reg' as an
array instead of a pointer, because callers are actually passing an
array to it.


I think you should change the function prototype to reflect this.

However, while I agree with this change in principle, it looks to be a
lot of code churn for a fairly narrow use. Perhaps we should just
address the specific bug. Something like the below (not tested)?


Yes, this would serve the same purpose and it's more compact as well. Sent v2.



@@ -818,13 +818,15 @@ void emulate_vsx_store(struct instruction_op *op, const union vsx_reg *reg,
 break;
 if (rev) {
 /* reverse 32 bytes */
-   buf.d[0] = byterev_8(reg->d[3]);
-   buf.d[1] = byterev_8(reg->d[2]);
-   buf.d[2] = byterev_8(reg->d[1]);
-   buf.d[3] = byterev_8(reg->d[0]);
-   reg = &buf;
+   union vsx_reg buf32[2];
+   buf32[0].d[0] = byterev_8(reg[1].d[1]);
+   buf32[0].d[1] = byterev_8(reg[1].d[0]);
+   buf32[1].d[0] = byterev_8(reg[0].d[1]);
+   buf32[1].d[1] = byterev_8(reg[0].d[0]);
+   memcpy(mem, buf32, size);
+   } else {
+   memcpy(mem, reg, size);
 }
-   memcpy(mem, reg, size);
 break;
 case 16:
 /* stxv, stxvx, stxvl, stxvll */


- Naveen



[PATCH v2] powerpc/sstep: Fix array out of bound warning

2021-01-28 Thread Ravi Bangoria
Compiling kernel with -Warray-bounds throws below warning:

  In function 'emulate_vsx_store':
  warning: array subscript is above array bounds [-Warray-bounds]
  buf.d[2] = byterev_8(reg->d[1]);
  ~^~~
  buf.d[3] = byterev_8(reg->d[0]);
  ~^~~

Fix it by using a temporary array variable, 'union vsx_reg buf32[]',
in that code block. Also, with element_size = 32, 'union vsx_reg *reg'
is an array of size 2, so use 'reg' as an array instead of a pointer
in the same code block.

Fixes: af99da74333b ("powerpc/sstep: Support VSX vector paired storage access instructions")
Suggested-by: Naveen N. Rao 
Signed-off-by: Ravi Bangoria 
---
v1: http://lore.kernel.org/r/20210115061620.692500-1-ravi.bango...@linux.ibm.com
v1->v2:
  - Change code only in the affected block

 arch/powerpc/lib/sstep.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index bf7a7d62ae8b..ede093e96234 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -818,13 +818,15 @@ void emulate_vsx_store(struct instruction_op *op, const union vsx_reg *reg,
break;
if (rev) {
/* reverse 32 bytes */
-   buf.d[0] = byterev_8(reg->d[3]);
-   buf.d[1] = byterev_8(reg->d[2]);
-   buf.d[2] = byterev_8(reg->d[1]);
-   buf.d[3] = byterev_8(reg->d[0]);
-   reg = &buf;
+   union vsx_reg buf32[2];
+   buf32[0].d[0] = byterev_8(reg[1].d[1]);
+   buf32[0].d[1] = byterev_8(reg[1].d[0]);
+   buf32[1].d[0] = byterev_8(reg[0].d[1]);
+   buf32[1].d[1] = byterev_8(reg[0].d[0]);
+   memcpy(mem, buf32, size);
+   } else {
+   memcpy(mem, reg, size);
}
-   memcpy(mem, reg, size);
break;
case 16:
/* stxv, stxvx, stxvl, stxvll */
-- 
2.26.2
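
For reference, union vsx_reg (arch/powerpc/include/asm/sstep.h) is a
single 16-byte VSR image, roughly:

	union vsx_reg {
		u8 b[16];
		u16 h[8];
		u32 w[4];
		unsigned long d[2];
		float fp[4];
		double dp[2];
		__vector128 v;
	};

so a 32-byte vector-paired access spans two elements, reg[0] and
reg[1]. That is why the element_size == 32 path must index reg as an
array, and why buf.d[2]/buf.d[3] overran the single-element buf in the
first place.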



Re: [PATCH] powerpc/uprobes: Don't allow probe on suffix of prefixed instruction

2021-01-20 Thread Ravi Bangoria




On 1/19/21 10:56 PM, Oleg Nesterov wrote:

On 01/19, Ravi Bangoria wrote:


A probe on the 2nd word of a prefixed instruction is an invalid
scenario and should be restricted.


I don't understand this ppc-specific problem, but...


So far (up to Power9), instruction size was fixed at 4 bytes. But
Power10 introduced prefixed instructions, which consist of 8 bytes,
where the first 4 bytes are the prefix and the remaining 4 the suffix.

This patch checks whether the Uprobe is on the 2nd word (suffix) of a
prefixed instruction. If so, it is considered an invalid Uprobe.




+#ifdef CONFIG_PPC64
+int arch_uprobe_verify_opcode(struct page *page, unsigned long vaddr,
+ uprobe_opcode_t opcode)
+{
+   uprobe_opcode_t prefix;
+   void *kaddr;
+   struct ppc_inst inst;
+
+   /* Don't check if vaddr is pointing to the beginning of page */
+   if (!(vaddr & ~PAGE_MASK))
+   return 0;


So the fix is incomplete? Or insn at the start of page can't be prefixed?


A prefixed instruction cannot cross a 64-byte boundary; if it does,
the kernel generates a SIGBUS. Considering that all powerpc-supported
page sizes are multiples of 64 bytes, there will never be a scenario
where the prefix and suffix land on different pages, i.e. the
beginning of a page can never be a suffix.
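
Concretely: a suffix at the start of a page would put its prefix at
offset PAGE_SIZE - 4 of the previous page. Since every supported
PAGE_SIZE is a multiple of 64, that offset is 0x3c modulo 64, which is
exactly the boundary-crossing layout the hardware already faults on.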




+int __weak arch_uprobe_verify_opcode(struct page *page, unsigned long vaddr,
+uprobe_opcode_t opcode)
+{
+   return 0;
+}
+
  static int verify_opcode(struct page *page, unsigned long vaddr, uprobe_opcode_t *new_opcode)
  {
uprobe_opcode_t old_opcode;
@@ -275,6 +281,8 @@ static int verify_opcode(struct page *page, unsigned long vaddr, uprobe_opcode_t
if (is_swbp_insn(new_opcode)) {
if (is_swbp)/* register: already installed? */
return 0;
+   if (arch_uprobe_verify_opcode(page, vaddr, old_opcode))
+   return -EINVAL;


Well, this doesn't look good...

To me it would be better to change the prepare_uprobe() path to copy
the potential prefix into uprobe->arch and check ppc_inst_prefixed()
in arch_uprobe_analyze_insn(). What do you think?


Agreed. The only reason I was checking via verify_opcode() is to make
the code simpler. If I need to check via prepare_uprobe(), I'll need
to abuse uprobe->offset by setting it to uprobe->offset - 4 to read
the previous 4 bytes of the current instruction, which, IMHO, is not
that straightforward with the current implementation of
prepare_uprobe().

But while replying here, I'm thinking... I should be able to grab a page
using mm and vaddr, which are already available in arch_uprobe_analyze_insn().
With that, I should be able to do all this inside arch_uprobe_analyze_insn()
only. I'll try this and send v2 if that works.

Thanks for the review.
Ravi


[PATCH] powerpc/uprobes: Don't allow probe on suffix of prefixed instruction

2021-01-19 Thread Ravi Bangoria
A probe on the 2nd word of a prefixed instruction is an invalid
scenario and should be restricted.

There are two ways a probed instruction is changed in mapped pages.
First, when a Uprobe is activated, it searches for all the relevant
pages and replaces the instruction in them. In this case, if we notice
that the probe is on the 2nd word of a prefixed instruction, error out
directly. Second, when a Uprobe is already active and a user maps a
relevant page via mmap(), the instruction is replaced via the mmap()
code path. But because the Uprobe is invalid, the entire mmap()
operation cannot be stopped. In this case, just print an error and
continue.

Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/kernel/uprobes.c | 28 ++++++++++++++++++++++++++++
 include/linux/uprobes.h   |  1 +
 kernel/events/uprobes.c   |  8 ++++++++
 3 files changed, 37 insertions(+)

diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
index e8a63713e655..c73d5a397164 100644
--- a/arch/powerpc/kernel/uprobes.c
+++ b/arch/powerpc/kernel/uprobes.c
@@ -7,6 +7,7 @@
  * Adapted from the x86 port by Ananth N Mavinakayanahalli 
  */
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -44,6 +45,33 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe,
return 0;
 }
 
+#ifdef CONFIG_PPC64
+int arch_uprobe_verify_opcode(struct page *page, unsigned long vaddr,
+ uprobe_opcode_t opcode)
+{
+   uprobe_opcode_t prefix;
+   void *kaddr;
+   struct ppc_inst inst;
+
+   /* Don't check if vaddr is pointing to the beginning of page */
+   if (!(vaddr & ~PAGE_MASK))
+   return 0;
+
+   kaddr = kmap_atomic(page);
+   memcpy(&prefix, kaddr + ((vaddr - 4) & ~PAGE_MASK), UPROBE_SWBP_INSN_SIZE);
+   kunmap_atomic(kaddr);
+
+   inst = ppc_inst_prefix(prefix, opcode);
+
+   if (ppc_inst_prefixed(inst)) {
+   printk_ratelimited("Cannot register a uprobe on the second "
+  "word of prefixed instruction\n");
+   return -1;
+   }
+   return 0;
+}
+#endif
+
 /*
  * arch_uprobe_pre_xol - prepare to execute out of line.
  * @auprobe: the probepoint information.
diff --git a/include/linux/uprobes.h b/include/linux/uprobes.h
index f46e0ca0169c..5a3b45878e13 100644
--- a/include/linux/uprobes.h
+++ b/include/linux/uprobes.h
@@ -128,6 +128,7 @@ extern bool uprobe_deny_signal(void);
 extern bool arch_uprobe_skip_sstep(struct arch_uprobe *aup, struct pt_regs *regs);
 extern void uprobe_clear_state(struct mm_struct *mm);
 extern int  arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long addr);
+int arch_uprobe_verify_opcode(struct page *page, unsigned long vaddr, uprobe_opcode_t opcode);
 extern int  arch_uprobe_pre_xol(struct arch_uprobe *aup, struct pt_regs *regs);
 extern int  arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs);
 extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk);
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index bf9edd8d75be..be02e6c26e3f 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -255,6 +255,12 @@ static void copy_to_page(struct page *page, unsigned long vaddr, const void *src
kunmap_atomic(kaddr);
 }
 
+int __weak arch_uprobe_verify_opcode(struct page *page, unsigned long vaddr,
+uprobe_opcode_t opcode)
+{
+   return 0;
+}
+
 static int verify_opcode(struct page *page, unsigned long vaddr, uprobe_opcode_t *new_opcode)
 {
uprobe_opcode_t old_opcode;
@@ -275,6 +281,8 @@ static int verify_opcode(struct page *page, unsigned long vaddr, uprobe_opcode_t
if (is_swbp_insn(new_opcode)) {
if (is_swbp)/* register: already installed? */
return 0;
+   if (arch_uprobe_verify_opcode(page, vaddr, old_opcode))
+   return -EINVAL;
} else {
if (!is_swbp)   /* unregister: was it changed by us? */
return 0;
-- 
2.26.2



[PATCH] powerpc/sstep: Fix array out of bound warning

2021-01-14 Thread Ravi Bangoria
Compiling kernel with -Warray-bounds throws below warning:

  In function 'emulate_vsx_store':
  warning: array subscript is above array bounds [-Warray-bounds]
  buf.d[2] = byterev_8(reg->d[1]);
  ~^~~
  buf.d[3] = byterev_8(reg->d[0]);
  ~^~~

Fix it by converting the local variable 'union vsx_reg buf' into an
array. Also treat the function argument 'union vsx_reg *reg' as an
array instead of a pointer, because callers are actually passing an
array to it.

Fixes: af99da74333b ("powerpc/sstep: Support VSX vector paired storage access instructions")
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/lib/sstep.c | 61 
 1 file changed, 31 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index bf7a7d62ae8b..5b4281ade5b6 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -723,7 +723,8 @@ void emulate_vsx_load(struct instruction_op *op, union vsx_reg *reg,
const unsigned char *bp;
 
size = GETSIZE(op->type);
-   reg->d[0] = reg->d[1] = 0;
+   reg[0].d[0] = reg[0].d[1] = 0;
+   reg[1].d[0] = reg[1].d[1] = 0;
 
switch (op->element_size) {
case 32:
@@ -742,25 +743,25 @@ void emulate_vsx_load(struct instruction_op *op, union vsx_reg *reg,
/* scalar loads, lxvd2x, lxvdsx */
read_size = (size >= 8) ? 8 : size;
i = IS_LE ? 8 : 8 - read_size;
-   memcpy(&reg->b[i], mem, read_size);
+   memcpy(&reg[0].b[i], mem, read_size);
    if (rev)
-   do_byte_reverse(&reg->b[i], 8);
+   do_byte_reverse(&reg[0].b[i], 8);
if (size < 8) {
if (op->type & SIGNEXT) {
/* size == 4 is the only case here */
-   reg->d[IS_LE] = (signed int) reg->d[IS_LE];
+   reg[0].d[IS_LE] = (signed int)reg[0].d[IS_LE];
} else if (op->vsx_flags & VSX_FPCONV) {
preempt_disable();
-   conv_sp_to_dp(&reg->fp[1 + IS_LE],
- &reg->dp[IS_LE]);
+   conv_sp_to_dp(&reg[0].fp[1 + IS_LE],
+ &reg[0].dp[IS_LE]);
preempt_enable();
}
} else {
if (size == 16) {
unsigned long v = *(unsigned long *)(mem + 8);
-   reg->d[IS_BE] = !rev ? v : byterev_8(v);
+   reg[0].d[IS_BE] = !rev ? v : byterev_8(v);
} else if (op->vsx_flags & VSX_SPLAT)
-   reg->d[IS_BE] = reg->d[IS_LE];
+   reg[0].d[IS_BE] = reg[0].d[IS_LE];
}
break;
case 4:
@@ -768,13 +769,13 @@ void emulate_vsx_load(struct instruction_op *op, union vsx_reg *reg,
wp = mem;
for (j = 0; j < size / 4; ++j) {
i = IS_LE ? 3 - j : j;
-   reg->w[i] = !rev ? *wp++ : byterev_4(*wp++);
+   reg[0].w[i] = !rev ? *wp++ : byterev_4(*wp++);
}
if (op->vsx_flags & VSX_SPLAT) {
-   u32 val = reg->w[IS_LE ? 3 : 0];
+   u32 val = reg[0].w[IS_LE ? 3 : 0];
for (; j < 4; ++j) {
i = IS_LE ? 3 - j : j;
-   reg->w[i] = val;
+   reg[0].w[i] = val;
}
}
break;
@@ -783,7 +784,7 @@ void emulate_vsx_load(struct instruction_op *op, union vsx_reg *reg,
hp = mem;
for (j = 0; j < size / 2; ++j) {
i = IS_LE ? 7 - j : j;
-   reg->h[i] = !rev ? *hp++ : byterev_2(*hp++);
+   reg[0].h[i] = !rev ? *hp++ : byterev_2(*hp++);
}
break;
case 1:
@@ -791,7 +792,7 @@ void emulate_vsx_load(struct instruction_op *op, union vsx_reg *reg,
bp = mem;
for (j = 0; j < size; ++j) {
i = IS_LE ? 15 - j : j;
-   reg->b[i] = *bp++;
+   reg[0].b[i] = *bp++;
}
break;
}
@@ -804,7 +805,7 @@ void emulate_vsx_store(struct instruction_op *op, const union vsx_reg *reg,
 {
int size, write_size;
int i, j;
-   union vsx_reg buf;
+   union vsx_reg buf[2];
unsigned int *wp;
unsigned short *hp;
unsigned char *bp;
@@ -818,11 +819,11 @@ void emulate_vs

[PATCH v3 1/4] KVM: PPC: Allow nested guest creation when L0 hv_guest_state > L1

2020-12-16 Thread Ravi Bangoria
On powerpc, the L1 hypervisor takes help from L0 via the
H_ENTER_NESTED hcall to load L2 guest state into the CPU. The L1
hypervisor prepares the L2 state in struct hv_guest_state and passes
a pointer to it via the hcall. Using that pointer, L0 reads/writes
that state directly from/to L1 memory. Thus L0 must be aware of the
hv_guest_state layout of L1. Currently it uses the version field to
achieve this, i.e. if L0's hv_guest_state.version != L1's
hv_guest_state.version, L0 won't allow a nested kvm guest.

This restriction can be loosened up a bit. L0 can be taught to
understand older layouts of hv_guest_state, if we restrict new
members to be added only at the end. That is, we can allow a nested
guest even when L0's hv_guest_state.version > L1's
hv_guest_state.version. The other way around is still not possible.
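
For example, with hv_guest_state_size() below: for any version L0
knows, it returns the size of that version's layout, so only the
fields L1 actually has are read or written; for a version newer than
L0 understands, it returns -1 and the hcall fails.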

Signed-off-by: Ravi Bangoria 
Reviewed-by: Fabiano Rosas 
---
 arch/powerpc/include/asm/hvcall.h   | 17 +++--
 arch/powerpc/kvm/book3s_hv_nested.c | 55 +++--
 2 files changed, 60 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index c1fbccb04390..ca6840239f90 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -526,9 +526,12 @@ struct h_cpu_char_result {
u64 behaviour;
 };
 
-/* Register state for entering a nested guest with H_ENTER_NESTED */
+/*
+ * Register state for entering a nested guest with H_ENTER_NESTED.
+ * New member must be added at the end.
+ */
 struct hv_guest_state {
-   u64 version;/* version of this structure layout */
+   u64 version;/* version of this structure layout, must be first */
u32 lpid;
u32 vcpu_token;
/* These registers are hypervisor privileged (at least for writing) */
@@ -562,6 +565,16 @@ struct hv_guest_state {
 /* Latest version of hv_guest_state structure */
 #define HV_GUEST_STATE_VERSION 1
 
+static inline int hv_guest_state_size(unsigned int version)
+{
+   switch (version) {
+   case 1:
+   return offsetofend(struct hv_guest_state, ppr);
+   default:
+   return -1;
+   }
+}
+
 /*
  * From the document "H_GetPerformanceCounterInfo Interface" v1.07
  *
diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
index 33b58549a9aa..937dd5114300 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -215,12 +215,51 @@ static void kvmhv_nested_mmio_needed(struct kvm_vcpu *vcpu, u64 regs_ptr)
}
 }
 
+static int kvmhv_read_guest_state_and_regs(struct kvm_vcpu *vcpu,
+  struct hv_guest_state *l2_hv,
+  struct pt_regs *l2_regs,
+  u64 hv_ptr, u64 regs_ptr)
+{
+   int size;
+
+   if (kvm_vcpu_read_guest(vcpu, hv_ptr, &l2_hv->version,
+   sizeof(l2_hv->version)))
+   return -1;
+
+   if (kvmppc_need_byteswap(vcpu))
+   l2_hv->version = swab64(l2_hv->version);
+
+   size = hv_guest_state_size(l2_hv->version);
+   if (size < 0)
+   return -1;
+
+   return kvm_vcpu_read_guest(vcpu, hv_ptr, l2_hv, size) ||
+   kvm_vcpu_read_guest(vcpu, regs_ptr, l2_regs,
+   sizeof(struct pt_regs));
+}
+
+static int kvmhv_write_guest_state_and_regs(struct kvm_vcpu *vcpu,
+   struct hv_guest_state *l2_hv,
+   struct pt_regs *l2_regs,
+   u64 hv_ptr, u64 regs_ptr)
+{
+   int size;
+
+   size = hv_guest_state_size(l2_hv->version);
+   if (size < 0)
+   return -1;
+
+   return kvm_vcpu_write_guest(vcpu, hv_ptr, l2_hv, size) ||
+   kvm_vcpu_write_guest(vcpu, regs_ptr, l2_regs,
+sizeof(struct pt_regs));
+}
+
 long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
 {
long int err, r;
struct kvm_nested_guest *l2;
struct pt_regs l2_regs, saved_l1_regs;
-   struct hv_guest_state l2_hv, saved_l1_hv;
+   struct hv_guest_state l2_hv = {0}, saved_l1_hv;
struct kvmppc_vcore *vc = vcpu->arch.vcore;
u64 hv_ptr, regs_ptr;
u64 hdec_exp;
@@ -235,17 +274,15 @@ long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
hv_ptr = kvmppc_get_gpr(vcpu, 4);
regs_ptr = kvmppc_get_gpr(vcpu, 5);
vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
-   err = kvm_vcpu_read_guest(vcpu, hv_ptr, &l2_hv,
- sizeof(struct hv_guest_state)) ||
-   kvm_vcpu_read_guest(vcpu, regs_ptr, &l2_regs,
-   sizeof(struct pt_regs));
+   err = kvmhv_read_guest_state_and_regs(vcpu, &l2_hv, &l2_regs,
+ hv_ptr, reg

[PATCH v3 4/4] KVM: PPC: Introduce new capability for 2nd DAWR

2020-12-16 Thread Ravi Bangoria
Introduce KVM_CAP_PPC_DAWR1, which Qemu can use to query whether kvm
supports the 2nd DAWR. The capability is disabled by default, even
when the underlying CPU supports the 2nd DAWR. Qemu needs to check
for it and enable it manually to use the feature.

Signed-off-by: Ravi Bangoria 
---
 Documentation/virt/kvm/api.rst | 10 ++
 arch/powerpc/include/asm/kvm_ppc.h |  1 +
 arch/powerpc/kvm/book3s_hv.c   | 12 
 arch/powerpc/kvm/powerpc.c | 10 ++
 include/uapi/linux/kvm.h   |  1 +
 tools/include/uapi/linux/kvm.h |  1 +
 6 files changed, 35 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index abb24575bdf9..049f07ebf197 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6016,6 +6016,16 @@ KVM_EXIT_X86_RDMSR and KVM_EXIT_X86_WRMSR exit notifications which user space
 can then handle to implement model specific MSR handling and/or user notifications
 to inform a user that an MSR was not handled.
 
+7.22 KVM_CAP_PPC_DAWR1
+--
+
+:Architectures: ppc
+:Parameters: none
+:Returns: 0 on success, -EINVAL when CPU doesn't support 2nd DAWR
+
+This capability can be used to check / enable 2nd DAWR feature provided
+by POWER10 processor.
+
 8. Other capabilities.
 ==
 
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 0a056c64c317..13c39d24dda5 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -314,6 +314,7 @@ struct kvmppc_ops {
  int size);
int (*enable_svm)(struct kvm *kvm);
int (*svm_off)(struct kvm *kvm);
+   int (*enable_dawr1)(struct kvm *kvm);
 };
 
 extern struct kvmppc_ops *kvmppc_hv_ops;
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index b7a30c0692a7..04c02344bd3f 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -5625,6 +5625,17 @@ static int kvmhv_svm_off(struct kvm *kvm)
return ret;
 }
 
+static int kvmhv_enable_dawr1(struct kvm *kvm)
+{
+   if (!cpu_has_feature(CPU_FTR_DAWR1))
+   return -ENODEV;
+
+   /* kvm == NULL means the caller is testing if the capability exists */
+   if (kvm)
+   kvm->arch.dawr1_enabled = true;
+   return 0;
+}
+
 static struct kvmppc_ops kvm_ops_hv = {
.get_sregs = kvm_arch_vcpu_ioctl_get_sregs_hv,
.set_sregs = kvm_arch_vcpu_ioctl_set_sregs_hv,
@@ -5668,6 +5679,7 @@ static struct kvmppc_ops kvm_ops_hv = {
.store_to_eaddr = kvmhv_store_to_eaddr,
.enable_svm = kvmhv_enable_svm,
.svm_off = kvmhv_svm_off,
+   .enable_dawr1 = kvmhv_enable_dawr1,
 };
 
 static int kvm_init_subcore_bitmap(void)
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 13999123b735..380656528b5b 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -678,6 +678,10 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
r = hv_enabled && kvmppc_hv_ops->enable_svm &&
!kvmppc_hv_ops->enable_svm(NULL);
break;
+   case KVM_CAP_PPC_DAWR1:
+   r = !!(hv_enabled && kvmppc_hv_ops->enable_dawr1 &&
+  !kvmppc_hv_ops->enable_dawr1(NULL));
+   break;
 #endif
default:
r = 0;
@@ -2187,6 +2191,12 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
break;
r = kvm->arch.kvm_ops->enable_svm(kvm);
break;
+   case KVM_CAP_PPC_DAWR1:
+   r = -EINVAL;
+   if (!is_kvmppc_hv_enabled(kvm) || !kvm->arch.kvm_ops->enable_dawr1)
+   break;
+   r = kvm->arch.kvm_ops->enable_dawr1(kvm);
+   break;
 #endif
default:
r = -EINVAL;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index ca41220b40b8..f1210f99a52d 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1053,6 +1053,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_X86_USER_SPACE_MSR 188
 #define KVM_CAP_X86_MSR_FILTER 189
 #define KVM_CAP_ENFORCE_PV_FEATURE_CPUID 190
+#define KVM_CAP_PPC_DAWR1 191
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h
index ca41220b40b8..f1210f99a52d 100644
--- a/tools/include/uapi/linux/kvm.h
+++ b/tools/include/uapi/linux/kvm.h
@@ -1053,6 +1053,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_X86_USER_SPACE_MSR 188
 #define KVM_CAP_X86_MSR_FILTER 189
 #define KVM_CAP_ENFORCE_PV_FEATURE_CPUID 190
+#define KVM_CAP_PPC_DAWR1 191
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.26.2
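
A hypothetical userspace sequence for this capability, following the
description above (vm_fd is an assumed KVM VM file descriptor; error
handling omitted):

	struct kvm_enable_cap cap = { .cap = KVM_CAP_PPC_DAWR1 };

	/* returns 1 if this kvm can do 2nd DAWR, 0 otherwise */
	if (ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_PPC_DAWR1) > 0)
		ioctl(vm_fd, KVM_ENABLE_CAP, &cap);	/* explicit opt-in */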



[PATCH v3 0/4] KVM: PPC: Power10 2nd DAWR enablement

2020-12-16 Thread Ravi Bangoria
Enable the P10 2nd DAWR feature for Book3S kvm guests. DAWR is a
hypervisor resource, and thus the H_SET_MODE hcall is used to
set/unset it. A new case, H_SET_MODE_RESOURCE_SET_DAWR1, is introduced
in the H_SET_MODE hcall for setting/unsetting the 2nd DAWR. Also, a
new capability, KVM_CAP_PPC_DAWR1, has been added to query 2nd DAWR
support via kvm ioctl.

This feature also needs to be enabled in Qemu to really use it. I'll
post Qemu patches once kvm patches get accepted.

v2: https://lore.kernel.org/kvm/20201124105953.39325-1-ravi.bango...@linux.ibm.com

v2->v3:
 - Patch #1. If L0 version > L1, L0 hv_guest_state will contain some
   additional fields which won't be filled while reading from L1
   memory and thus they can contain garbage. Initialize l2_hv with 0s
   to avoid such situations.
 - Patch #3. Introduce per vm flag dawr1_enabled.
 - Patch #4. Instead of auto enabling KVM_CAP_PPC_DAWR1, let user check
   and enable it manually. Also move KVM_CAP_PPC_DAWR1 check / enable
   logic inside #if defined(CONFIG_KVM_BOOK3S_HV_POSSIBLE).
 - Explain KVM_CAP_PPC_DAWR1 in Documentation/virt/kvm/api.rst 
 - Rebased on top of 5.10-rc3.

v1->v2:
 - patch #1: New patch
 - patch #2: Don't rename KVM_REG_PPC_DAWR, it's an uapi macro
 - patch #3: Increment HV_GUEST_STATE_VERSION
 - Split kvm and selftests patches into different series
 - Patches rebased to paulus/kvm-ppc-next (cf59eb13e151) + few
   other watchpoint patches which are yet to be merged in
   paulus/kvm-ppc-next.

Ravi Bangoria (4):
  KVM: PPC: Allow nested guest creation when L0 hv_guest_state > L1
  KVM: PPC: Rename current DAWR macros and variables
  KVM: PPC: Add infrastructure to support 2nd DAWR
  KVM: PPC: Introduce new capability for 2nd DAWR

 Documentation/virt/kvm/api.rst| 12 
 arch/powerpc/include/asm/hvcall.h | 25 ++-
 arch/powerpc/include/asm/kvm_host.h   |  7 +-
 arch/powerpc/include/asm/kvm_ppc.h|  1 +
 arch/powerpc/include/uapi/asm/kvm.h   |  2 +
 arch/powerpc/kernel/asm-offsets.c |  6 +-
 arch/powerpc/kvm/book3s_hv.c  | 79 +++
 arch/powerpc/kvm/book3s_hv_nested.c   | 70 
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   | 43 +---
 arch/powerpc/kvm/powerpc.c| 10 +++
 include/uapi/linux/kvm.h  |  1 +
 tools/arch/powerpc/include/uapi/asm/kvm.h |  2 +
 tools/include/uapi/linux/kvm.h|  1 +
 13 files changed, 216 insertions(+), 43 deletions(-)

-- 
2.26.2
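
For a rough picture of the guest side (a sketch, assuming the existing
pseries plpar_set_mode() wrapper and the H_SET_MODE_RESOURCE_SET_DAWR1
resource code added in patch 3):

	/* mflags = 0; value1/value2 carry the DAWR1/DAWRX1 contents */
	rc = plpar_set_mode(0, H_SET_MODE_RESOURCE_SET_DAWR1, dawr1, dawrx1);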



[PATCH v3 2/4] KVM: PPC: Rename current DAWR macros and variables

2020-12-16 Thread Ravi Bangoria
Power10 is introducing a second DAWR. Use the real register names
(with suffix 0) from the ISA for the current macros and variables
used by kvm. One exception is KVM_REG_PPC_DAWR: keep it as it is,
because it's uapi, so changing it would break userspace.

Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/include/asm/kvm_host.h |  4 ++--
 arch/powerpc/kernel/asm-offsets.c   |  4 ++--
 arch/powerpc/kvm/book3s_hv.c| 24 
 arch/powerpc/kvm/book3s_hv_nested.c |  8 
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 20 ++--
 5 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index d67a470e95a3..62cadf1a596e 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -584,8 +584,8 @@ struct kvm_vcpu_arch {
u32 ctrl;
u32 dabrx;
ulong dabr;
-   ulong dawr;
-   ulong dawrx;
+   ulong dawr0;
+   ulong dawrx0;
ulong ciabr;
ulong cfar;
ulong ppr;
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index c2722ff36e98..5a77aac516ba 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -548,8 +548,8 @@ int main(void)
OFFSET(VCPU_CTRL, kvm_vcpu, arch.ctrl);
OFFSET(VCPU_DABR, kvm_vcpu, arch.dabr);
OFFSET(VCPU_DABRX, kvm_vcpu, arch.dabrx);
-   OFFSET(VCPU_DAWR, kvm_vcpu, arch.dawr);
-   OFFSET(VCPU_DAWRX, kvm_vcpu, arch.dawrx);
+   OFFSET(VCPU_DAWR0, kvm_vcpu, arch.dawr0);
+   OFFSET(VCPU_DAWRX0, kvm_vcpu, arch.dawrx0);
OFFSET(VCPU_CIABR, kvm_vcpu, arch.ciabr);
OFFSET(VCPU_HFLAGS, kvm_vcpu, arch.hflags);
OFFSET(VCPU_DEC, kvm_vcpu, arch.dec);
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index e3b1839fc251..bcbad8daa974 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -782,8 +782,8 @@ static int kvmppc_h_set_mode(struct kvm_vcpu *vcpu, unsigned long mflags,
return H_UNSUPPORTED_FLAG_START;
if (value2 & DABRX_HYP)
return H_P4;
-   vcpu->arch.dawr  = value1;
-   vcpu->arch.dawrx = value2;
+   vcpu->arch.dawr0  = value1;
+   vcpu->arch.dawrx0 = value2;
return H_SUCCESS;
case H_SET_MODE_RESOURCE_ADDR_TRANS_MODE:
/* KVM does not support mflags=2 (AIL=2) */
@@ -1747,10 +1747,10 @@ static int kvmppc_get_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
*val = get_reg_val(id, vcpu->arch.vcore->vtb);
break;
case KVM_REG_PPC_DAWR:
-   *val = get_reg_val(id, vcpu->arch.dawr);
+   *val = get_reg_val(id, vcpu->arch.dawr0);
break;
case KVM_REG_PPC_DAWRX:
-   *val = get_reg_val(id, vcpu->arch.dawrx);
+   *val = get_reg_val(id, vcpu->arch.dawrx0);
break;
case KVM_REG_PPC_CIABR:
*val = get_reg_val(id, vcpu->arch.ciabr);
@@ -1979,10 +1979,10 @@ static int kvmppc_set_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
vcpu->arch.vcore->vtb = set_reg_val(id, *val);
break;
case KVM_REG_PPC_DAWR:
-   vcpu->arch.dawr = set_reg_val(id, *val);
+   vcpu->arch.dawr0 = set_reg_val(id, *val);
break;
case KVM_REG_PPC_DAWRX:
-   vcpu->arch.dawrx = set_reg_val(id, *val) & ~DAWRX_HYP;
+   vcpu->arch.dawrx0 = set_reg_val(id, *val) & ~DAWRX_HYP;
break;
case KVM_REG_PPC_CIABR:
vcpu->arch.ciabr = set_reg_val(id, *val);
@@ -3437,8 +3437,8 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
int trap;
unsigned long host_hfscr = mfspr(SPRN_HFSCR);
unsigned long host_ciabr = mfspr(SPRN_CIABR);
-   unsigned long host_dawr = mfspr(SPRN_DAWR0);
-   unsigned long host_dawrx = mfspr(SPRN_DAWRX0);
+   unsigned long host_dawr0 = mfspr(SPRN_DAWR0);
+   unsigned long host_dawrx0 = mfspr(SPRN_DAWRX0);
unsigned long host_psscr = mfspr(SPRN_PSSCR);
unsigned long host_pidr = mfspr(SPRN_PID);
 
@@ -3477,8 +3477,8 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
mtspr(SPRN_SPURR, vcpu->arch.spurr);
 
if (dawr_enabled()) {
-   mtspr(SPRN_DAWR0, vcpu->arch.dawr);
-   mtspr(SPRN_DAWRX0, vcpu->arch.dawrx);
+   mtspr(SPRN_DAWR0, vcpu->arch.dawr0);
+   mtspr(SPRN_DAWRX0, vcpu->arch.dawrx0);
}
mtspr(SPRN_CIABR, vcpu->arch.ciabr);
mtspr(SPRN_IC, vcpu->arch.ic);
@@ -3530,8 +3530,8 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64

[PATCH v3 3/4] KVM: PPC: Add infrastructure to support 2nd DAWR

2020-12-16 Thread Ravi Bangoria
kvm code assumes a single DAWR everywhere. Add code to support the
2nd DAWR. DAWR is a hypervisor resource, and thus the H_SET_MODE
hcall is used to set/unset it. Introduce the new case
H_SET_MODE_RESOURCE_SET_DAWR1 for the 2nd DAWR. Also, kvm will
support the 2nd DAWR only if CPU_FTR_DAWR1 is set.

Signed-off-by: Ravi Bangoria 
---
 Documentation/virt/kvm/api.rst|  2 ++
 arch/powerpc/include/asm/hvcall.h |  8 -
 arch/powerpc/include/asm/kvm_host.h   |  3 ++
 arch/powerpc/include/uapi/asm/kvm.h   |  2 ++
 arch/powerpc/kernel/asm-offsets.c |  2 ++
 arch/powerpc/kvm/book3s_hv.c  | 43 +++
 arch/powerpc/kvm/book3s_hv_nested.c   |  7 
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   | 23 
 tools/arch/powerpc/include/uapi/asm/kvm.h |  2 ++
 9 files changed, 91 insertions(+), 1 deletion(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 36d5f1f3c6dd..abb24575bdf9 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -2249,6 +2249,8 @@ registers, find a list below:
   PPC KVM_REG_PPC_PSSCR   64
   PPC KVM_REG_PPC_DEC_EXPIRY  64
   PPC KVM_REG_PPC_PTCR64
+  PPC KVM_REG_PPC_DAWR1   64
+  PPC KVM_REG_PPC_DAWRX1  64
   PPC KVM_REG_PPC_TM_GPR0 64
   ...
   PPC KVM_REG_PPC_TM_GPR3164
diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index ca6840239f90..98afa58b619a 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -560,16 +560,22 @@ struct hv_guest_state {
u64 pidr;
u64 cfar;
u64 ppr;
+   /* Version 1 ends here */
+   u64 dawr1;
+   u64 dawrx1;
+   /* Version 2 ends here */
 };
 
 /* Latest version of hv_guest_state structure */
-#define HV_GUEST_STATE_VERSION 1
+#define HV_GUEST_STATE_VERSION 2
 
 static inline int hv_guest_state_size(unsigned int version)
 {
switch (version) {
case 1:
return offsetofend(struct hv_guest_state, ppr);
+   case 2:
+   return offsetofend(struct hv_guest_state, dawrx1);
default:
return -1;
}
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 62cadf1a596e..a93cfb672421 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -307,6 +307,7 @@ struct kvm_arch {
u8 svm_enabled;
bool threads_indep;
bool nested_enable;
+   bool dawr1_enabled;
pgd_t *pgtable;
u64 process_table;
struct dentry *debugfs_dir;
@@ -586,6 +587,8 @@ struct kvm_vcpu_arch {
ulong dabr;
ulong dawr0;
ulong dawrx0;
+   ulong dawr1;
+   ulong dawrx1;
ulong ciabr;
ulong cfar;
ulong ppr;
diff --git a/arch/powerpc/include/uapi/asm/kvm.h 
b/arch/powerpc/include/uapi/asm/kvm.h
index c3af3f324c5a..9f18fa090f1f 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -644,6 +644,8 @@ struct kvm_ppc_cpu_char {
 #define KVM_REG_PPC_MMCR3  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xc1)
 #define KVM_REG_PPC_SIER2  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xc2)
 #define KVM_REG_PPC_SIER3  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xc3)
+#define KVM_REG_PPC_DAWR1  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xc4)
+#define KVM_REG_PPC_DAWRX1 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xc5)
 
 /* Transactional Memory checkpointed state:
  * This is all GPRs, all VSX regs and a subset of SPRs
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 5a77aac516ba..a35ea4e19360 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -550,6 +550,8 @@ int main(void)
OFFSET(VCPU_DABRX, kvm_vcpu, arch.dabrx);
OFFSET(VCPU_DAWR0, kvm_vcpu, arch.dawr0);
OFFSET(VCPU_DAWRX0, kvm_vcpu, arch.dawrx0);
+   OFFSET(VCPU_DAWR1, kvm_vcpu, arch.dawr1);
+   OFFSET(VCPU_DAWRX1, kvm_vcpu, arch.dawrx1);
OFFSET(VCPU_CIABR, kvm_vcpu, arch.ciabr);
OFFSET(VCPU_HFLAGS, kvm_vcpu, arch.hflags);
OFFSET(VCPU_DEC, kvm_vcpu, arch.dec);
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index bcbad8daa974..b7a30c0692a7 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -785,6 +785,22 @@ static int kvmppc_h_set_mode(struct kvm_vcpu *vcpu, 
unsigned long mflags,
vcpu->arch.dawr0  = value1;
vcpu->arch.dawrx0 = value2;
return H_SUCCESS;
+   case H_SET_MODE_RESOURCE_SET_DAWR1:
+   if (!kvmppc_power8_compatible(vcpu))
+   return H_P2;
+   if (!ppc_breakpoint_available())
+   return H_P2;
+   if (!cpu_has_feature(CPU_FTR_DAWR1))
+   

[PATCH] powerpc/xmon: Fix build failure for 8xx

2020-11-29 Thread Ravi Bangoria
With CONFIG_PPC_8xx and CONFIG_XMON set, kernel build fails with

  arch/powerpc/xmon/xmon.c:1379:12: error: 'find_free_data_bpt' defined
  but not used [-Werror=unused-function]

Fix it by enclosing find_free_data_bpt() inside #ifndef CONFIG_PPC_8xx.

Reported-by: kernel test robot 
Fixes: 30df74d67d48 ("powerpc/watchpoint/xmon: Support 2nd DAWR")
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/xmon/xmon.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 55c43a6c9111..5559edf36756 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -1383,6 +1383,7 @@ static long check_bp_loc(unsigned long addr)
return 1;
 }
 
+#ifndef CONFIG_PPC_8xx
 static int find_free_data_bpt(void)
 {
int i;
@@ -1394,6 +1395,7 @@ static int find_free_data_bpt(void)
printf("Couldn't find free breakpoint register\n");
return -1;
 }
+#endif
 
 static void print_data_bpts(void)
 {
-- 
2.17.1



Re: [PATCH 1/2] powerpc: sstep: Fix load and update instructions

2020-11-26 Thread Ravi Bangoria




On 11/19/20 11:11 AM, Sandipan Das wrote:

The Power ISA says that the fixed-point load and update
instructions must neither use R0 for the base address (RA)
nor have the destination (RT) and the base address (RA) as
the same register. In these cases, the instruction is
invalid. This applies to the following instructions.
   * Load Byte and Zero with Update (lbzu)
   * Load Byte and Zero with Update Indexed (lbzux)
   * Load Halfword and Zero with Update (lhzu)
   * Load Halfword and Zero with Update Indexed (lhzux)
   * Load Halfword Algebraic with Update (lhau)
   * Load Halfword Algebraic with Update Indexed (lhaux)
   * Load Word and Zero with Update (lwzu)
   * Load Word and Zero with Update Indexed (lwzux)
   * Load Word Algebraic with Update Indexed (lwaux)
   * Load Doubleword with Update (ldu)
   * Load Doubleword with Update Indexed (ldux)

However, the following behaviour is observed using some
invalid opcodes where RA = RT.

A userspace program using an invalid instruction word like
0xe9ce0001, i.e. "ldu r14, 0(r14)", runs and exits without
getting terminated abruptly. The instruction performs the
load operation but does not write the effective address to
the base address register. Attaching an uprobe at that
instruction's address results in emulation which writes the
effective address to the base register. Thus, the final value
of the base address register is different.

To remove any inconsistencies, this adds an additional check
for the aforementioned instructions to make sure that they
are treated as unknown by the emulation infrastructure when
RA = 0 or RA = RT. The kernel will then fallback to executing
the instruction on hardware.

Signed-off-by: Sandipan Das 


For the series:
Reviewed-by: Ravi Bangoria 


Re: [PATCH 1/2] powerpc: sstep: Fix load and update instructions

2020-11-25 Thread Ravi Bangoria




diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index 855457ed09b5..25a5436be6c6 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -2157,11 +2157,15 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 
 		case 23:	/* lwzx */
 		case 55:	/* lwzux */
+			if (u && (ra == 0 || ra == rd))
+				return -1;


I guess you also need to split case 23 and 55?

- Ravi


[PATCH v2 4/4] KVM: PPC: Introduce new capability for 2nd DAWR

2020-11-24 Thread Ravi Bangoria
Introduce KVM_CAP_PPC_DAWR1, which can be used by Qemu to query whether
kvm supports the 2nd DAWR or not.
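
A hypothetical userspace sketch of the query (KVM_CHECK_EXTENSION is the
standard kvm ioctl; the capability value comes from linux/kvm.h and needs
headers that carry this patch):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

int main(void)
{
	int kvm_fd = open("/dev/kvm", O_RDWR);
	int ret;

	if (kvm_fd < 0)
		return 1;
	ret = ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_PPC_DAWR1);
	/* ret > 0 means the host cpu has CPU_FTR_DAWR1 */
	printf("2nd DAWR %s\n", ret > 0 ? "supported" : "not supported");
	close(kvm_fd);
	return 0;
}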

Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/kvm/powerpc.c | 3 +++
 include/uapi/linux/kvm.h   | 1 +
 2 files changed, 4 insertions(+)

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 13999123b735..48763fe59fc5 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -679,6 +679,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
!kvmppc_hv_ops->enable_svm(NULL);
break;
 #endif
+   case KVM_CAP_PPC_DAWR1:
+   r = cpu_has_feature(CPU_FTR_DAWR1);
+   break;
default:
r = 0;
break;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index f6d86033c4fa..0f32d6cbabc2 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1035,6 +1035,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_LAST_CPU 184
 #define KVM_CAP_SMALLER_MAXPHYADDR 185
 #define KVM_CAP_S390_DIAG318 186
+#define KVM_CAP_PPC_DAWR1 187
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.26.2



[PATCH v2 0/4] KVM: PPC: Power10 2nd DAWR enablement

2020-11-24 Thread Ravi Bangoria
Enable the p10 2nd DAWR feature for Book3S kvm guests. DAWR is a
hypervisor resource and thus the H_SET_MODE hcall is used to set/unset
it. A new case, H_SET_MODE_RESOURCE_SET_DAWR1, is introduced in the
H_SET_MODE hcall for setting/unsetting the 2nd DAWR. Also, a new
capability, KVM_CAP_PPC_DAWR1, has been added to query 2nd DAWR support
via the kvm ioctl.

This feature also needs to be enabled in Qemu to really use it. I'll
post Qemu patches once kvm patches get accepted.

v1: 
https://lore.kernel.org/r/20200723102058.312282-1-ravi.bango...@linux.ibm.com

v1->v2:
 - patch #1: New patch
 - patch #2: Don't rename KVM_REG_PPC_DAWR, it's an uapi macro
 - patch #3: Increment HV_GUEST_STATE_VERSION
 - Split kvm and selftests patches into different series
 - Patches rebased to paulus/kvm-ppc-next (cf59eb13e151) + few
   other watchpoint patches which are yet to be merged in
   paulus/kvm-ppc-next.

Ravi Bangoria (4):
  KVM: PPC: Allow nested guest creation when L0 hv_guest_state > L1
  KVM: PPC: Rename current DAWR macros and variables
  KVM: PPC: Add infrastructure to support 2nd DAWR
  KVM: PPC: Introduce new capability for 2nd DAWR

 Documentation/virt/kvm/api.rst|  2 +
 arch/powerpc/include/asm/hvcall.h | 25 -
 arch/powerpc/include/asm/kvm_host.h   |  6 +-
 arch/powerpc/include/uapi/asm/kvm.h   |  2 +
 arch/powerpc/kernel/asm-offsets.c |  6 +-
 arch/powerpc/kvm/book3s_hv.c  | 65 ++
 arch/powerpc/kvm/book3s_hv_nested.c   | 68 ++-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   | 43 ++
 arch/powerpc/kvm/powerpc.c|  3 +
 include/uapi/linux/kvm.h  |  1 +
 tools/arch/powerpc/include/uapi/asm/kvm.h |  2 +
 11 files changed, 181 insertions(+), 42 deletions(-)

-- 
2.26.2



[PATCH v2 1/4] KVM: PPC: Allow nested guest creation when L0 hv_guest_state > L1

2020-11-24 Thread Ravi Bangoria
On powerpc, the L1 hypervisor takes help of L0 using the H_ENTER_NESTED
hcall to load L2 guest state into the cpu. The L1 hypervisor prepares
the L2 state in struct hv_guest_state and passes a pointer to it via
the hcall. Using that pointer, L0 reads/writes that state directly
from/to L1 memory. Thus L0 must be aware of L1's hv_guest_state layout.
Currently it uses the version field to achieve this, i.e. if
L0 hv_guest_state.version != L1 hv_guest_state.version, L0 won't
allow a nested kvm guest.

This restriction can be loosened up a bit. L0 can be taught to
understand older layouts of hv_guest_state, if we restrict new
members to being added only at the end. i.e. we can allow a
nested guest even when L0 hv_guest_state.version > L1
hv_guest_state.version (for example, a version-2 L0 can run a
version-1 L1 by reading only the version-1-sized prefix of the
structure). Though, the other way around is not possible.
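
A toy restatement of that rule (the sizes here are illustrative, not the
real struct sizes; the real code derives them via offsetofend()):

#include <stdio.h>

static int l0_guest_state_size(int l0_version, int l1_version)
{
	if (l1_version > l0_version)
		return -1;			/* L0 too old: refuse */
	return l1_version == 1 ? 144 : 160;	/* illustrative sizes */
}

int main(void)
{
	printf("%d\n", l0_guest_state_size(2, 1));	/* v2 L0, v1 L1: OK */
	printf("%d\n", l0_guest_state_size(1, 2));	/* v1 L0, v2 L1: -1 */
	return 0;
}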

Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/include/asm/hvcall.h   | 17 +++--
 arch/powerpc/kvm/book3s_hv_nested.c | 53 -
 2 files changed, 59 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/include/asm/hvcall.h 
b/arch/powerpc/include/asm/hvcall.h
index fbb377055471..a7073fddb657 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -524,9 +524,12 @@ struct h_cpu_char_result {
u64 behaviour;
 };
 
-/* Register state for entering a nested guest with H_ENTER_NESTED */
+/*
+ * Register state for entering a nested guest with H_ENTER_NESTED.
+ * New member must be added at the end.
+ */
 struct hv_guest_state {
-   u64 version;/* version of this structure layout */
+   u64 version;/* version of this structure layout, must be first */
u32 lpid;
u32 vcpu_token;
/* These registers are hypervisor privileged (at least for writing) */
@@ -560,6 +563,16 @@ struct hv_guest_state {
 /* Latest version of hv_guest_state structure */
 #define HV_GUEST_STATE_VERSION 1
 
+static inline int hv_guest_state_size(unsigned int version)
+{
+   switch (version) {
+   case 1:
+   return offsetofend(struct hv_guest_state, ppr);
+   default:
+   return -1;
+   }
+}
+
 #endif /* __ASSEMBLY__ */
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_HVCALL_H */
diff --git a/arch/powerpc/kvm/book3s_hv_nested.c 
b/arch/powerpc/kvm/book3s_hv_nested.c
index 33b58549a9aa..2b433c3bacea 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -215,6 +215,45 @@ static void kvmhv_nested_mmio_needed(struct kvm_vcpu 
*vcpu, u64 regs_ptr)
}
 }
 
+static int kvmhv_read_guest_state_and_regs(struct kvm_vcpu *vcpu,
+  struct hv_guest_state *l2_hv,
+  struct pt_regs *l2_regs,
+  u64 hv_ptr, u64 regs_ptr)
+{
+   int size;
+
+   if (kvm_vcpu_read_guest(vcpu, hv_ptr, &(l2_hv->version),
+   sizeof(l2_hv->version)))
+   return -1;
+
+   if (kvmppc_need_byteswap(vcpu))
+   l2_hv->version = swab64(l2_hv->version);
+
+   size = hv_guest_state_size(l2_hv->version);
+   if (size < 0)
+   return -1;
+
+   return kvm_vcpu_read_guest(vcpu, hv_ptr, l2_hv, size) ||
+   kvm_vcpu_read_guest(vcpu, regs_ptr, l2_regs,
+   sizeof(struct pt_regs));
+}
+
+static int kvmhv_write_guest_state_and_regs(struct kvm_vcpu *vcpu,
+   struct hv_guest_state *l2_hv,
+   struct pt_regs *l2_regs,
+   u64 hv_ptr, u64 regs_ptr)
+{
+   int size;
+
+   size = hv_guest_state_size(l2_hv->version);
+   if (size < 0)
+   return -1;
+
+   return kvm_vcpu_write_guest(vcpu, hv_ptr, l2_hv, size) ||
+   kvm_vcpu_write_guest(vcpu, regs_ptr, l2_regs,
+sizeof(struct pt_regs));
+}
+
 long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
 {
long int err, r;
@@ -235,17 +274,15 @@ long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu)
hv_ptr = kvmppc_get_gpr(vcpu, 4);
regs_ptr = kvmppc_get_gpr(vcpu, 5);
	vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
-   err = kvm_vcpu_read_guest(vcpu, hv_ptr, &l2_hv,
- sizeof(struct hv_guest_state)) ||
-   kvm_vcpu_read_guest(vcpu, regs_ptr, &l2_regs,
-   sizeof(struct pt_regs));
+   err = kvmhv_read_guest_state_and_regs(vcpu, &l2_hv, &l2_regs,
+ hv_ptr, regs_ptr);
	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
if (err)
return H_PARAMETER;
 
if (kvmppc_need_byteswap(vcpu))
	byteswap_hv_regs(&l2_hv);
-   if (l2_hv.version != HV_GUEST_STATE_VERSION)
+   if (l2_hv.version >

[PATCH v2 3/4] KVM: PPC: Add infrastructure to support 2nd DAWR

2020-11-24 Thread Ravi Bangoria
kvm code assumes a single DAWR everywhere. Add code to support the
2nd DAWR. DAWR is a hypervisor resource and thus the H_SET_MODE hcall
is used to set/unset it. Introduce a new case, H_SET_MODE_RESOURCE_SET_DAWR1,
for the 2nd DAWR. Also, kvm will support the 2nd DAWR only if
CPU_FTR_DAWR1 is set.

Signed-off-by: Ravi Bangoria 
---
 Documentation/virt/kvm/api.rst|  2 ++
 arch/powerpc/include/asm/hvcall.h |  8 -
 arch/powerpc/include/asm/kvm_host.h   |  2 ++
 arch/powerpc/include/uapi/asm/kvm.h   |  2 ++
 arch/powerpc/kernel/asm-offsets.c |  2 ++
 arch/powerpc/kvm/book3s_hv.c  | 41 +++
 arch/powerpc/kvm/book3s_hv_nested.c   |  7 
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   | 23 +
 tools/arch/powerpc/include/uapi/asm/kvm.h |  2 ++
 9 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index eb3a1316f03e..72c98735aa52 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -2249,6 +2249,8 @@ registers, find a list below:
   PPC KVM_REG_PPC_PSSCR   64
   PPC KVM_REG_PPC_DEC_EXPIRY  64
   PPC KVM_REG_PPC_PTCR64
+  PPC KVM_REG_PPC_DAWR1   64
+  PPC KVM_REG_PPC_DAWRX1  64
   PPC KVM_REG_PPC_TM_GPR0 64
   ...
   PPC KVM_REG_PPC_TM_GPR3164
diff --git a/arch/powerpc/include/asm/hvcall.h 
b/arch/powerpc/include/asm/hvcall.h
index a7073fddb657..4bacd27a348b 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -558,16 +558,22 @@ struct hv_guest_state {
u64 pidr;
u64 cfar;
u64 ppr;
+   /* Version 1 ends here */
+   u64 dawr1;
+   u64 dawrx1;
+   /* Version 2 ends here */
 };
 
 /* Latest version of hv_guest_state structure */
-#define HV_GUEST_STATE_VERSION 1
+#define HV_GUEST_STATE_VERSION 2
 
 static inline int hv_guest_state_size(unsigned int version)
 {
switch (version) {
case 1:
return offsetofend(struct hv_guest_state, ppr);
+   case 2:
+   return offsetofend(struct hv_guest_state, dawrx1);
default:
return -1;
}
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 62cadf1a596e..9804afdf8578 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -586,6 +586,8 @@ struct kvm_vcpu_arch {
ulong dabr;
ulong dawr0;
ulong dawrx0;
+   ulong dawr1;
+   ulong dawrx1;
ulong ciabr;
ulong cfar;
ulong ppr;
diff --git a/arch/powerpc/include/uapi/asm/kvm.h 
b/arch/powerpc/include/uapi/asm/kvm.h
index c3af3f324c5a..9f18fa090f1f 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -644,6 +644,8 @@ struct kvm_ppc_cpu_char {
 #define KVM_REG_PPC_MMCR3  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xc1)
 #define KVM_REG_PPC_SIER2  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xc2)
 #define KVM_REG_PPC_SIER3  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xc3)
+#define KVM_REG_PPC_DAWR1  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xc4)
+#define KVM_REG_PPC_DAWRX1 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xc5)
 
 /* Transactional Memory checkpointed state:
  * This is all GPRs, all VSX regs and a subset of SPRs
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index e4256f5b4602..26d4fa8fe51e 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -549,6 +549,8 @@ int main(void)
OFFSET(VCPU_DABRX, kvm_vcpu, arch.dabrx);
OFFSET(VCPU_DAWR0, kvm_vcpu, arch.dawr0);
OFFSET(VCPU_DAWRX0, kvm_vcpu, arch.dawrx0);
+   OFFSET(VCPU_DAWR1, kvm_vcpu, arch.dawr1);
+   OFFSET(VCPU_DAWRX1, kvm_vcpu, arch.dawrx1);
OFFSET(VCPU_CIABR, kvm_vcpu, arch.ciabr);
OFFSET(VCPU_HFLAGS, kvm_vcpu, arch.hflags);
OFFSET(VCPU_DEC, kvm_vcpu, arch.dec);
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index d5c6efc8a76e..2ff645789e9e 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -785,6 +785,20 @@ static int kvmppc_h_set_mode(struct kvm_vcpu *vcpu, 
unsigned long mflags,
vcpu->arch.dawr0  = value1;
vcpu->arch.dawrx0 = value2;
return H_SUCCESS;
+   case H_SET_MODE_RESOURCE_SET_DAWR1:
+   if (!kvmppc_power8_compatible(vcpu))
+   return H_P2;
+   if (!ppc_breakpoint_available())
+   return H_P2;
+   if (!cpu_has_feature(CPU_FTR_DAWR1))
+   return H_P2;
+   if (mflags)
+   return H_UNSUPPORTED_FLAG_START;
+   if (value2 & DABRX_HYP)
+   return H_P4;
+   vcpu->arch.dawr1  = value1;
+  

[PATCH v2 2/4] KVM: PPC: Rename current DAWR macros and variables

2020-11-24 Thread Ravi Bangoria
Power10 is introducing second DAWR. Use real register names (with
suffix 0) from ISA for current macros and variables used by kvm.
One exception is KVM_REG_PPC_DAWR. Keep it as it is because it's
uapi so changing it will break userspace.

Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/include/asm/kvm_host.h |  4 ++--
 arch/powerpc/kernel/asm-offsets.c   |  4 ++--
 arch/powerpc/kvm/book3s_hv.c| 24 
 arch/powerpc/kvm/book3s_hv_nested.c |  8 
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 20 ++--
 5 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index d67a470e95a3..62cadf1a596e 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -584,8 +584,8 @@ struct kvm_vcpu_arch {
u32 ctrl;
u32 dabrx;
ulong dabr;
-   ulong dawr;
-   ulong dawrx;
+   ulong dawr0;
+   ulong dawrx0;
ulong ciabr;
ulong cfar;
ulong ppr;
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 8711c2164b45..e4256f5b4602 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -547,8 +547,8 @@ int main(void)
OFFSET(VCPU_CTRL, kvm_vcpu, arch.ctrl);
OFFSET(VCPU_DABR, kvm_vcpu, arch.dabr);
OFFSET(VCPU_DABRX, kvm_vcpu, arch.dabrx);
-   OFFSET(VCPU_DAWR, kvm_vcpu, arch.dawr);
-   OFFSET(VCPU_DAWRX, kvm_vcpu, arch.dawrx);
+   OFFSET(VCPU_DAWR0, kvm_vcpu, arch.dawr0);
+   OFFSET(VCPU_DAWRX0, kvm_vcpu, arch.dawrx0);
OFFSET(VCPU_CIABR, kvm_vcpu, arch.ciabr);
OFFSET(VCPU_HFLAGS, kvm_vcpu, arch.hflags);
OFFSET(VCPU_DEC, kvm_vcpu, arch.dec);
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 490a0f6a7285..d5c6efc8a76e 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -782,8 +782,8 @@ static int kvmppc_h_set_mode(struct kvm_vcpu *vcpu, 
unsigned long mflags,
return H_UNSUPPORTED_FLAG_START;
if (value2 & DABRX_HYP)
return H_P4;
-   vcpu->arch.dawr  = value1;
-   vcpu->arch.dawrx = value2;
+   vcpu->arch.dawr0  = value1;
+   vcpu->arch.dawrx0 = value2;
return H_SUCCESS;
case H_SET_MODE_RESOURCE_ADDR_TRANS_MODE:
/* KVM does not support mflags=2 (AIL=2) */
@@ -1747,10 +1747,10 @@ static int kvmppc_get_one_reg_hv(struct kvm_vcpu *vcpu, 
u64 id,
*val = get_reg_val(id, vcpu->arch.vcore->vtb);
break;
case KVM_REG_PPC_DAWR:
-   *val = get_reg_val(id, vcpu->arch.dawr);
+   *val = get_reg_val(id, vcpu->arch.dawr0);
break;
case KVM_REG_PPC_DAWRX:
-   *val = get_reg_val(id, vcpu->arch.dawrx);
+   *val = get_reg_val(id, vcpu->arch.dawrx0);
break;
case KVM_REG_PPC_CIABR:
*val = get_reg_val(id, vcpu->arch.ciabr);
@@ -1979,10 +1979,10 @@ static int kvmppc_set_one_reg_hv(struct kvm_vcpu *vcpu, 
u64 id,
vcpu->arch.vcore->vtb = set_reg_val(id, *val);
break;
case KVM_REG_PPC_DAWR:
-   vcpu->arch.dawr = set_reg_val(id, *val);
+   vcpu->arch.dawr0 = set_reg_val(id, *val);
break;
case KVM_REG_PPC_DAWRX:
-   vcpu->arch.dawrx = set_reg_val(id, *val) & ~DAWRX_HYP;
+   vcpu->arch.dawrx0 = set_reg_val(id, *val) & ~DAWRX_HYP;
break;
case KVM_REG_PPC_CIABR:
vcpu->arch.ciabr = set_reg_val(id, *val);
@@ -3437,8 +3437,8 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu 
*vcpu, u64 time_limit,
int trap;
unsigned long host_hfscr = mfspr(SPRN_HFSCR);
unsigned long host_ciabr = mfspr(SPRN_CIABR);
-   unsigned long host_dawr = mfspr(SPRN_DAWR0);
-   unsigned long host_dawrx = mfspr(SPRN_DAWRX0);
+   unsigned long host_dawr0 = mfspr(SPRN_DAWR0);
+   unsigned long host_dawrx0 = mfspr(SPRN_DAWRX0);
unsigned long host_psscr = mfspr(SPRN_PSSCR);
unsigned long host_pidr = mfspr(SPRN_PID);
 
@@ -3477,8 +3477,8 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu 
*vcpu, u64 time_limit,
mtspr(SPRN_SPURR, vcpu->arch.spurr);
 
if (dawr_enabled()) {
-   mtspr(SPRN_DAWR0, vcpu->arch.dawr);
-   mtspr(SPRN_DAWRX0, vcpu->arch.dawrx);
+   mtspr(SPRN_DAWR0, vcpu->arch.dawr0);
+   mtspr(SPRN_DAWRX0, vcpu->arch.dawrx0);
}
mtspr(SPRN_CIABR, vcpu->arch.ciabr);
mtspr(SPRN_IC, vcpu->arch.ic);
@@ -3530,8 +3530,8 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu 
*vcpu, u64 

[PATCH v3] powerpc/watchpoint: Workaround P10 DD1 issue with VSX-32 byte instructions

2020-11-05 Thread Ravi Bangoria
POWER10 DD1 has an issue where it generates watchpoint exceptions when
it shouldn't. The conditions where this occur are:

 - octword op
 - ending address of DAWR range is less than starting address of op
 - those addresses need to be in the same or in two consecutive 512B
   blocks
 - 'op address + 64B' generates an address that has a carry into bit
   52 (crosses 2K boundary)

Handle such spurious exceptions by considering them as extraneous and
emulating/single-stepping the instruction without generating an event.
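
As a hedged illustration of the trigger window, here is a condensed
userspace restatement of the three checks (values are illustrative;
HW_BREAKPOINT_SIZE is assumed to be 8 bytes, the DAWR grain):

#include <stdio.h>

static int is_p10dd1_spurious(unsigned long hw_end_addr, unsigned long ea)
{
	if ((hw_end_addr - 1) >= ea)
		return 0;	/* DAWR range must end below the op */
	if (((hw_end_addr - 1) >> 10) != (ea >> 10))
		return 0;	/* same or two consecutive 512B blocks */
	if ((ea & 0x800) == ((ea + 64) & 0x800))
		return 0;	/* op address + 64B must carry into bit 52 */
	return 1;
}

int main(void)
{
	/* DAWR covering 0x600..0x60f, 32-byte VSX op at ea = 0x7c0 */
	printf("%d\n", is_p10dd1_spurious(0x610, 0x7c0));	/* prints 1 */
	return 0;
}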

Signed-off-by: Ravi Bangoria 
[Fixed build warning reported by kernel test robot]
Reported-by: kernel test robot 
---
v2->v3:
 - v2: 
https://lore.kernel.org/r/20201022034039.330365-1-ravi.bango...@linux.ibm.com
 - Drop the first patch which introduced CPU_FTRS_POWER10_DD1. Instead use
   the P10 DD1 PVR directly in the if condition. We can't set
   CPU_FTRS_POWER10_DD1 inside a guest as the guest can be migrated to a
   future version of the cpu.

Dependency: VSX-32 byte emulation support patches
  https://lore.kernel.org/r/20201011050908.72173-1-ravi.bango...@linux.ibm.com

 arch/powerpc/kernel/hw_breakpoint.c | 68 -
 1 file changed, 66 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/hw_breakpoint.c 
b/arch/powerpc/kernel/hw_breakpoint.c
index 1f4a1efa0074..67297aea5d94 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -644,6 +644,11 @@ static bool is_larx_stcx_instr(int type)
return type == LARX || type == STCX;
 }
 
+static bool is_octword_vsx_instr(int type, int size)
+{
+   return ((type == LOAD_VSX || type == STORE_VSX) && size == 32);
+}
+
 /*
  * We've failed in reliably handling the hw-breakpoint. Unregister
  * it and throw a warning message to let the user know about it.
@@ -694,6 +699,58 @@ static bool stepping_handler(struct pt_regs *regs, struct 
perf_event **bp,
return true;
 }
 
+static void handle_p10dd1_spurious_exception(struct arch_hw_breakpoint **info,
+int *hit, unsigned long ea)
+{
+   int i;
+   unsigned long hw_end_addr;
+
+   /*
+* Handle spurious exception only when any bp_per_reg is set.
+* Otherwise this might be created by xmon and not actually a
+* spurious exception.
+*/
+   for (i = 0; i < nr_wp_slots(); i++) {
+   if (!info[i])
+   continue;
+
+   hw_end_addr = ALIGN(info[i]->address + info[i]->len, 
HW_BREAKPOINT_SIZE);
+
+   /*
+* Ending address of DAWR range is less than starting
+* address of op.
+*/
+   if ((hw_end_addr - 1) >= ea)
+   continue;
+
+   /*
+* Those addresses need to be in the same or in two
+* consecutive 512B blocks;
+*/
+   if (((hw_end_addr - 1) >> 10) != (ea >> 10))
+   continue;
+
+   /*
+* 'op address + 64B' generates an address that has a
+* carry into bit 52 (crosses 2K boundary).
+*/
+   if ((ea & 0x800) == ((ea + 64) & 0x800))
+   continue;
+
+   break;
+   }
+
+   if (i == nr_wp_slots())
+   return;
+
+   for (i = 0; i < nr_wp_slots(); i++) {
+   if (info[i]) {
+   hit[i] = 1;
+   info[i]->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ;
+   }
+   }
+}
+
 int hw_breakpoint_handler(struct die_args *args)
 {
bool err = false;
@@ -752,8 +809,15 @@ int hw_breakpoint_handler(struct die_args *args)
goto reset;
 
if (!nr_hit) {
-   rc = NOTIFY_DONE;
-   goto out;
+   /* Workaround for Power10 DD1 */
+   if (mfspr(SPRN_PVR) == 0x800100 &&
+   !IS_ENABLED(CONFIG_PPC_8xx) &&
+   is_octword_vsx_instr(type, size)) {
+   handle_p10dd1_spurious_exception(info, hit, ea);
+   } else {
+   rc = NOTIFY_DONE;
+   goto out;
+   }
}
 
/*
-- 
2.25.1



Re: [PATCH 1/2] powerpc: Introduce POWER10_DD1 feature

2020-10-26 Thread Ravi Bangoria

Hi Michael,


+static void __init fixup_cpu_features(void)
+{
+   unsigned long version = mfspr(SPRN_PVR);
+
+   if ((version & 0xffffffff) == 0x00800100)
+   cur_cpu_spec->cpu_features |= CPU_FTR_POWER10_DD1;
+}
+
  static int __init early_init_dt_scan_cpus(unsigned long node,
  const char *uname, int depth,
  void *data)
@@ -378,6 +386,7 @@ static int __init early_init_dt_scan_cpus(unsigned long 
node,
  
  		check_cpu_feature_properties(node);

check_cpu_pa_features(node);
+   fixup_cpu_features();
}


This is not the way we normally do CPU features.

In the past we have always added a raw entry in cputable.c, see eg. the
Power9 DD 2.0, 2.1 entries.


True. But that won't work for PowerVM guests because the kernel overwrites
the "raw" mode cpu_features with the "architected" mode cpu_features.



Doing it here is not really safe, if you're running with an architected
PVR (via cpu-version property), you can't set the DD1 feature, because
you might be migrated to a future CPU that doesn't have the DD1 quirks.


Okay.. I suppose that means the kernel needs to check the real PVR every
time it encounters a spurious exception. i.e. instead of using

  if (cpu_has_feature(CPU_FTR_POWER10_DD1) && ...)

it needs to use

  if (mfspr(SPRN_PVR) == 0x800100 && ...)

in patch 2. Right?

Thanks,
Ravi


Re: [PATCH v2 1/2] powerpc: Introduce POWER10_DD1 feature

2020-10-23 Thread Ravi Bangoria




+static void __init fixup_cpu_features(void)
+{
+   unsigned long version = mfspr(SPRN_PVR);
+
+   if ((version & 0xffffffff) == 0x00800100)
+   cur_cpu_spec->cpu_features |= CPU_FTR_POWER10_DD1;
+}
+

I am just wondering why this is needed here, but the same thing is not
done for, say, CPU_FTR_POWER9_DD2_1?


When we don't use DT cpu_features (PowerVM / kvm guests), we call
identify_cpu() twice. First with Real PVR which sets "raw" cpu_spec
as cur_cpu_spec and then 2nd time with Logical PVR (0x0f...) which
(mostly) overwrites the cur_cpu_spec with "architected" mode cpu_spec.
I don't see DD version specific entries for "architected" mode in
cpu_specs[] for any previous processors. So I've introduced this
function to tweak cpu_features.

Though, I don't know why we don't have a similar thing for
CPU_FTR_POWER9_DD2_1. I'll have to check that.
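
To make the ordering concrete, a toy model of that boot flow (it assumes
identify_cpu() copies the matching cpu_specs[] entry over cur_cpu_spec,
which is what the real code mostly does; the bit values are made up):

#include <stdio.h>

static unsigned long cur_features;

static void toy_identify_cpu(unsigned long entry_features)
{
	cur_features = entry_features;	/* overwrite, losing earlier bits */
}

int main(void)
{
	toy_identify_cpu(0x10);	/* pass 1: raw PVR entry with a DD1 bit */
	toy_identify_cpu(0x00);	/* pass 2: architected entry, DD1 bit lost */
	cur_features |= 0x10;	/* fixup_cpu_features(): re-apply the bit */
	printf("features = %#lx\n", cur_features);
	return 0;
}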


And should we get a /* Power10 DD 1 */ added to cpu_specs[]?


IIUC, we don't need such an entry. For PowerVM / kvm guests, we overwrite
the cpu_spec, so the /* Power10 */ "raw" entry is sufficient. And for
baremetal, we don't use cpu_specs[] at all.

I think even for powernv, using dt features can be disabled on the
cmdline with dt_cpu_ftrs=off; cpu_specs[] will then be used.


Ok... with dt_cpu_ftrs=off, we seem to be using the raw mode cpu_specs[]
entry on baremetal. So yeah, I'll add a /* Power10 DD1 */ raw mode entry
into cpu_specs[].
Thanks for pointing it out.

-Ravi


Re: [PATCH v2 1/2] powerpc: Introduce POWER10_DD1 feature

2020-10-21 Thread Ravi Bangoria




On 10/22/20 10:41 AM, Jordan Niethe wrote:

On Thu, Oct 22, 2020 at 2:40 PM Ravi Bangoria
 wrote:


POWER10_DD1 feature flag will be needed while adding
conditional code that applies only for Power10 DD1.

Signed-off-by: Ravi Bangoria 
---
  arch/powerpc/include/asm/cputable.h | 8 ++--
  arch/powerpc/kernel/dt_cpu_ftrs.c   | 3 +++
  arch/powerpc/kernel/prom.c  | 9 +
  3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/cputable.h 
b/arch/powerpc/include/asm/cputable.h
index 93bc70d4c9a1..d486f56c0d33 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -216,6 +216,7 @@ static inline void cpu_feature_keys_init(void) { }
  #define CPU_FTR_P9_RADIX_PREFETCH_BUG  LONG_ASM_CONST(0x0002000000000000)
  #define CPU_FTR_ARCH_31                LONG_ASM_CONST(0x0004000000000000)
  #define CPU_FTR_DAWR1                  LONG_ASM_CONST(0x0008000000000000)
+#define CPU_FTR_POWER10_DD1             LONG_ASM_CONST(0x0010000000000000)

  #ifndef __ASSEMBLY__

@@ -479,6 +480,7 @@ static inline void cpu_feature_keys_init(void) { }
 CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_ARCH_207S | \
 CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | CPU_FTR_ARCH_31 | \
 CPU_FTR_DAWR | CPU_FTR_DAWR1)
+#define CPU_FTRS_POWER10_DD1   (CPU_FTRS_POWER10 | CPU_FTR_POWER10_DD1)
  #define CPU_FTRS_CELL  (CPU_FTR_LWSYNC | \
 CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
 CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \
@@ -497,14 +499,16 @@ static inline void cpu_feature_keys_init(void) { }
  #define CPU_FTRS_POSSIBLE  \
 (CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | CPU_FTRS_POWER8 | \
  CPU_FTR_ALTIVEC_COMP | CPU_FTR_VSX_COMP | CPU_FTRS_POWER9 | \
-CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
+CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
+CPU_FTRS_POWER10_DD1)
  #else
  #define CPU_FTRS_POSSIBLE  \
 (CPU_FTRS_PPC970 | CPU_FTRS_POWER5 | \
  CPU_FTRS_POWER6 | CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | \
  CPU_FTRS_POWER8 | CPU_FTRS_CELL | CPU_FTRS_PA6T | \
  CPU_FTR_VSX_COMP | CPU_FTR_ALTIVEC_COMP | CPU_FTRS_POWER9 | \
-CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
+CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
+CPU_FTRS_POWER10_DD1)
  #endif /* CONFIG_CPU_LITTLE_ENDIAN */
  #endif
  #else
diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c 
b/arch/powerpc/kernel/dt_cpu_ftrs.c
index 1098863e17ee..b2327f2967ff 100644
--- a/arch/powerpc/kernel/dt_cpu_ftrs.c
+++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
@@ -811,6 +811,9 @@ static __init void cpufeatures_cpu_quirks(void)
 }

 update_tlbie_feature_flag(version);
+
+   if ((version & 0xffffffff) == 0x00800100)
+   cur_cpu_spec->cpu_features |= CPU_FTR_POWER10_DD1;
  }

  static void __init cpufeatures_setup_finished(void)
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index c1545f22c077..c778c81284f7 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -305,6 +305,14 @@ static void __init check_cpu_feature_properties(unsigned 
long node)
 }
  }

+static void __init fixup_cpu_features(void)
+{
+   unsigned long version = mfspr(SPRN_PVR);
+
+   if ((version & 0xffffffff) == 0x00800100)
+   cur_cpu_spec->cpu_features |= CPU_FTR_POWER10_DD1;
+}
+

I am just wondering why this is needed here, but the same thing is not
done for, say, CPU_FTR_POWER9_DD2_1?


When we don't use DT cpu_features (PowerVM / kvm guests), we call
identify_cpu() twice. First with Real PVR which sets "raw" cpu_spec
as cur_cpu_spec and then 2nd time with Logical PVR (0x0f...) which
(mostly) overwrites the cur_cpu_spec with "architected" mode cpu_spec.
I don't see DD version specific entries for "architected" mode in
cpu_specs[] for any previous processors. So I've introduced this
function to tweak cpu_features.

Though, I don't know why we don't have a similar thing for
CPU_FTR_POWER9_DD2_1. I'll have to check that.


And should we get a /* Power10 DD 1 */ added to cpu_specs[]?


IIUC, we don't need such an entry. For PowerVM / kvm guests, we overwrite
the cpu_spec, so the /* Power10 */ "raw" entry is sufficient. And for
baremetal, we don't use cpu_specs[] at all.




  static int __init early_init_dt_scan_cpus(unsigned long node,
   const char *uname, int depth,
   void *data)
@@ -378,6 +386,7 @@ static int __init early_init_dt_scan_cpus(unsigned long 
node,

 check_cpu_feature_properties(node);
 check_cpu_pa_features(node);
+   fixup_cpu_features();
 }

 identical_pvr_fixup(node);
--
2.25.1



[PATCH v2 2/2] powerpc/watchpoint: Workaround P10 DD1 issue with VSX-32 byte instructions

2020-10-21 Thread Ravi Bangoria
POWER10 DD1 has an issue where it generates watchpoint exceptions when it
shouldn't. The conditions where this occurs are:

 - octword op
 - ending address of DAWR range is less than starting address of op
 - those addresses need to be in the same or in two consecutive 512B
   blocks
 - 'op address + 64B' generates an address that has a carry into bit
   52 (crosses 2K boundary)

Handle such spurious exceptions by considering them as extraneous and
emulating/single-stepping the instruction without generating an event.

Signed-off-by: Ravi Bangoria 
[Fixed build warning reported by kernel test robot]
Reported-by: kernel test robot 
---

Dependency: VSX-32 byte emulation support patches
  https://lore.kernel.org/r/20201011050908.72173-1-ravi.bango...@linux.ibm.com

 arch/powerpc/kernel/hw_breakpoint.c | 67 -
 1 file changed, 65 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/hw_breakpoint.c 
b/arch/powerpc/kernel/hw_breakpoint.c
index f4e8f21046f5..9e83dd3d2ec5 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -499,6 +499,11 @@ static bool is_larx_stcx_instr(int type)
return type == LARX || type == STCX;
 }
 
+static bool is_octword_vsx_instr(int type, int size)
+{
+   return ((type == LOAD_VSX || type == STORE_VSX) && size == 32);
+}
+
 /*
  * We've failed in reliably handling the hw-breakpoint. Unregister
  * it and throw a warning message to let the user know about it.
@@ -549,6 +554,58 @@ static bool stepping_handler(struct pt_regs *regs, struct 
perf_event **bp,
return true;
 }
 
+static void handle_p10dd1_spurious_exception(struct arch_hw_breakpoint **info,
+int *hit, unsigned long ea)
+{
+   int i;
+   unsigned long hw_end_addr;
+
+   /*
+* Handle spurious exception only when any bp_per_reg is set.
+* Otherwise this might be created by xmon and not actually a
+* spurious exception.
+*/
+   for (i = 0; i < nr_wp_slots(); i++) {
+   if (!info[i])
+   continue;
+
+   hw_end_addr = ALIGN(info[i]->address + info[i]->len, 
HW_BREAKPOINT_SIZE);
+
+   /*
+* Ending address of DAWR range is less than starting
+* address of op.
+*/
+   if ((hw_end_addr - 1) >= ea)
+   continue;
+
+   /*
+* Those addresses need to be in the same or in two
+* consecutive 512B blocks;
+*/
+   if (((hw_end_addr - 1) >> 10) != (ea >> 10))
+   continue;
+
+   /*
+* 'op address + 64B' generates an address that has a
+* carry into bit 52 (crosses 2K boundary).
+*/
+   if ((ea & 0x800) == ((ea + 64) & 0x800))
+   continue;
+
+   break;
+   }
+
+   if (i == nr_wp_slots())
+   return;
+
+   for (i = 0; i < nr_wp_slots(); i++) {
+   if (info[i]) {
+   hit[i] = 1;
+   info[i]->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ;
+   }
+   }
+}
+
 int hw_breakpoint_handler(struct die_args *args)
 {
bool err = false;
@@ -607,8 +664,14 @@ int hw_breakpoint_handler(struct die_args *args)
goto reset;
 
if (!nr_hit) {
-   rc = NOTIFY_DONE;
-   goto out;
+   if (cpu_has_feature(CPU_FTR_POWER10_DD1) &&
+   !IS_ENABLED(CONFIG_PPC_8xx) &&
+   is_octword_vsx_instr(type, size)) {
+   handle_p10dd1_spurious_exception(info, hit, ea);
+   } else {
+   rc = NOTIFY_DONE;
+   goto out;
+   }
}
 
/*
-- 
2.25.1



[PATCH v2 1/2] powerpc: Introduce POWER10_DD1 feature

2020-10-21 Thread Ravi Bangoria
POWER10_DD1 feature flag will be needed while adding
conditional code that applies only for Power10 DD1.

Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/include/asm/cputable.h | 8 ++--
 arch/powerpc/kernel/dt_cpu_ftrs.c   | 3 +++
 arch/powerpc/kernel/prom.c  | 9 +
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/cputable.h 
b/arch/powerpc/include/asm/cputable.h
index 93bc70d4c9a1..d486f56c0d33 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -216,6 +216,7 @@ static inline void cpu_feature_keys_init(void) { }
 #define CPU_FTR_P9_RADIX_PREFETCH_BUG  LONG_ASM_CONST(0x0002000000000000)
 #define CPU_FTR_ARCH_31                LONG_ASM_CONST(0x0004000000000000)
 #define CPU_FTR_DAWR1                  LONG_ASM_CONST(0x0008000000000000)
+#define CPU_FTR_POWER10_DD1            LONG_ASM_CONST(0x0010000000000000)
 
 #ifndef __ASSEMBLY__
 
@@ -479,6 +480,7 @@ static inline void cpu_feature_keys_init(void) { }
CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_ARCH_207S | \
CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | CPU_FTR_ARCH_31 | \
CPU_FTR_DAWR | CPU_FTR_DAWR1)
+#define CPU_FTRS_POWER10_DD1   (CPU_FTRS_POWER10 | CPU_FTR_POWER10_DD1)
 #define CPU_FTRS_CELL  (CPU_FTR_LWSYNC | \
CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \
@@ -497,14 +499,16 @@ static inline void cpu_feature_keys_init(void) { }
 #define CPU_FTRS_POSSIBLE  \
(CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | CPU_FTRS_POWER8 | \
 CPU_FTR_ALTIVEC_COMP | CPU_FTR_VSX_COMP | CPU_FTRS_POWER9 | \
-CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
+CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
+CPU_FTRS_POWER10_DD1)
 #else
 #define CPU_FTRS_POSSIBLE  \
(CPU_FTRS_PPC970 | CPU_FTRS_POWER5 | \
 CPU_FTRS_POWER6 | CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | \
 CPU_FTRS_POWER8 | CPU_FTRS_CELL | CPU_FTRS_PA6T | \
 CPU_FTR_VSX_COMP | CPU_FTR_ALTIVEC_COMP | CPU_FTRS_POWER9 | \
-CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
+CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
+CPU_FTRS_POWER10_DD1)
 #endif /* CONFIG_CPU_LITTLE_ENDIAN */
 #endif
 #else
diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c 
b/arch/powerpc/kernel/dt_cpu_ftrs.c
index 1098863e17ee..b2327f2967ff 100644
--- a/arch/powerpc/kernel/dt_cpu_ftrs.c
+++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
@@ -811,6 +811,9 @@ static __init void cpufeatures_cpu_quirks(void)
}
 
update_tlbie_feature_flag(version);
+
+   if ((version & 0xffffffff) == 0x00800100)
+   cur_cpu_spec->cpu_features |= CPU_FTR_POWER10_DD1;
 }
 
 static void __init cpufeatures_setup_finished(void)
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index c1545f22c077..c778c81284f7 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -305,6 +305,14 @@ static void __init check_cpu_feature_properties(unsigned 
long node)
}
 }
 
+static void __init fixup_cpu_features(void)
+{
+   unsigned long version = mfspr(SPRN_PVR);
+
+   if ((version & 0xffffffff) == 0x00800100)
+   cur_cpu_spec->cpu_features |= CPU_FTR_POWER10_DD1;
+}
+
 static int __init early_init_dt_scan_cpus(unsigned long node,
  const char *uname, int depth,
  void *data)
@@ -378,6 +386,7 @@ static int __init early_init_dt_scan_cpus(unsigned long 
node,
 
check_cpu_feature_properties(node);
check_cpu_pa_features(node);
+   fixup_cpu_features();
}
 
identical_pvr_fixup(node);
-- 
2.25.1



[PATCH 1/2] powerpc: Introduce POWER10_DD1 feature

2020-10-19 Thread Ravi Bangoria
POWER10_DD1 feature flag will be needed while adding
conditional code that applies only for Power10 DD1.

Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/include/asm/cputable.h | 8 ++--
 arch/powerpc/kernel/dt_cpu_ftrs.c   | 3 +++
 arch/powerpc/kernel/prom.c  | 9 +
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/cputable.h 
b/arch/powerpc/include/asm/cputable.h
index 93bc70d4c9a1..d486f56c0d33 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -216,6 +216,7 @@ static inline void cpu_feature_keys_init(void) { }
 #define CPU_FTR_P9_RADIX_PREFETCH_BUG  LONG_ASM_CONST(0x0002000000000000)
 #define CPU_FTR_ARCH_31                LONG_ASM_CONST(0x0004000000000000)
 #define CPU_FTR_DAWR1                  LONG_ASM_CONST(0x0008000000000000)
+#define CPU_FTR_POWER10_DD1            LONG_ASM_CONST(0x0010000000000000)
 
 #ifndef __ASSEMBLY__
 
@@ -479,6 +480,7 @@ static inline void cpu_feature_keys_init(void) { }
CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_ARCH_207S | \
CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | CPU_FTR_ARCH_31 | \
CPU_FTR_DAWR | CPU_FTR_DAWR1)
+#define CPU_FTRS_POWER10_DD1   (CPU_FTRS_POWER10 | CPU_FTR_POWER10_DD1)
 #define CPU_FTRS_CELL  (CPU_FTR_LWSYNC | \
CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \
@@ -497,14 +499,16 @@ static inline void cpu_feature_keys_init(void) { }
 #define CPU_FTRS_POSSIBLE  \
(CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | CPU_FTRS_POWER8 | \
 CPU_FTR_ALTIVEC_COMP | CPU_FTR_VSX_COMP | CPU_FTRS_POWER9 | \
-CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
+CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
+CPU_FTRS_POWER10_DD1)
 #else
 #define CPU_FTRS_POSSIBLE  \
(CPU_FTRS_PPC970 | CPU_FTRS_POWER5 | \
 CPU_FTRS_POWER6 | CPU_FTRS_POWER7 | CPU_FTRS_POWER8E | \
 CPU_FTRS_POWER8 | CPU_FTRS_CELL | CPU_FTRS_PA6T | \
 CPU_FTR_VSX_COMP | CPU_FTR_ALTIVEC_COMP | CPU_FTRS_POWER9 | \
-CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10)
+CPU_FTRS_POWER9_DD2_1 | CPU_FTRS_POWER9_DD2_2 | CPU_FTRS_POWER10 | \
+CPU_FTRS_POWER10_DD1)
 #endif /* CONFIG_CPU_LITTLE_ENDIAN */
 #endif
 #else
diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c 
b/arch/powerpc/kernel/dt_cpu_ftrs.c
index 1098863e17ee..b2327f2967ff 100644
--- a/arch/powerpc/kernel/dt_cpu_ftrs.c
+++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
@@ -811,6 +811,9 @@ static __init void cpufeatures_cpu_quirks(void)
}
 
update_tlbie_feature_flag(version);
+
+   if ((version & 0xffffffff) == 0x00800100)
+   cur_cpu_spec->cpu_features |= CPU_FTR_POWER10_DD1;
 }
 
 static void __init cpufeatures_setup_finished(void)
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index c1545f22c077..c778c81284f7 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -305,6 +305,14 @@ static void __init check_cpu_feature_properties(unsigned 
long node)
}
 }
 
+static void __init fixup_cpu_features(void)
+{
+   unsigned long version = mfspr(SPRN_PVR);
+
+   if ((version & 0xffffffff) == 0x00800100)
+   cur_cpu_spec->cpu_features |= CPU_FTR_POWER10_DD1;
+}
+
 static int __init early_init_dt_scan_cpus(unsigned long node,
  const char *uname, int depth,
  void *data)
@@ -378,6 +386,7 @@ static int __init early_init_dt_scan_cpus(unsigned long 
node,
 
check_cpu_feature_properties(node);
check_cpu_pa_features(node);
+   fixup_cpu_features();
}
 
identical_pvr_fixup(node);
-- 
2.25.1



[PATCH 2/2] powerpc/watchpoint: Workaround P10 DD1 issue with VSX-32 byte instructions

2020-10-19 Thread Ravi Bangoria
POWER10 DD1 has an issue where it generates watchpoint exceptions when it
shouldn't. The conditions where this occurs are:

 - octword op
 - ending address of DAWR range is less than starting address of op
 - those addresses need to be in the same or in two consecutive 512B
   blocks
 - 'op address + 64B' generates an address that has a carry into bit
   52 (crosses 2K boundary)

Handle such spurious exceptions by considering them as extraneous and
emulating/single-stepping the instruction without generating an event.

Signed-off-by: Ravi Bangoria 
---

Dependency: VSX-32 byte emulation support patches
  https://lore.kernel.org/r/20201011050908.72173-1-ravi.bango...@linux.ibm.com

 arch/powerpc/kernel/hw_breakpoint.c | 69 -
 1 file changed, 67 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/hw_breakpoint.c 
b/arch/powerpc/kernel/hw_breakpoint.c
index f4e8f21046f5..4514745d27c3 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -499,6 +499,11 @@ static bool is_larx_stcx_instr(int type)
return type == LARX || type == STCX;
 }
 
+static bool is_octword_vsx_instr(int type, int size)
+{
+   return ((type == LOAD_VSX || type == STORE_VSX) && size == 32);
+}
+
 /*
  * We've failed in reliably handling the hw-breakpoint. Unregister
  * it and throw a warning message to let the user know about it.
@@ -549,6 +554,60 @@ static bool stepping_handler(struct pt_regs *regs, struct 
perf_event **bp,
return true;
 }
 
+static void handle_p10dd1_spurious_exception(struct arch_hw_breakpoint **info,
+int *hit, unsigned long ea)
+{
+   int i;
+   unsigned long hw_start_addr;
+   unsigned long hw_end_addr;
+
+   /*
+* Handle spurious exception only when any bp_per_reg is set.
+* Otherwise this might be created by xmon and not actually a
+* spurious exception.
+*/
+   for (i = 0; i < nr_wp_slots(); i++) {
+   if (!info[i])
+   continue;
+
+   hw_start_addr = ALIGN_DOWN(info[i]->address, 
HW_BREAKPOINT_SIZE);
+   hw_end_addr = ALIGN(info[i]->address + info[i]->len, 
HW_BREAKPOINT_SIZE);
+
+   /*
+* Ending address of DAWR range is less than starting
+* address of op.
+*/
+   if ((hw_end_addr - 1) >= ea)
+   continue;
+
+   /*
+* Those addresses need to be in the same or in two
+* consecutive 512B blocks;
+*/
+   if (((hw_end_addr - 1) >> 10) != (ea >> 10))
+   continue;
+
+   /*
+* 'op address + 64B' generates an address that has a
+* carry into bit 52 (crosses 2K boundary).
+*/
+   if ((ea & 0x800) == ((ea + 64) & 0x800))
+   continue;
+
+   break;
+   }
+
+   if (i == nr_wp_slots())
+   return;
+
+   for (i = 0; i < nr_wp_slots(); i++) {
+   if (info[i]) {
+   hit[i] = 1;
+   info[i]->type |= HW_BRK_TYPE_EXTRANEOUS_IRQ;
+   }
+   }
+}
+
 int hw_breakpoint_handler(struct die_args *args)
 {
bool err = false;
@@ -607,8 +666,14 @@ int hw_breakpoint_handler(struct die_args *args)
goto reset;
 
if (!nr_hit) {
-   rc = NOTIFY_DONE;
-   goto out;
+   if (cpu_has_feature(CPU_FTR_POWER10_DD1) &&
+   !IS_ENABLED(CONFIG_PPC_8xx) &&
+   is_octword_vsx_instr(type, size)) {
+   handle_p10dd1_spurious_exception(info, hit, ea);
+   } else {
+   rc = NOTIFY_DONE;
+   goto out;
+   }
}
 
/*
-- 
2.25.1



Re: [PATCH v5 1/5] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set

2020-10-14 Thread Ravi Bangoria

Hi Daniel,

On 10/12/20 7:14 PM, Daniel Axtens wrote:

Hi,

To review this, I looked through the supported instructions to see if
there were any that I thought might have been missed.

I didn't find any other v3.1 ones, although I don't have a v3.1 ISA to
hand so I was basically looking for instructions I didn't recognise.

I did, however, find a number of instructions that are new in ISA 3.0
that aren't guarded:

  - addpcis
  - lxvl/stxvl
  - lxvll/stxvll
  - lxvwsx
  - stxvx
  - lxsipzx
  - lxvh8x
  - lxsihzx
  - lxvb16x/stxvb16x
  - stxsibx
  - stxsihx
  - lxvb16x
  - lxsd/stxsd
  - lxssp/stxssp
  - lxv/stxv
  
Also, I don't know how bothered we are about P7, but if I'm reading the
ISA correctly, lqarx/stqcx. are not supported before ISA 2.07. Likewise
a number of the vector instructions like lxsiwzx and lxsiwax (and the
companion stores).

I realise it's not really the point of this particular patch, so I don't
think this should block acceptance. What I would like to know - and
maybe this is something where we need mpe to weigh in - is whether we
need consistent guards for 2.07 and 3.0. We have some 3.0 guards already
but clearly not everywhere.


Yes, those need to be handled properly. Otherwise they can be harmful
when emulated on a non-supporting platform. Will work on it when I get
some time. Thanks for reporting it.



With all that said - the patch does what it says it does, and looks good
to me:

Reviewed-by: Daniel Axtens 


Thanks!
Ravi


Re: [PATCH v5 1/5] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set

2020-10-12 Thread Ravi Bangoria

Hi Daniel,

On 10/12/20 7:21 AM, Daniel Axtens wrote:

Hi,

Apologies if this has come up in a previous revision.



case 1:
+   if (!cpu_has_feature(CPU_FTR_ARCH_31))
+   return -1;
+
prefix_r = GET_PREFIX_R(word);
ra = GET_PREFIX_RA(suffix);


The comment above analyse_instr reads in part:

  * Return value is 1 if the instruction can be emulated just by
  * updating *regs with the information in *op, -1 if we need the
  * GPRs but *regs doesn't contain the full register set, or 0
  * otherwise.

I was wondering why returning -1 if the instruction isn't supported is
the right thing to do - it seemed to me that it should return 0?

I did look and see that there are other cases where the code returns -1
for an unsupported operation, e.g.:

#ifdef __powerpc64__
case 4:
if (!cpu_has_feature(CPU_FTR_ARCH_300))
return -1;

switch (word & 0x3f) {
case 48:/* maddhd */

That's from commit 930d6288a267 ("powerpc: sstep: Add support for
maddhd, maddhdu, maddld instructions"), but it's not explained there either

I see the same pattern in a number of commits: commit 6324320de609
("powerpc sstep: Add support for modsd, modud instructions"), commit
6c180071509a ("powerpc sstep: Add support for modsw, moduw
instructions"), commit a23987ef267a ("powerpc: sstep: Add support for
darn instruction") and a few others, all of which seem to have come
through Sandipan in February 2019. I haven't spotted any explanation for
why -1 was chosen, but I haven't checked the mailing list archives.

If -1 is OK, would it be possible to update the comment to explain why?


Agreed. As you rightly pointed out, there are many more such cases and, yes,
we are aware of this issue and it's being worked upon as a separate patch.
(Sandipan did send a fix patch internally some time back).

Thanks for pointing it out!
Ravi


[PATCH v5 5/5] powerpc/sstep: Add testcases for VSX vector paired load/store instructions

2020-10-10 Thread Ravi Bangoria
From: Balamuruhan S 

Add testcases for VSX vector paired load/store instructions.
Sample o/p:

  emulate_step_test: lxvp   : PASS
  emulate_step_test: stxvp  : PASS
  emulate_step_test: lxvpx  : PASS
  emulate_step_test: stxvpx : PASS
  emulate_step_test: plxvp  : PASS
  emulate_step_test: pstxvp : PASS

Signed-off-by: Balamuruhan S 
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/lib/test_emulate_step.c | 270 +++
 1 file changed, 270 insertions(+)

diff --git a/arch/powerpc/lib/test_emulate_step.c 
b/arch/powerpc/lib/test_emulate_step.c
index 0a201b771477..783d1b85ecfe 100644
--- a/arch/powerpc/lib/test_emulate_step.c
+++ b/arch/powerpc/lib/test_emulate_step.c
@@ -612,6 +612,273 @@ static void __init test_lxvd2x_stxvd2x(void)
 }
 #endif /* CONFIG_VSX */
 
+#ifdef CONFIG_VSX
+static void __init test_lxvp_stxvp(void)
+{
+   struct pt_regs regs;
+   union {
+   vector128 a;
+   u32 b[4];
+   } c[2];
+   u32 cached_b[8];
+   int stepped = -1;
+
+   if (!cpu_has_feature(CPU_FTR_ARCH_31)) {
+   show_result("lxvp", "SKIP (!CPU_FTR_ARCH_31)");
+   show_result("stxvp", "SKIP (!CPU_FTR_ARCH_31)");
+   return;
+   }
+
+   init_pt_regs(&regs);
+
+   /*** lxvp ***/
+
+   cached_b[0] = c[0].b[0] = 18233;
+   cached_b[1] = c[0].b[1] = 34863571;
+   cached_b[2] = c[0].b[2] = 834;
+   cached_b[3] = c[0].b[3] = 6138911;
+   cached_b[4] = c[1].b[0] = 1234;
+   cached_b[5] = c[1].b[1] = 5678;
+   cached_b[6] = c[1].b[2] = 91011;
+   cached_b[7] = c[1].b[3] = 121314;
+
+   regs.gpr[4] = (unsigned long)&c[0].a;
+
+   /*
+* lxvp XTp,DQ(RA)
+* XTp = 32xTX + 2xTp
+* let TX=1 Tp=1 RA=4 DQ=0
+*/
+   stepped = emulate_step(&regs, ppc_inst(PPC_RAW_LXVP(34, 4, 0)));
+
+   if (stepped == 1 && cpu_has_feature(CPU_FTR_VSX)) {
+   show_result("lxvp", "PASS");
+   } else {
+   if (!cpu_has_feature(CPU_FTR_VSX))
+   show_result("lxvp", "PASS (!CPU_FTR_VSX)");
+   else
+   show_result("lxvp", "FAIL");
+   }
+
+   /*** stxvp ***/
+
+   c[0].b[0] = 21379463;
+   c[0].b[1] = 87;
+   c[0].b[2] = 374234;
+   c[0].b[3] = 4;
+   c[1].b[0] = 90;
+   c[1].b[1] = 122;
+   c[1].b[2] = 555;
+   c[1].b[3] = 32144;
+
+   /*
+* stxvp XSp,DQ(RA)
+* XSp = 32xSX + 2xSp
+* let SX=1 Sp=1 RA=4 DQ=0
+*/
+   stepped = emulate_step(&regs, ppc_inst(PPC_RAW_STXVP(34, 4, 0)));
+
+   if (stepped == 1 && cached_b[0] == c[0].b[0] && cached_b[1] == 
c[0].b[1] &&
+   cached_b[2] == c[0].b[2] && cached_b[3] == c[0].b[3] &&
+   cached_b[4] == c[1].b[0] && cached_b[5] == c[1].b[1] &&
+   cached_b[6] == c[1].b[2] && cached_b[7] == c[1].b[3] &&
+   cpu_has_feature(CPU_FTR_VSX)) {
+   show_result("stxvp", "PASS");
+   } else {
+   if (!cpu_has_feature(CPU_FTR_VSX))
+   show_result("stxvp", "PASS (!CPU_FTR_VSX)");
+   else
+   show_result("stxvp", "FAIL");
+   }
+}
+#else
+static void __init test_lxvp_stxvp(void)
+{
+   show_result("lxvp", "SKIP (CONFIG_VSX is not set)");
+   show_result("stxvp", "SKIP (CONFIG_VSX is not set)");
+}
+#endif /* CONFIG_VSX */
+
+#ifdef CONFIG_VSX
+static void __init test_lxvpx_stxvpx(void)
+{
+   struct pt_regs regs;
+   union {
+   vector128 a;
+   u32 b[4];
+   } c[2];
+   u32 cached_b[8];
+   int stepped = -1;
+
+   if (!cpu_has_feature(CPU_FTR_ARCH_31)) {
+   show_result("lxvpx", "SKIP (!CPU_FTR_ARCH_31)");
+   show_result("stxvpx", "SKIP (!CPU_FTR_ARCH_31)");
+   return;
+   }
+
+   init_pt_regs(&regs);
+
+   /*** lxvpx ***/
+
+   cached_b[0] = c[0].b[0] = 18233;
+   cached_b[1] = c[0].b[1] = 34863571;
+   cached_b[2] = c[0].b[2] = 834;
+   cached_b[3] = c[0].b[3] = 6138911;
+   cached_b[4] = c[1].b[0] = 1234;
+   cached_b[5] = c[1].b[1] = 5678;
+   cached_b[6] = c[1].b[2] = 91011;
+   cached_b[7] = c[1].b[3] = 121314;
+
+   regs.gpr[3] = (unsigned long)&c[0].a;
+   regs.gpr[4] = 0;
+
+   /*
+* lxvpx XTp,RA,RB
+* XTp = 32xTX + 2xTp
+* let TX=1 Tp=1 RA=3 RB=4
+*/
+   stepped = emulate_step(&regs, ppc_inst(PPC_RAW_LXVPX(34, 3, 4)));
+
+   if (stepped == 1 && cpu_has_feature(CPU_FTR_VSX)) {
+   show_result("lxvpx", "PASS");

[PATCH v5 4/5] powerpc/ppc-opcode: Add encoding macros for VSX vector paired instructions

2020-10-10 Thread Ravi Bangoria
From: Balamuruhan S 

Add instruction encodings, and the DQ, D0, D1 immediate and XTP, XSP
operand fields, as macros for the new VSX vector paired instructions
(a small worked encoding example follows the list):
  * Load VSX Vector Paired (lxvp)
  * Load VSX Vector Paired Indexed (lxvpx)
  * Prefixed Load VSX Vector Paired (plxvp)
  * Store VSX Vector Paired (stxvp)
  * Store VSX Vector Paired Indexed (stxvpx)
  * Prefixed Store VSX Vector Paired (pstxvp)
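
A small worked example of the word-form encoding (standalone; the macro
bodies are copied from the diff below, and 0x18640000 is just the
arithmetic result, not taken from any external table):

#include <stdio.h>
#include <stdint.h>

/* XTp register = 32*TX + 2*Tp, stored as the 5-bit field Tp||TX at bit 21 */
#define XTP(s)	((((s) & 0x1e) | (((s) >> 5) & 0x1)) << 21)
#define RA(a)	(((a) & 0x1f) << 16)

int main(void)
{
	uint32_t insn = 0x18000000 | XTP(34) | RA(4);	/* lxvp vs34,0(r4) */
	printf("lxvp vs34,0(r4) = 0x%08x\n", insn);	/* 0x18640000 */
	return 0;
}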

Suggested-by: Naveen N. Rao 
Signed-off-by: Balamuruhan S 
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/include/asm/ppc-opcode.h | 13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index a6e3700c4566..5e7918ca4fb7 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -78,6 +78,9 @@
 
 #define IMM_L(i)   ((uintptr_t)(i) & 0xffff)
 #define IMM_DS(i)  ((uintptr_t)(i) & 0xfffc)
+#define IMM_DQ(i)  ((uintptr_t)(i) & 0xfff0)
+#define IMM_D0(i)  (((uintptr_t)(i) >> 16) & 0x3ffff)
+#define IMM_D1(i)  IMM_L(i)
 
 /*
  * 16-bit immediate helper macros: HA() is for use with sign-extending instrs
@@ -295,6 +298,8 @@
 #define __PPC_XB(b)	((((b) & 0x1f) << 11) | (((b) & 0x20) >> 4))
 #define __PPC_XS(s)	((((s) & 0x1f) << 21) | (((s) & 0x20) >> 5))
 #define __PPC_XT(s)	__PPC_XS(s)
+#define __PPC_XSP(s)	((((s) & 0x1e) | (((s) >> 5) & 0x1)) << 21)
+#define __PPC_XTP(s)	__PPC_XSP(s)
 #define __PPC_T_TLB(t) (((t) & 0x3) << 21)
 #define __PPC_WC(w)(((w) & 0x3) << 21)
 #define __PPC_WS(w)(((w) & 0x1f) << 11)
@@ -395,6 +400,14 @@
 #define PPC_RAW_XVCPSGNDP(t, a, b) ((0xf780 | VSX_XX3((t), (a), (b
 #define PPC_RAW_VPERMXOR(vrt, vra, vrb, vrc) \
((0x102d | ___PPC_RT(vrt) | ___PPC_RA(vra) | ___PPC_RB(vrb) | 
(((vrc) & 0x1f) << 6)))
+#define PPC_RAW_LXVP(xtp, a, i)	(0x18000000 | __PPC_XTP(xtp) | ___PPC_RA(a) | IMM_DQ(i))
+#define PPC_RAW_STXVP(xsp, a, i)	(0x18000001 | __PPC_XSP(xsp) | ___PPC_RA(a) | IMM_DQ(i))
+#define PPC_RAW_LXVPX(xtp, a, b)	(0x7c00029a | __PPC_XTP(xtp) | ___PPC_RA(a) | ___PPC_RB(b))
+#define PPC_RAW_STXVPX(xsp, a, b)	(0x7c00039a | __PPC_XSP(xsp) | ___PPC_RA(a) | ___PPC_RB(b))
+#define PPC_RAW_PLXVP(xtp, i, a, pr) \
+	((PPC_PREFIX_8LS | __PPC_PRFX_R(pr) | IMM_D0(i)) << 32 | (0xe8000000 | __PPC_XTP(xtp) | ___PPC_RA(a) | IMM_D1(i)))
+#define PPC_RAW_PSTXVP(xsp, i, a, pr) \
+	((PPC_PREFIX_8LS | __PPC_PRFX_R(pr) | IMM_D0(i)) << 32 | (0xf8000000 | __PPC_XSP(xsp) | ___PPC_RA(a) | IMM_D1(i)))
 #define PPC_RAW_NAP(0x4c000364)
 #define PPC_RAW_SLEEP  (0x4c0003a4)
 #define PPC_RAW_WINKLE (0x4c0003e4)
-- 
2.26.2



[PATCH v5 3/5] powerpc/sstep: Support VSX vector paired storage access instructions

2020-10-10 Thread Ravi Bangoria
From: Balamuruhan S 

VSX Vector Paired instructions load/store an octword (32 bytes)
from/to storage into two sequential VSRs (see the short sketch after
this list). Add emulation support for these new instructions:
  * Load VSX Vector Paired (lxvp)
  * Load VSX Vector Paired Indexed (lxvpx)
  * Prefixed Load VSX Vector Paired (plxvp)
  * Store VSX Vector Paired (stxvp)
  * Store VSX Vector Paired Indexed (stxvpx)
  * Prefixed Store VSX Vector Paired (pstxvp)
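
A simplified sketch of the octword-to-VSR mapping (little-endian case
only; it ignores the per-element byte reversal that emulate_vsx_load()
also performs):

#include <stdio.h>
#include <string.h>

int main(void)
{
	unsigned char mem[32];		/* octword in storage order */
	unsigned char vsr[2][16];	/* stand-ins for VSR[reg], VSR[reg+1] */
	int nr = 2, i, j;

	for (i = 0; i < 32; i++)
		mem[i] = i;
	for (i = 0; i < nr; i++) {
		j = nr - i - 1;		/* mirrors "j = IS_LE ? nr - i - 1 : i" */
		memcpy(vsr[i], mem + j * 16, 16);
	}
	printf("VSR[reg] starts with byte %d\n", vsr[0][0]);	/* prints 16 */
	return 0;
}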

Suggested-by: Naveen N. Rao 
Signed-off-by: Balamuruhan S 
Signed-off-by: Ravi Bangoria 
[kernel test robot reported a build failure]
Reported-by: kernel test robot 
---
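The subtle part of the diff below is the buffer index
j = IS_LE ? nr_vsx_regs - i - 1 : i. A minimal sketch of the idea, assuming
it mirrors the do_vsx_load() hunk (this is not additional patch code): the
emulation buffer is filled in memory order, but on little-endian the paired
instruction places the two 16-byte halves in the opposite registers of the
pair, so the buffer index is reversed.

    /* sketch: distribute a 32-byte buffer across a VSR pair */
    nr_vsx_regs = size / sizeof(__vector128);       /* 2 for [p]lxvp[x] */
    for (i = 0; i < nr_vsx_regs; i++) {
            j = IS_LE ? nr_vsx_regs - i - 1 : i;    /* swap halves on LE */
            load_vsrn(reg + i, &buf[j].v);
    }
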
 arch/powerpc/lib/sstep.c | 150 +--
 1 file changed, 129 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index faf0bbf3efb7..96ca813a65e7 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -32,6 +32,10 @@ extern char system_call_vectored_emulate[];
 #define XER_OV32   0x00080000U
 #define XER_CA32   0x00040000U
 
+#ifdef CONFIG_VSX
+#define VSX_REGISTER_XTP(rd)   ((((rd) & 1) << 5) | ((rd) & 0xfe))
+#endif
+
 #ifdef CONFIG_PPC_FPU
 /*
  * Functions in ldstfp.S
@@ -279,6 +283,19 @@ static nokprobe_inline void do_byte_reverse(void *ptr, int 
nb)
up[1] = tmp;
break;
}
+   case 32: {
+   unsigned long *up = (unsigned long *)ptr;
+   unsigned long tmp;
+
+   tmp = byterev_8(up[0]);
+   up[0] = byterev_8(up[3]);
+   up[3] = tmp;
+   tmp = byterev_8(up[2]);
+   up[2] = byterev_8(up[1]);
+   up[1] = tmp;
+   break;
+   }
+
 #endif
default:
WARN_ON_ONCE(1);
@@ -709,6 +726,8 @@ void emulate_vsx_load(struct instruction_op *op, union 
vsx_reg *reg,
reg->d[0] = reg->d[1] = 0;
 
switch (op->element_size) {
+   case 32:
+   /* [p]lxvp[x] */
case 16:
/* whole vector; lxv[x] or lxvl[l] */
if (size == 0)
@@ -717,7 +736,7 @@ void emulate_vsx_load(struct instruction_op *op, union 
vsx_reg *reg,
if (IS_LE && (op->vsx_flags & VSX_LDLEFT))
rev = !rev;
if (rev)
-   do_byte_reverse(reg, 16);
+   do_byte_reverse(reg, size);
break;
case 8:
/* scalar loads, lxvd2x, lxvdsx */
@@ -793,6 +812,20 @@ void emulate_vsx_store(struct instruction_op *op, const 
union vsx_reg *reg,
size = GETSIZE(op->type);
 
switch (op->element_size) {
+   case 32:
+   /* [p]stxvp[x] */
+   if (size == 0)
+   break;
+   if (rev) {
+   /* reverse 32 bytes */
+   buf.d[0] = byterev_8(reg->d[3]);
+   buf.d[1] = byterev_8(reg->d[2]);
+   buf.d[2] = byterev_8(reg->d[1]);
+   buf.d[3] = byterev_8(reg->d[0]);
+   reg = &buf;
+   }
+   memcpy(mem, reg, size);
+   break;
case 16:
/* stxv, stxvx, stxvl, stxvll */
if (size == 0)
@@ -861,28 +894,43 @@ static nokprobe_inline int do_vsx_load(struct 
instruction_op *op,
   bool cross_endian)
 {
int reg = op->reg;
-   u8 mem[16];
-   union vsx_reg buf;
+   int i, j, nr_vsx_regs;
+   u8 mem[32];
+   union vsx_reg buf[2];
int size = GETSIZE(op->type);
 
if (!address_ok(regs, ea, size) || copy_mem_in(mem, ea, size, regs))
return -EFAULT;
 
-   emulate_vsx_load(op, &buf, mem, cross_endian);
+   nr_vsx_regs = size / sizeof(__vector128);
+   emulate_vsx_load(op, buf, mem, cross_endian);
preempt_disable();
if (reg < 32) {
/* FP regs + extensions */
if (regs->msr & MSR_FP) {
-   load_vsrn(reg, &buf);
+   for (i = 0; i < nr_vsx_regs; i++) {
+   j = IS_LE ? nr_vsx_regs - i - 1 : i;
+   load_vsrn(reg + i, &buf[j].v);
+   }
} else {
-   current->thread.fp_state.fpr[reg][0] = buf.d[0];
-   current->thread.fp_state.fpr[reg][1] = buf.d[1];
+   for (i = 0; i < nr_vsx_regs; i++) {
+   j = IS_LE ? nr_vsx_regs - i - 1 : i;
+   current->thread.fp_state.fpr[reg + i][0] = buf[j].d[0];
+   current->thread.fp_state.fpr[reg + i][1] = buf[j].d[1];
+   }
}
} else {
-   if (regs->msr & MSR_VEC)
-   load_vsrn(reg, &buf);

[PATCH v5 2/5] powerpc/sstep: Cover new VSX instructions under CONFIG_VSX

2020-10-10 Thread Ravi Bangoria
Recently added Power10 prefixed VSX instructions are included
unconditionally in the kernel. If they are executed on a
machine without VSX support, it might create issues. Fix that.
Also fix one mnemonic spelling mistake in a comment.

Fixes: 50b80a12e4cc ("powerpc sstep: Add support for prefixed load/stores")
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/lib/sstep.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index e6242744c71b..faf0bbf3efb7 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -2757,6 +2757,7 @@ int analyse_instr(struct instruction_op *op, const struct 
pt_regs *regs,
case 41:/* plwa */
op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 4);
break;
+#ifdef CONFIG_VSX
case 42:/* plxsd */
op->reg = rd + 32;
op->type = MKOP(LOAD_VSX, PREFIXED, 8);
@@ -2797,13 +2798,14 @@ int analyse_instr(struct instruction_op *op, const 
struct pt_regs *regs,
op->element_size = 16;
op->vsx_flags = VSX_CHECK_VEC;
break;
+#endif /* CONFIG_VSX */
case 56:/* plq */
op->type = MKOP(LOAD, PREFIXED, 16);
break;
case 57:/* pld */
op->type = MKOP(LOAD, PREFIXED, 8);
break;
-   case 60:/* stq */
+   case 60:/* pstq */
op->type = MKOP(STORE, PREFIXED, 16);
break;
case 61:/* pstd */
-- 
2.26.2



[PATCH v5 1/5] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set

2020-10-10 Thread Ravi Bangoria
From: Balamuruhan S 

Unconditionally emulating prefixed instructions would allow them
to be emulated on Power10 predecessors, which might cause issues.
Restrict that.

Fixes: 3920742b92f5 ("powerpc sstep: Add support for prefixed fixed-point 
arithmetic")
Fixes: 50b80a12e4cc ("powerpc sstep: Add support for prefixed load/stores")
Signed-off-by: Balamuruhan S 
Signed-off-by: Ravi Bangoria 
---
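For context: every ISA 3.1 prefixed form lives under primary opcode 1, so a
single feature check per switch covers all of them. A condensed sketch of
the pattern (names as in sstep.c; illustrative only):

    switch (opcode) {
    case 1:         /* all prefixed instructions land here */
            if (!cpu_has_feature(CPU_FTR_ARCH_31))
                    return -1;      /* pre-Power10: refuse to emulate */
            /* ... decode prefix word + suffix word ... */
    }
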
 arch/powerpc/lib/sstep.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index e9dcaba9a4f8..e6242744c71b 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -1346,6 +1346,9 @@ int analyse_instr(struct instruction_op *op, const struct 
pt_regs *regs,
switch (opcode) {
 #ifdef __powerpc64__
case 1:
+   if (!cpu_has_feature(CPU_FTR_ARCH_31))
+   return -1;
+
prefix_r = GET_PREFIX_R(word);
ra = GET_PREFIX_RA(suffix);
rd = (suffix >> 21) & 0x1f;
@@ -2733,6 +2736,9 @@ int analyse_instr(struct instruction_op *op, const struct 
pt_regs *regs,
}
break;
case 1: /* Prefixed instructions */
+   if (!cpu_has_feature(CPU_FTR_ARCH_31))
+   return -1;
+
prefix_r = GET_PREFIX_R(word);
ra = GET_PREFIX_RA(suffix);
op->update_reg = ra;
-- 
2.26.2



[PATCH v5 0/5] powerpc/sstep: VSX 32-byte vector paired load/store instructions

2020-10-10 Thread Ravi Bangoria
VSX vector paired instructions operate on an octword (32-byte)
operand for loads and stores between storage and a pair of two
sequential Vector-Scalar Registers (VSRs). There are 4 word
instructions and 2 prefixed instructions that provide this
32-byte storage access - lxvp, lxvpx, stxvp, stxvpx,
plxvp, pstxvp.

The emulation infrastructure doesn't support these instructions:
it can neither handle the 32-byte storage access nor operate on
two VSX registers at once. This patch series adds the missing
emulation support along with test cases.
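
In C terms, the semantics being emulated are roughly the following. This is
an illustrative model only (the register file and function are made up, and
endian-dependent element ordering inside each VSR is glossed over):

    #include <string.h>

    static unsigned char vsr[64][16];   /* toy VSR file */

    /* lxvp XTp,DQ(RA): one octword fills two sequential VSRs */
    static void lxvp_model(int xtp, const unsigned char *ea)
    {
            memcpy(vsr[xtp],     ea,      16);  /* first VSR of the pair */
            memcpy(vsr[xtp + 1], ea + 16, 16);  /* second VSR of the pair */
    }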

v4: 
https://lore.kernel.org/r/20201008072726.233086-1-ravi.bango...@linux.ibm.com

Changes in v5:
-
* Fix build breakage reported by kernel test robot
* Patch #2 is new. CONFIG_VSX check was missing for some VSX
  instructions. Patch #2 adds that check.

Changes in v4:
-
* Patch #1 is (kind of) new.
* Patch #2 now enables both analyse_instr() and emulate_step()
  unlike prev series where both were in separate patches.
* Patch #2 also has important fix for emulation on LE.
* Patch #3 and #4. Added XSP/XTP, D0/D1 instruction operands,
  removed *_EX_OP, __PPC_T[P][X] macros which are incorrect,
  and adhered to PPC_RAW_* convention.
* Added `CPU_FTR_ARCH_31` check in testcases to avoid failing
  in p8/p9.
* Some cosmetic changes.
* Rebased to powerpc/next

Changes in v3:
-
Worked on review comments and suggestions from Ravi and Naveen,

* Fix the do_vsx_load() to handle vsx instructions if MSR_FP/MSR_VEC
  cleared in exception conditions and it reaches to read/write to
  thread_struct member fp_state/vr_state respectively.
* Fix wrongly used `__vector128 v[2]` in struct vsx_reg as it should
  hold a single vsx register size.
* Remove unnecessary `VSX_CHECK_VEC` flag set and condition to check
  `VSX_LDLEFT` that is not applicable for these vsx instructions.
* Fix comments in emulate_vsx_load() that were misleading.
* Rebased on latest powerpc next branch.

Changes in v2:
-
* Fix suggestion from Sandipan, wrap ISA 3.1 instructions with
  cpu_has_feature(CPU_FTR_ARCH_31) check.


Balamuruhan S (4):
  powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31
is set
  powerpc/sstep: Support VSX vector paired storage access instructions
  powerpc/ppc-opcode: Add encoding macros for VSX vector paired
instructions
  powerpc/sstep: Add testcases for VSX vector paired load/store
instructions

Ravi Bangoria (1):
  powerpc/sstep: Cover new VSX instructions under CONFIG_VSX

 arch/powerpc/include/asm/ppc-opcode.h |  13 ++
 arch/powerpc/lib/sstep.c  | 160 ---
 arch/powerpc/lib/test_emulate_step.c  | 270 ++
 3 files changed, 421 insertions(+), 22 deletions(-)

-- 
2.26.2



[PATCH v4 4/4] powerpc/sstep: Add testcases for VSX vector paired load/store instructions

2020-10-08 Thread Ravi Bangoria
From: Balamuruhan S 

Add testcases for VSX vector paired load/store instructions.
Sample o/p:

  emulate_step_test: lxvp   : PASS
  emulate_step_test: stxvp  : PASS
  emulate_step_test: lxvpx  : PASS
  emulate_step_test: stxvpx : PASS
  emulate_step_test: plxvp  : PASS
  emulate_step_test: pstxvp : PASS

Signed-off-by: Balamuruhan S 
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/lib/test_emulate_step.c | 270 +++
 1 file changed, 270 insertions(+)

diff --git a/arch/powerpc/lib/test_emulate_step.c 
b/arch/powerpc/lib/test_emulate_step.c
index 0a201b771477..783d1b85ecfe 100644
--- a/arch/powerpc/lib/test_emulate_step.c
+++ b/arch/powerpc/lib/test_emulate_step.c
@@ -612,6 +612,273 @@ static void __init test_lxvd2x_stxvd2x(void)
 }
 #endif /* CONFIG_VSX */
 
+#ifdef CONFIG_VSX
+static void __init test_lxvp_stxvp(void)
+{
+   struct pt_regs regs;
+   union {
+   vector128 a;
+   u32 b[4];
+   } c[2];
+   u32 cached_b[8];
+   int stepped = -1;
+
+   if (!cpu_has_feature(CPU_FTR_ARCH_31)) {
+   show_result("lxvp", "SKIP (!CPU_FTR_ARCH_31)");
+   show_result("stxvp", "SKIP (!CPU_FTR_ARCH_31)");
+   return;
+   }
+
+   init_pt_regs();
+
+   /*** lxvp ***/
+
+   cached_b[0] = c[0].b[0] = 18233;
+   cached_b[1] = c[0].b[1] = 34863571;
+   cached_b[2] = c[0].b[2] = 834;
+   cached_b[3] = c[0].b[3] = 6138911;
+   cached_b[4] = c[1].b[0] = 1234;
+   cached_b[5] = c[1].b[1] = 5678;
+   cached_b[6] = c[1].b[2] = 91011;
+   cached_b[7] = c[1].b[3] = 121314;
+
+   regs.gpr[4] = (unsigned long)&c[0].a;
+
+   /*
+* lxvp XTp,DQ(RA)
+* XTp = 32xTX + 2xTp
+* let TX=1 Tp=1 RA=4 DQ=0
+*/
+   stepped = emulate_step(, ppc_inst(PPC_RAW_LXVP(34, 4, 0)));
+
+   if (stepped == 1 && cpu_has_feature(CPU_FTR_VSX)) {
+   show_result("lxvp", "PASS");
+   } else {
+   if (!cpu_has_feature(CPU_FTR_VSX))
+   show_result("lxvp", "PASS (!CPU_FTR_VSX)");
+   else
+   show_result("lxvp", "FAIL");
+   }
+
+   /*** stxvp ***/
+
+   c[0].b[0] = 21379463;
+   c[0].b[1] = 87;
+   c[0].b[2] = 374234;
+   c[0].b[3] = 4;
+   c[1].b[0] = 90;
+   c[1].b[1] = 122;
+   c[1].b[2] = 555;
+   c[1].b[3] = 32144;
+
+   /*
+* stxvp XSp,DQ(RA)
+* XSp = 32xSX + 2xSp
+* let SX=1 Sp=1 RA=4 DQ=0
+*/
+   stepped = emulate_step(, ppc_inst(PPC_RAW_STXVP(34, 4, 0)));
+
+   if (stepped == 1 && cached_b[0] == c[0].b[0] && cached_b[1] == c[0].b[1] &&
+   cached_b[2] == c[0].b[2] && cached_b[3] == c[0].b[3] &&
+   cached_b[4] == c[1].b[0] && cached_b[5] == c[1].b[1] &&
+   cached_b[6] == c[1].b[2] && cached_b[7] == c[1].b[3] &&
+   cpu_has_feature(CPU_FTR_VSX)) {
+   show_result("stxvp", "PASS");
+   } else {
+   if (!cpu_has_feature(CPU_FTR_VSX))
+   show_result("stxvp", "PASS (!CPU_FTR_VSX)");
+   else
+   show_result("stxvp", "FAIL");
+   }
+}
+#else
+static void __init test_lxvp_stxvp(void)
+{
+   show_result("lxvp", "SKIP (CONFIG_VSX is not set)");
+   show_result("stxvp", "SKIP (CONFIG_VSX is not set)");
+}
+#endif /* CONFIG_VSX */
+
+#ifdef CONFIG_VSX
+static void __init test_lxvpx_stxvpx(void)
+{
+   struct pt_regs regs;
+   union {
+   vector128 a;
+   u32 b[4];
+   } c[2];
+   u32 cached_b[8];
+   int stepped = -1;
+
+   if (!cpu_has_feature(CPU_FTR_ARCH_31)) {
+   show_result("lxvpx", "SKIP (!CPU_FTR_ARCH_31)");
+   show_result("stxvpx", "SKIP (!CPU_FTR_ARCH_31)");
+   return;
+   }
+
+   init_pt_regs();
+
+   /*** lxvpx ***/
+
+   cached_b[0] = c[0].b[0] = 18233;
+   cached_b[1] = c[0].b[1] = 34863571;
+   cached_b[2] = c[0].b[2] = 834;
+   cached_b[3] = c[0].b[3] = 6138911;
+   cached_b[4] = c[1].b[0] = 1234;
+   cached_b[5] = c[1].b[1] = 5678;
+   cached_b[6] = c[1].b[2] = 91011;
+   cached_b[7] = c[1].b[3] = 121314;
+
+   regs.gpr[3] = (unsigned long)&c[0].a;
+   regs.gpr[4] = 0;
+
+   /*
+* lxvpx XTp,RA,RB
+* XTp = 32xTX + 2xTp
+* let TX=1 Tp=1 RA=3 RB=4
+*/
+   stepped = emulate_step(, ppc_inst(PPC_RAW_LXVPX(34, 3, 4)));
+
+   if (stepped == 1 && cpu_has_feature(CPU_FTR_VSX)) {
+   show_result("lxvpx", "PASS");

[PATCH v4 3/4] powerpc/ppc-opcode: Add encoding macros for VSX vector paired instructions

2020-10-08 Thread Ravi Bangoria
From: Balamuruhan S 

Add instruction encodings, DQ, D0, D1 immediate, XTP, XSP operands as
macros for new VSX vector paired instructions,
  * Load VSX Vector Paired (lxvp)
  * Load VSX Vector Paired Indexed (lxvpx)
  * Prefixed Load VSX Vector Paired (plxvp)
  * Store VSX Vector Paired (stxvp)
  * Store VSX Vector Paired Indexed (stxvpx)
  * Prefixed Store VSX Vector Paired (pstxvp)

Suggested-by: Naveen N. Rao 
Signed-off-by: Balamuruhan S 
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/include/asm/ppc-opcode.h | 13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index a6e3700c4566..5e7918ca4fb7 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -78,6 +78,9 @@
 
 #define IMM_L(i)   ((uintptr_t)(i) & 0xffff)
 #define IMM_DS(i)  ((uintptr_t)(i) & 0xfffc)
+#define IMM_DQ(i)  ((uintptr_t)(i) & 0xfff0)
+#define IMM_D0(i)  (((uintptr_t)(i) >> 16) & 0x3ffff)
+#define IMM_D1(i)  IMM_L(i)
 
 /*
  * 16-bit immediate helper macros: HA() is for use with sign-extending instrs
@@ -295,6 +298,8 @@
 #define __PPC_XB(b)	((((b) & 0x1f) << 11) | (((b) & 0x20) >> 4))
 #define __PPC_XS(s)	((((s) & 0x1f) << 21) | (((s) & 0x20) >> 5))
 #define __PPC_XT(s)__PPC_XS(s)
+#define __PPC_XSP(s)	((((s) & 0x1e) | (((s) >> 5) & 0x1)) << 21)
+#define __PPC_XTP(s)   __PPC_XSP(s)
 #define __PPC_T_TLB(t) (((t) & 0x3) << 21)
 #define __PPC_WC(w)(((w) & 0x3) << 21)
 #define __PPC_WS(w)(((w) & 0x1f) << 11)
@@ -395,6 +400,14 @@
 #define PPC_RAW_XVCPSGNDP(t, a, b)	((0xf0000780 | VSX_XX3((t), (a), (b))))
 #define PPC_RAW_VPERMXOR(vrt, vra, vrb, vrc) \
	((0x1000002d | ___PPC_RT(vrt) | ___PPC_RA(vra) | ___PPC_RB(vrb) | (((vrc) & 0x1f) << 6)))
+#define PPC_RAW_LXVP(xtp, a, i)	(0x18000000 | __PPC_XTP(xtp) | ___PPC_RA(a) | IMM_DQ(i))
+#define PPC_RAW_STXVP(xsp, a, i)	(0x18000001 | __PPC_XSP(xsp) | ___PPC_RA(a) | IMM_DQ(i))
+#define PPC_RAW_LXVPX(xtp, a, b)	(0x7c00029a | __PPC_XTP(xtp) | ___PPC_RA(a) | ___PPC_RB(b))
+#define PPC_RAW_STXVPX(xsp, a, b)	(0x7c00039a | __PPC_XSP(xsp) | ___PPC_RA(a) | ___PPC_RB(b))
+#define PPC_RAW_PLXVP(xtp, i, a, pr) \
+	((PPC_PREFIX_8LS | __PPC_PRFX_R(pr) | IMM_D0(i)) << 32 | (0xe8000000 | __PPC_XTP(xtp) | ___PPC_RA(a) | IMM_D1(i)))
+#define PPC_RAW_PSTXVP(xsp, i, a, pr) \
+	((PPC_PREFIX_8LS | __PPC_PRFX_R(pr) | IMM_D0(i)) << 32 | (0xf8000000 | __PPC_XSP(xsp) | ___PPC_RA(a) | IMM_D1(i)))
 #define PPC_RAW_NAP		(0x4c000364)
 #define PPC_RAW_SLEEP  (0x4c0003a4)
 #define PPC_RAW_WINKLE (0x4c0003e4)
-- 
2.26.2



[PATCH v4 2/4] powerpc/sstep: Support VSX vector paired storage access instructions

2020-10-08 Thread Ravi Bangoria
From: Balamuruhan S 

VSX Vector Paired instructions load/store an octword (32 bytes)
between storage and two sequential VSRs. Add emulation support
for these new instructions:
  * Load VSX Vector Paired (lxvp)
  * Load VSX Vector Paired Indexed (lxvpx)
  * Prefixed Load VSX Vector Paired (plxvp)
  * Store VSX Vector Paired (stxvp)
  * Store VSX Vector Paired Indexed (stxvpx)
  * Prefixed Store VSX Vector Paired (pstxvp)

Suggested-by: Naveen N. Rao 
Signed-off-by: Balamuruhan S 
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/lib/sstep.c | 146 +--
 1 file changed, 125 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index e6242744c71b..e39ee1651636 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -32,6 +32,10 @@ extern char system_call_vectored_emulate[];
 #define XER_OV32   0x00080000U
 #define XER_CA32   0x00040000U
 
+#ifdef CONFIG_VSX
+#define VSX_REGISTER_XTP(rd)   ((((rd) & 1) << 5) | ((rd) & 0xfe))
+#endif
+
 #ifdef CONFIG_PPC_FPU
 /*
  * Functions in ldstfp.S
@@ -279,6 +283,19 @@ static nokprobe_inline void do_byte_reverse(void *ptr, int 
nb)
up[1] = tmp;
break;
}
+   case 32: {
+   unsigned long *up = (unsigned long *)ptr;
+   unsigned long tmp;
+
+   tmp = byterev_8(up[0]);
+   up[0] = byterev_8(up[3]);
+   up[3] = tmp;
+   tmp = byterev_8(up[2]);
+   up[2] = byterev_8(up[1]);
+   up[1] = tmp;
+   break;
+   }
+
 #endif
default:
WARN_ON_ONCE(1);
@@ -709,6 +726,8 @@ void emulate_vsx_load(struct instruction_op *op, union 
vsx_reg *reg,
reg->d[0] = reg->d[1] = 0;
 
switch (op->element_size) {
+   case 32:
+   /* [p]lxvp[x] */
case 16:
/* whole vector; lxv[x] or lxvl[l] */
if (size == 0)
@@ -717,7 +736,7 @@ void emulate_vsx_load(struct instruction_op *op, union 
vsx_reg *reg,
if (IS_LE && (op->vsx_flags & VSX_LDLEFT))
rev = !rev;
if (rev)
-   do_byte_reverse(reg, 16);
+   do_byte_reverse(reg, size);
break;
case 8:
/* scalar loads, lxvd2x, lxvdsx */
@@ -793,6 +812,20 @@ void emulate_vsx_store(struct instruction_op *op, const 
union vsx_reg *reg,
size = GETSIZE(op->type);
 
switch (op->element_size) {
+   case 32:
+   /* [p]stxvp[x] */
+   if (size == 0)
+   break;
+   if (rev) {
+   /* reverse 32 bytes */
+   buf.d[0] = byterev_8(reg->d[3]);
+   buf.d[1] = byterev_8(reg->d[2]);
+   buf.d[2] = byterev_8(reg->d[1]);
+   buf.d[3] = byterev_8(reg->d[0]);
+   reg = &buf;
+   }
+   memcpy(mem, reg, size);
+   break;
case 16:
/* stxv, stxvx, stxvl, stxvll */
if (size == 0)
@@ -861,28 +894,43 @@ static nokprobe_inline int do_vsx_load(struct 
instruction_op *op,
   bool cross_endian)
 {
int reg = op->reg;
-   u8 mem[16];
-   union vsx_reg buf;
+   int i, j, nr_vsx_regs;
+   u8 mem[32];
+   union vsx_reg buf[2];
int size = GETSIZE(op->type);
 
if (!address_ok(regs, ea, size) || copy_mem_in(mem, ea, size, regs))
return -EFAULT;
 
-   emulate_vsx_load(op, &buf, mem, cross_endian);
+   nr_vsx_regs = size / sizeof(__vector128);
+   emulate_vsx_load(op, buf, mem, cross_endian);
preempt_disable();
if (reg < 32) {
/* FP regs + extensions */
if (regs->msr & MSR_FP) {
-   load_vsrn(reg, &buf);
+   for (i = 0; i < nr_vsx_regs; i++) {
+   j = IS_LE ? nr_vsx_regs - i - 1 : i;
+   load_vsrn(reg + i, &buf[j].v);
+   }
} else {
-   current->thread.fp_state.fpr[reg][0] = buf.d[0];
-   current->thread.fp_state.fpr[reg][1] = buf.d[1];
+   for (i = 0; i < nr_vsx_regs; i++) {
+   j = IS_LE ? nr_vsx_regs - i - 1 : i;
+   current->thread.fp_state.fpr[reg + i][0] = buf[j].d[0];
+   current->thread.fp_state.fpr[reg + i][1] = buf[j].d[1];
+   }
}
} else {
-   if (regs->msr & MSR_VEC)
-   load_vsrn(reg, &buf);
-   else
-   current->thread.vr_state.vr[reg - 32] = buf.v;

[PATCH v4 0/4] powerpc/sstep: VSX 32-byte vector paired load/store instructions

2020-10-08 Thread Ravi Bangoria
VSX vector paired instructions operate on an octword (32-byte)
operand for loads and stores between storage and a pair of two
sequential Vector-Scalar Registers (VSRs). There are 4 word
instructions and 2 prefixed instructions that provide this
32-byte storage access - lxvp, lxvpx, stxvp, stxvpx,
plxvp, pstxvp.

The emulation infrastructure doesn't support these instructions:
it can neither handle the 32-byte storage access nor operate on
two VSX registers at once. This patch series adds the missing
emulation support along with test cases.

v3: 
https://lore.kernel.org/linuxppc-dev/20200731081637.1837559-1-bal...@linux.ibm.com/
 

Changes in v4:
-
* Patch #1 is (kind of) new.
* Patch #2 now enables both analyse_instr() and emulate_step()
  unlike prev series where both were in separate patches.
* Patch #2 also has important fix for emulation on LE.
* Patch #3 and #4. Added XSP/XTP, D0/D1 instruction operands,
  removed *_EX_OP, __PPC_T[P][X] macros which are incorrect,
  and adhered to PPC_RAW_* convention.
* Added `CPU_FTR_ARCH_31` check in testcases to avoid failing
  in p8/p9.
* Some cosmetic changes.
* Rebased to powerpc/next

Changes in v3:
-
Worked on review comments and suggestions from Ravi and Naveen,

* Fix the do_vsx_load() to handle vsx instructions if MSR_FP/MSR_VEC
  cleared in exception conditions and it reaches to read/write to
  thread_struct member fp_state/vr_state respectively.
* Fix wrongly used `__vector128 v[2]` in struct vsx_reg as it should
  hold a single vsx register size.
* Remove unnecessary `VSX_CHECK_VEC` flag set and condition to check
  `VSX_LDLEFT` that is not applicable for these vsx instructions.
* Fix comments in emulate_vsx_load() that were misleading.
* Rebased on latest powerpc next branch.

Changes in v2:
-
* Fix suggestion from Sandipan, wrap ISA 3.1 instructions with
  cpu_has_feature(CPU_FTR_ARCH_31) check.
* Rebase on latest powerpc next branch.


Balamuruhan S (4):
  powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31
is set
  powerpc/sstep: Support VSX vector paired storage access instructions
  powerpc/ppc-opcode: Add encoding macros for VSX vector paired
instructions
  powerpc/sstep: Add testcases for VSX vector paired load/store
instructions

 arch/powerpc/include/asm/ppc-opcode.h |  13 ++
 arch/powerpc/lib/sstep.c  | 152 +--
 arch/powerpc/lib/test_emulate_step.c  | 270 ++
 3 files changed, 414 insertions(+), 21 deletions(-)

-- 
2.26.2



[PATCH v4 1/4] powerpc/sstep: Emulate prefixed instructions only when CPU_FTR_ARCH_31 is set

2020-10-08 Thread Ravi Bangoria
From: Balamuruhan S 

Unconditional emulation of prefixed instructions will allow
emulation of them on Power10 predecessors which might cause
issues. Restrict that.

Signed-off-by: Balamuruhan S 
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/lib/sstep.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index e9dcaba9a4f8..e6242744c71b 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -1346,6 +1346,9 @@ int analyse_instr(struct instruction_op *op, const struct 
pt_regs *regs,
switch (opcode) {
 #ifdef __powerpc64__
case 1:
+   if (!cpu_has_feature(CPU_FTR_ARCH_31))
+   return -1;
+
prefix_r = GET_PREFIX_R(word);
ra = GET_PREFIX_RA(suffix);
rd = (suffix >> 21) & 0x1f;
@@ -2733,6 +2736,9 @@ int analyse_instr(struct instruction_op *op, const struct 
pt_regs *regs,
}
break;
case 1: /* Prefixed instructions */
+   if (!cpu_has_feature(CPU_FTR_ARCH_31))
+   return -1;
+
prefix_r = GET_PREFIX_R(word);
ra = GET_PREFIX_RA(suffix);
op->update_reg = ra;
-- 
2.26.2



Re: [PATCH v6 0/8] powerpc/watchpoint: Bug fixes plus new feature flag

2020-09-18 Thread Ravi Bangoria




On 9/17/20 6:54 PM, Rogerio Alves wrote:

On 9/2/20 1:29 AM, Ravi Bangoria wrote:

Patch #1 fixes issue for quadword instructions on p10 predecessors.
Patch #2 fixes issue for vector instructions.
Patch #3 fixes a bug about watchpoint not firing when created with
  ptrace PPC_PTRACE_SETHWDEBUG and CONFIG_HAVE_HW_BREAKPOINT=N.
  The fix uses HW_BRK_TYPE_PRIV_ALL for ptrace user which, I
  guess, should be fine because we don't leak any kernel
  addresses and PRIV_ALL will also help to cover scenarios when
  kernel accesses user memory.
Patch #4,#5 fixes infinite exception bug, again the bug happens only
  with CONFIG_HAVE_HW_BREAKPOINT=N.
Patch #6 fixes two places where we fail to set hw_len.
Patch #7 introduce new feature bit PPC_DEBUG_FEATURE_DATA_BP_ARCH_31
  which will be set when running on ISA 3.1 compliant machine.
Patch #8 finally adds selftest to test scenarios fixed by patch#2,#3
  and also moves MODE_EXACT tests outside of BP_RANGE condition.


[...]



Tested this patch set for:
- SETHWDEBUG when CONFIG_HAVE_HW_BREAKPOINT=N = OK
- Fix exception handling for CONFIG_HAVE_HW_BREAKPOINT=N = OK
- Check for PPC_DEBUG_FEATURE_DATA_BP_ARCH_31 = OK
- Fix quadword instruction handling on p10 predecessors = OK
- Fix handling of vector instructions = OK

Also tested for:
- Set second watchpoint (P10 Mambo) = OK
- Infinity loop on sc instruction = OK


Thanks Rogerio!

Ravi


Re: [PATCH 0/7] powerpc/watchpoint: 2nd DAWR kvm enablement + selftests

2020-09-01 Thread Ravi Bangoria

Hi Paul,

On 9/2/20 8:02 AM, Paul Mackerras wrote:

On Thu, Jul 23, 2020 at 03:50:51PM +0530, Ravi Bangoria wrote:

Patch #1, #2 and #3 enables p10 2nd DAWR feature for Book3S kvm guest. DAWR
is a hypervisor resource and thus H_SET_MODE hcall is used to set/unset it.
A new case H_SET_MODE_RESOURCE_SET_DAWR1 is introduced in H_SET_MODE hcall
for setting/unsetting 2nd DAWR. Also, new capability KVM_CAP_PPC_DAWR1 has
been added to query 2nd DAWR support via kvm ioctl.

This feature also needs to be enabled in Qemu to really use it. I'll reply
link to qemu patches once I post them in qemu-devel mailing list.

Patch #4, #5, #6 and #7 adds selftests to test 2nd DAWR.


If/when you resubmit these patches, please split the KVM patches into
a separate series, since the KVM patches would go via my tree whereas
I expect the selftests/powerpc patches would go through Michael
Ellerman's tree.


Sure. Will split it.

Thanks,
Ravi


Re: [PATCH 2/7] powerpc/watchpoint/kvm: Add infrastructure to support 2nd DAWR

2020-09-01 Thread Ravi Bangoria

Hi Paul,


diff --git a/arch/powerpc/include/asm/hvcall.h 
b/arch/powerpc/include/asm/hvcall.h
index 33793444144c..03f401d7be41 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -538,6 +538,8 @@ struct hv_guest_state {
s64 tb_offset;
u64 dawr0;
u64 dawrx0;
+   u64 dawr1;
+   u64 dawrx1;
u64 ciabr;
u64 hdec_expiry;
u64 purr;


After this struct, there is a macro HV_GUEST_STATE_VERSION. I guess that
also needs to be incremented because I'm adding new members to the struct?

Thanks,
Ravi


Re: [PATCH 1/7] powerpc/watchpoint/kvm: Rename current DAWR macros and variables

2020-09-01 Thread Ravi Bangoria

Hi Paul,

On 9/2/20 7:19 AM, Paul Mackerras wrote:

On Thu, Jul 23, 2020 at 03:50:52PM +0530, Ravi Bangoria wrote:

Power10 is introducing second DAWR. Use real register names (with
suffix 0) from ISA for current macros and variables used by kvm.


Most of this looks fine, but I think we should not change the existing
names in arch/powerpc/include/uapi/asm/kvm.h (and therefore also
Documentation/virt/kvm/api.rst).


Missed that I'm changing uapi. I'll rename only those macros/variables
which are not uapi.

Thanks,
Ravi


Re: [PATCH 2/7] powerpc/watchpoint/kvm: Add infrastructure to support 2nd DAWR

2020-09-01 Thread Ravi Bangoria

Hi Paul,

On 9/2/20 7:31 AM, Paul Mackerras wrote:

On Thu, Jul 23, 2020 at 03:50:53PM +0530, Ravi Bangoria wrote:

kvm code assumes single DAWR everywhere. Add code to support 2nd DAWR.
DAWR is a hypervisor resource and thus H_SET_MODE hcall is used to set/
unset it. Introduce new case H_SET_MODE_RESOURCE_SET_DAWR1 for 2nd DAWR.


Is this the same interface as will be defined in PAPR and available
under PowerVM, or is it a new/different interface for KVM?


Yes, kvm hcall interface for 2nd DAWR is same as PowerVM, as defined in PAPR.




Also, kvm will support 2nd DAWR only if CPU_FTR_DAWR1 is set.


In general QEMU wants to be able to control all aspects of the virtual
machine presented to the guest, meaning that just because a host has a
particular hardware capability does not mean we should automatically
present that capability to the guest.

In this case, QEMU will want a way to control whether the guest sees
the availability of the second DAWR/X registers or not, i.e. whether a
H_SET_MODE to set DAWR[X]1 will succeed or fail.


Patch #3 adds new kvm capability KVM_CAP_PPC_DAWR1 that can be checked
by Qemu. Also, as suggested by David in Qemu patch[1], I'm planning to
add new machine capability in Qemu:

  -machine cap-dawr1=ON/OFF

cap-dawr1 will be default ON when PPC_FEATURE2_ARCH_3_10 is set and OFF
otherwise.

Is this correct approach?

[1]: https://lore.kernel.org/kvm/20200724045613.ga8...@umbus.fritz.box

Thanks,
Ravi


[PATCH v6 8/8] powerpc/watchpoint/selftests: Tests for kernel accessing user memory

2020-09-01 Thread Ravi Bangoria
Introduce tests to cover simple scenarios where the user is watching
memory which can also be accessed by the kernel. We also support
_MODE_EXACT with the _SETHWDEBUG interface. Move those testcases
outside of the _BP_RANGE condition. This will help to test _MODE_EXACT
scenarios when CONFIG_HAVE_HW_BREAKPOINT is not set, eg:

  $ ./ptrace-hwbreak
  ...
  PTRACE_SET_DEBUGREG, Kernel Access Userspace, len: 8: Ok
  PPC_PTRACE_SETHWDEBUG, MODE_EXACT, WO, len: 1: Ok
  PPC_PTRACE_SETHWDEBUG, MODE_EXACT, RO, len: 1: Ok
  PPC_PTRACE_SETHWDEBUG, MODE_EXACT, RW, len: 1: Ok
  PPC_PTRACE_SETHWDEBUG, MODE_EXACT, Kernel Access Userspace, len: 1: Ok
  success: ptrace-hwbreak

Suggested-by: Pedro Miraglia Franco de Carvalho 
Signed-off-by: Ravi Bangoria 
---
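The PTRACE_SET_DEBUGREG test below packs its control bits into the low bits
of the 8-byte-aligned watch address. Spelled out as a helper, assuming the
DABR_READ/WRITE/TRANSLATION_SHIFT values are 0, 1 and 2, matching the
architected DR/DW/BT bits (a sketch, not code from the patch):

    static unsigned long make_dabr(unsigned long wp_addr)
    {
            unsigned long dabr = wp_addr & ~0x7UL;  /* address, 8-byte aligned */

            dabr |= 1UL << 0;   /* DR: trap on data reads */
            dabr |= 1UL << 1;   /* DW: trap on data writes */
            dabr |= 1UL << 2;   /* BT: match translated (MMU on) accesses */
            return dabr;
    }
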
 .../selftests/powerpc/ptrace/ptrace-hwbreak.c | 48 ++-
 1 file changed, 46 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c
index fc477dfe86a2..2e0d86e0687e 100644
--- a/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c
@@ -20,6 +20,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include "ptrace.h"
 
 #define SPRN_PVR   0x11F
@@ -44,6 +46,7 @@ struct gstruct {
 };
 static volatile struct gstruct gstruct __attribute__((aligned(512)));
 
+static volatile char cwd[PATH_MAX] __attribute__((aligned(8)));
 
 static void get_dbginfo(pid_t child_pid, struct ppc_debug_info *dbginfo)
 {
@@ -138,6 +141,9 @@ static void test_workload(void)
write_var(len);
}
 
+   /* PTRACE_SET_DEBUGREG, Kernel Access Userspace test */
+   syscall(__NR_getcwd, &cwd, PATH_MAX);
+
/* PPC_PTRACE_SETHWDEBUG, MODE_EXACT, WO test */
write_var(1);
 
@@ -150,6 +156,9 @@ static void test_workload(void)
else
read_var(1);
 
+   /* PPC_PTRACE_SETHWDEBUG, MODE_EXACT, Kernel Access Userspace test */
+   syscall(__NR_getcwd, &cwd, PATH_MAX);
+
/* PPC_PTRACE_SETHWDEBUG, MODE_RANGE, DW ALIGNED, WO test */
gstruct.a[rand() % A_LEN] = 'a';
 
@@ -293,6 +302,24 @@ static int test_set_debugreg(pid_t child_pid)
return 0;
 }
 
+static int test_set_debugreg_kernel_userspace(pid_t child_pid)
+{
+   unsigned long wp_addr = (unsigned long)cwd;
+   char *name = "PTRACE_SET_DEBUGREG";
+
+   /* PTRACE_SET_DEBUGREG, Kernel Access Userspace test */
+   wp_addr &= ~0x7UL;
+   wp_addr |= (1Ul << DABR_READ_SHIFT);
+   wp_addr |= (1UL << DABR_WRITE_SHIFT);
+   wp_addr |= (1UL << DABR_TRANSLATION_SHIFT);
+   ptrace_set_debugreg(child_pid, wp_addr);
+   ptrace(PTRACE_CONT, child_pid, NULL, 0);
+   check_success(child_pid, name, "Kernel Access Userspace", wp_addr, 8);
+
+   ptrace_set_debugreg(child_pid, 0);
+   return 0;
+}
+
 static void get_ppc_hw_breakpoint(struct ppc_hw_breakpoint *info, int type,
  unsigned long addr, int len)
 {
@@ -338,6 +365,22 @@ static void test_sethwdebug_exact(pid_t child_pid)
ptrace_delhwdebug(child_pid, wh);
 }
 
+static void test_sethwdebug_exact_kernel_userspace(pid_t child_pid)
+{
+   struct ppc_hw_breakpoint info;
+   unsigned long wp_addr = (unsigned long)&cwd;
+   char *name = "PPC_PTRACE_SETHWDEBUG, MODE_EXACT";
+   int len = 1; /* hardcoded in kernel */
+   int wh;
+
+   /* PPC_PTRACE_SETHWDEBUG, MODE_EXACT, Kernel Access Userspace test */
+   get_ppc_hw_breakpoint(&info, PPC_BREAKPOINT_TRIGGER_WRITE, wp_addr, 0);
+   wh = ptrace_sethwdebug(child_pid, &info);
+   ptrace(PTRACE_CONT, child_pid, NULL, 0);
+   check_success(child_pid, name, "Kernel Access Userspace", wp_addr, len);
+   ptrace_delhwdebug(child_pid, wh);
+}
+
 static void test_sethwdebug_range_aligned(pid_t child_pid)
 {
struct ppc_hw_breakpoint info;
@@ -452,9 +495,10 @@ static void
 run_tests(pid_t child_pid, struct ppc_debug_info *dbginfo, bool dawr)
 {
test_set_debugreg(child_pid);
+   test_set_debugreg_kernel_userspace(child_pid);
+   test_sethwdebug_exact(child_pid);
+   test_sethwdebug_exact_kernel_userspace(child_pid);
if (dbginfo->features & PPC_DEBUG_FEATURE_DATA_BP_RANGE) {
-   test_sethwdebug_exact(child_pid);
-
test_sethwdebug_range_aligned(child_pid);
if (dawr || is_8xx) {
test_sethwdebug_range_unaligned(child_pid);
-- 
2.26.2



[PATCH v6 7/8] powerpc/watchpoint/ptrace: Introduce PPC_DEBUG_FEATURE_DATA_BP_ARCH_31

2020-09-01 Thread Ravi Bangoria
PPC_DEBUG_FEATURE_DATA_BP_ARCH_31 can be used to determine whether
we are running on an ISA 3.1 compliant machine, which is needed to
determine DAR behaviour, the 512-byte boundary limit etc. This was
requested by Pedro Miraglia Franco de Carvalho for extending
watchpoint features in gdb. Note that availability of 2nd DAWR is
independent of this flag and should be checked using
ppc_debug_info->num_data_bps.

Signed-off-by: Ravi Bangoria 
---
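How a debugger would consume the new bit, as a sketch (assumes the usual
ptrace headers; exact include paths vary by libc):

    #include <sys/types.h>
    #include <sys/ptrace.h>
    #include <asm/ptrace.h>  /* struct ppc_debug_info, PPC_DEBUG_FEATURE_* */

    static int is_arch_31(pid_t pid)
    {
            struct ppc_debug_info dbginfo;

            if (ptrace(PPC_PTRACE_GETHWDBGINFO, pid, NULL, &dbginfo))
                    return 0;
            return !!(dbginfo.features & PPC_DEBUG_FEATURE_DATA_BP_ARCH_31);
    }
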
 Documentation/powerpc/ptrace.rst  | 1 +
 arch/powerpc/include/uapi/asm/ptrace.h| 1 +
 arch/powerpc/kernel/ptrace/ptrace-noadv.c | 2 ++
 3 files changed, 4 insertions(+)

diff --git a/Documentation/powerpc/ptrace.rst b/Documentation/powerpc/ptrace.rst
index 864d4b61..77725d69eb4a 100644
--- a/Documentation/powerpc/ptrace.rst
+++ b/Documentation/powerpc/ptrace.rst
@@ -46,6 +46,7 @@ features will have bits indicating whether there is support 
for::
   #define PPC_DEBUG_FEATURE_DATA_BP_RANGE  0x4
   #define PPC_DEBUG_FEATURE_DATA_BP_MASK   0x8
   #define PPC_DEBUG_FEATURE_DATA_BP_DAWR   0x10
+  #define PPC_DEBUG_FEATURE_DATA_BP_ARCH_31	0x20
 
 2. PTRACE_SETHWDEBUG
 
diff --git a/arch/powerpc/include/uapi/asm/ptrace.h 
b/arch/powerpc/include/uapi/asm/ptrace.h
index f5f1ccc740fc..7004cfea3f5f 100644
--- a/arch/powerpc/include/uapi/asm/ptrace.h
+++ b/arch/powerpc/include/uapi/asm/ptrace.h
@@ -222,6 +222,7 @@ struct ppc_debug_info {
 #define PPC_DEBUG_FEATURE_DATA_BP_RANGE	0x0000000000000004
 #define PPC_DEBUG_FEATURE_DATA_BP_MASK	0x0000000000000008
 #define PPC_DEBUG_FEATURE_DATA_BP_DAWR	0x0000000000000010
+#define PPC_DEBUG_FEATURE_DATA_BP_ARCH_31	0x0000000000000020
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c 
b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
index 48c52426af80..aa36fcad36cd 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
@@ -57,6 +57,8 @@ void ppc_gethwdinfo(struct ppc_debug_info *dbginfo)
} else {
dbginfo->features = 0;
}
+   if (cpu_has_feature(CPU_FTR_ARCH_31))
+   dbginfo->features |= PPC_DEBUG_FEATURE_DATA_BP_ARCH_31;
 }
 
 int ptrace_get_debugreg(struct task_struct *child, unsigned long addr,
-- 
2.26.2



[PATCH v6 6/8] powerpc/watchpoint: Add hw_len wherever missing

2020-09-01 Thread Ravi Bangoria
There are a couple of places where we set len but not hw_len. For
ptrace/perf watchpoints, when CONFIG_HAVE_HW_BREAKPOINT=Y, hw_len
will be calculated and set internally while parsing watchpoint.
But when CONFIG_HAVE_HW_BREAKPOINT=N, we need to manually set
'hw_len'. Similarly for xmon, hw_len needs to be set
directly.

Fixes: b57aeab811db ("powerpc/watchpoint: Fix length calculation for unaligned 
target")
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/kernel/ptrace/ptrace-noadv.c | 1 +
 arch/powerpc/xmon/xmon.c  | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c 
b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
index c9122ed91340..48c52426af80 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
@@ -219,6 +219,7 @@ long ppc_set_hwdebug(struct task_struct *child, struct 
ppc_hw_breakpoint *bp_inf
brk.address = ALIGN_DOWN(bp_info->addr, HW_BREAKPOINT_SIZE);
brk.type = HW_BRK_TYPE_TRANSLATE | HW_BRK_TYPE_PRIV_ALL;
brk.len = DABR_MAX_LEN;
+   brk.hw_len = DABR_MAX_LEN;
if (bp_info->trigger_type & PPC_BREAKPOINT_TRIGGER_READ)
brk.type |= HW_BRK_TYPE_READ;
if (bp_info->trigger_type & PPC_BREAKPOINT_TRIGGER_WRITE)
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index df7bca00f5ec..55c43a6c9111 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -969,6 +969,7 @@ static void insert_cpu_bpts(void)
brk.address = dabr[i].address;
brk.type = (dabr[i].enabled & HW_BRK_TYPE_DABR) | 
HW_BRK_TYPE_PRIV_ALL;
brk.len = 8;
+   brk.hw_len = 8;
__set_breakpoint(i, );
}
}
-- 
2.26.2



[PATCH v6 5/8] powerpc/watchpoint: Fix exception handling for CONFIG_HAVE_HW_BREAKPOINT=N

2020-09-01 Thread Ravi Bangoria
On powerpc, ptrace watchpoints work in one-shot mode, i.e. the
kernel disables the event every time it fires and the user has to
re-enable it. Also, in case of a ptrace watchpoint, the kernel
notifies the ptrace user before executing the instruction.

With CONFIG_HAVE_HW_BREAKPOINT=N, the kernel fails to disable the
ptrace event and thus causes an infinite loop of exceptions. This
is especially harmful when the user watches data which is also
read/written by the kernel, eg syscall parameters. In such a case,
the infinite exceptions happen in kernel mode, causing a soft lockup.

Fixes: 9422de3e953d ("powerpc: Hardware breakpoints rewrite to handle non DABR 
breakpoint registers")
Reported-by: Pedro Miraglia Franco de Carvalho 
Signed-off-by: Ravi Bangoria 
---
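The one-shot contract, seen from the tracer's side (a schematic fragment;
child and bp_info setup as in the ptrace-hwbreak selftest):

    int status, wh;

    waitpid(child, &status, 0);  /* stopped: SIGTRAP, insn not yet run */
    /* the kernel has disarmed the slot; the tracer must re-arm it */
    wh = ptrace(PPC_PTRACE_SETHWDEBUG, child, 0, &bp_info);
    ptrace(PTRACE_CONT, child, NULL, 0);  /* resume past the access */
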
 arch/powerpc/include/asm/hw_breakpoint.h  |  3 ++
 arch/powerpc/kernel/process.c | 48 +++
 arch/powerpc/kernel/ptrace/ptrace-noadv.c |  4 +-
 3 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/hw_breakpoint.h 
b/arch/powerpc/include/asm/hw_breakpoint.h
index 81872c420476..abebfbee5b1c 100644
--- a/arch/powerpc/include/asm/hw_breakpoint.h
+++ b/arch/powerpc/include/asm/hw_breakpoint.h
@@ -18,6 +18,7 @@ struct arch_hw_breakpoint {
u16 type;
u16 len; /* length of the target data symbol */
u16 hw_len; /* length programmed in hw */
+   u8  flags;
 };
 
 /* Note: Don't change the first 6 bits below as they are in the same order
@@ -37,6 +38,8 @@ struct arch_hw_breakpoint {
 #define HW_BRK_TYPE_PRIV_ALL   (HW_BRK_TYPE_USER | HW_BRK_TYPE_KERNEL | \
 HW_BRK_TYPE_HYP)
 
+#define HW_BRK_FLAG_DISABLED   0x1
+
 /* Minimum granularity */
 #ifdef CONFIG_PPC_8xx
 #define HW_BREAKPOINT_SIZE  0x4
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 016bd831908e..160fbbf41d40 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -636,6 +636,44 @@ void do_send_trap(struct pt_regs *regs, unsigned long 
address,
(void __user *)address);
 }
 #else  /* !CONFIG_PPC_ADV_DEBUG_REGS */
+
+static void do_break_handler(struct pt_regs *regs)
+{
+   struct arch_hw_breakpoint null_brk = {0};
+   struct arch_hw_breakpoint *info;
+   struct ppc_inst instr = ppc_inst(0);
+   int type = 0;
+   int size = 0;
+   unsigned long ea;
+   int i;
+
+   /*
+* If underneath hw supports only one watchpoint, we know it
+* caused exception. 8xx also falls into this category.
+*/
+   if (nr_wp_slots() == 1) {
+   __set_breakpoint(0, &null_brk);
+   current->thread.hw_brk[0] = null_brk;
+   current->thread.hw_brk[0].flags |= HW_BRK_FLAG_DISABLED;
+   return;
+   }
+
+   /* Otherwise find out which DAWR caused the exception and disable it. */
+   wp_get_instr_detail(regs, &instr, &type, &size, &ea);
+
+   for (i = 0; i < nr_wp_slots(); i++) {
+   info = &current->thread.hw_brk[i];
+   if (!info->address)
+   continue;
+
+   if (wp_check_constraints(regs, instr, ea, type, size, info)) {
+   __set_breakpoint(i, &null_brk);
+   current->thread.hw_brk[i] = null_brk;
+   current->thread.hw_brk[i].flags |= HW_BRK_FLAG_DISABLED;
+   }
+   }
+}
+
 void do_break (struct pt_regs *regs, unsigned long address,
unsigned long error_code)
 {
@@ -647,6 +685,16 @@ void do_break (struct pt_regs *regs, unsigned long address,
if (debugger_break_match(regs))
return;
 
+   /*
+* We reach here only when watchpoint exception is generated by ptrace
+* event (or hw is buggy!). Now if CONFIG_HAVE_HW_BREAKPOINT is set,
+* watchpoint is already handled by hw_breakpoint_handler() so we don't
+* have to do anything. But when CONFIG_HAVE_HW_BREAKPOINT is not set,
+* we need to manually handle the watchpoint here.
+*/
+   if (!IS_ENABLED(CONFIG_HAVE_HW_BREAKPOINT))
+   do_break_handler(regs);
+
/* Deliver the signal to userspace */
force_sig_fault(SIGTRAP, TRAP_HWBKPT, (void __user *)address);
 }
diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c 
b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
index 57a0ab822334..c9122ed91340 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
@@ -286,11 +286,13 @@ long ppc_del_hwdebug(struct task_struct *child, long data)
}
return ret;
 #else /* CONFIG_HAVE_HW_BREAKPOINT */
-   if (child->thread.hw_brk[data - 1].address == 0)
+   if (!(child->thread.hw_brk[data - 1].flags & HW_BRK_FLAG_DISABLED) &&
+   child->thread.hw_brk[data - 1].address == 0)
return -ENOENT;
 
child->thread.hw_brk[data - 1].address = 0;

[PATCH v6 4/8] powerpc/watchpoint: Move DAWR detection logic outside of hw_breakpoint.c

2020-09-01 Thread Ravi Bangoria
Power10 hw has multiple DAWRs but hw doesn't tell which DAWR caused
the exception. So we have sw logic to detect that in hw_breakpoint.c.
But hw_breakpoint.c gets compiled only with CONFIG_HAVE_HW_BREAKPOINT=Y.
Move DAWR detection logic outside of hw_breakpoint.c so that it can be
reused when CONFIG_HAVE_HW_BREAKPOINT is not set.

Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/include/asm/hw_breakpoint.h  |   8 +
 arch/powerpc/kernel/Makefile  |   3 +-
 arch/powerpc/kernel/hw_breakpoint.c   | 159 +
 .../kernel/hw_breakpoint_constraints.c| 162 ++
 4 files changed, 174 insertions(+), 158 deletions(-)
 create mode 100644 arch/powerpc/kernel/hw_breakpoint_constraints.c

diff --git a/arch/powerpc/include/asm/hw_breakpoint.h 
b/arch/powerpc/include/asm/hw_breakpoint.h
index 9b68eafebf43..81872c420476 100644
--- a/arch/powerpc/include/asm/hw_breakpoint.h
+++ b/arch/powerpc/include/asm/hw_breakpoint.h
@@ -10,6 +10,7 @@
 #define _PPC_BOOK3S_64_HW_BREAKPOINT_H
 
 #include 
+#include 
 
 #ifdef __KERNEL__
 struct arch_hw_breakpoint {
@@ -52,6 +53,13 @@ static inline int nr_wp_slots(void)
return cpu_has_feature(CPU_FTR_DAWR1) ? 2 : 1;
 }
 
+bool wp_check_constraints(struct pt_regs *regs, struct ppc_inst instr,
+ unsigned long ea, int type, int size,
+ struct arch_hw_breakpoint *info);
+
+void wp_get_instr_detail(struct pt_regs *regs, struct ppc_inst *instr,
+int *type, int *size, unsigned long *ea);
+
 #ifdef CONFIG_HAVE_HW_BREAKPOINT
 #include 
 #include 
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index cbf41fb4ee89..a5550c2b24c4 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -45,7 +45,8 @@ obj-y := cputable.o syscalls.o \
   signal.o sysfs.o cacheinfo.o time.o \
   prom.o traps.o setup-common.o \
   udbg.o misc.o io.o misc_$(BITS).o \
-  of_platform.o prom_parse.o firmware.o
+  of_platform.o prom_parse.o firmware.o \
+  hw_breakpoint_constraints.o
 obj-y  += ptrace/
 obj-$(CONFIG_PPC64)+= setup_64.o \
   paca.o nvram_64.o note.o syscall_64.o
diff --git a/arch/powerpc/kernel/hw_breakpoint.c 
b/arch/powerpc/kernel/hw_breakpoint.c
index f6b24838ca3c..f4e8f21046f5 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -494,161 +494,6 @@ void thread_change_pc(struct task_struct *tsk, struct 
pt_regs *regs)
}
 }
 
-static bool dar_in_user_range(unsigned long dar, struct arch_hw_breakpoint 
*info)
-{
-   return ((info->address <= dar) && (dar - info->address < info->len));
-}
-
-static bool ea_user_range_overlaps(unsigned long ea, int size,
-  struct arch_hw_breakpoint *info)
-{
-   return ((ea < info->address + info->len) &&
-   (ea + size > info->address));
-}
-
-static bool dar_in_hw_range(unsigned long dar, struct arch_hw_breakpoint *info)
-{
-   unsigned long hw_start_addr, hw_end_addr;
-
-   hw_start_addr = ALIGN_DOWN(info->address, HW_BREAKPOINT_SIZE);
-   hw_end_addr = ALIGN(info->address + info->len, HW_BREAKPOINT_SIZE);
-
-   return ((hw_start_addr <= dar) && (hw_end_addr > dar));
-}
-
-static bool ea_hw_range_overlaps(unsigned long ea, int size,
-struct arch_hw_breakpoint *info)
-{
-   unsigned long hw_start_addr, hw_end_addr;
-   unsigned long align_size = HW_BREAKPOINT_SIZE;
-
-   /*
-* On p10 predecessors, quadword is handled differently than
-* other instructions.
-*/
-   if (!cpu_has_feature(CPU_FTR_ARCH_31) && size == 16)
-   align_size = HW_BREAKPOINT_SIZE_QUADWORD;
-
-   hw_start_addr = ALIGN_DOWN(info->address, align_size);
-   hw_end_addr = ALIGN(info->address + info->len, align_size);
-
-   return ((ea < hw_end_addr) && (ea + size > hw_start_addr));
-}
-
-/*
- * If hw has multiple DAWR registers, we also need to check all
- * dawrx constraint bits to confirm this is _really_ a valid event.
- * If type is UNKNOWN, but privilege level matches, consider it as
- * a positive match.
- */
-static bool check_dawrx_constraints(struct pt_regs *regs, int type,
-   struct arch_hw_breakpoint *info)
-{
-   if (OP_IS_LOAD(type) && !(info->type & HW_BRK_TYPE_READ))
-   return false;
-
-   /*
-* The Cache Management instructions other than dcbz never
-* cause a match. i.e. if type is CACHEOP, the instruction
-* is dcbz, and dcbz is treated as Store.

[PATCH v6 3/8] powerpc/watchpoint/ptrace: Fix SETHWDEBUG when CONFIG_HAVE_HW_BREAKPOINT=N

2020-09-01 Thread Ravi Bangoria
When the kernel is compiled with CONFIG_HAVE_HW_BREAKPOINT=N, the user
can still create watchpoints using PPC_PTRACE_SETHWDEBUG, with limited
functionality. But such watchpoints never fire because of the missing
privilege settings. Fix that.

It's safe to set HW_BRK_TYPE_PRIV_ALL because we don't really leak
any kernel address in signal info. Setting HW_BRK_TYPE_PRIV_ALL will
also help to find scenarios when kernel accesses user memory.

Reported-by: Pedro Miraglia Franco de Carvalho 
Suggested-by: Pedro Miraglia Franco de Carvalho 
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/kernel/ptrace/ptrace-noadv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c 
b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
index 697c7e4b5877..57a0ab822334 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
@@ -217,7 +217,7 @@ long ppc_set_hwdebug(struct task_struct *child, struct 
ppc_hw_breakpoint *bp_inf
return -EIO;
 
brk.address = ALIGN_DOWN(bp_info->addr, HW_BREAKPOINT_SIZE);
-   brk.type = HW_BRK_TYPE_TRANSLATE;
+   brk.type = HW_BRK_TYPE_TRANSLATE | HW_BRK_TYPE_PRIV_ALL;
brk.len = DABR_MAX_LEN;
if (bp_info->trigger_type & PPC_BREAKPOINT_TRIGGER_READ)
brk.type |= HW_BRK_TYPE_READ;
-- 
2.26.2



[PATCH v6 2/8] powerpc/watchpoint: Fix handling of vector instructions

2020-09-01 Thread Ravi Bangoria
Vector load/store instructions are special because they are always
aligned. Thus an unaligned EA needs to be aligned down before comparing
it with the watch ranges. Otherwise we might consider a valid event as
invalid.

Fixes: 74c6881019b7 ("powerpc/watchpoint: Prepare handler to handle more than 
one watchpoint")
Signed-off-by: Ravi Bangoria 
---
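A concrete instance of the fix below, with assumed numbers: a 16-byte VMX
access issued at an unaligned EA.

    /* lvx with ea = 0x1003 really accesses [0x1000, 0x1010) */
    unsigned long ea = 0x1003;
    int size = 16;

    ea &= ~((unsigned long)size - 1);   /* ea == 0x1000 */
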
 arch/powerpc/kernel/hw_breakpoint.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kernel/hw_breakpoint.c 
b/arch/powerpc/kernel/hw_breakpoint.c
index 9f7df1c37233..f6b24838ca3c 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -644,6 +644,8 @@ static void get_instr_detail(struct pt_regs *regs, struct 
ppc_inst *instr,
if (*type == CACHEOP) {
*size = cache_op_size();
*ea &= ~(*size - 1);
+   } else if (*type == LOAD_VMX || *type == STORE_VMX) {
+   *ea &= ~(*size - 1);
}
 }
 
-- 
2.26.2



[PATCH v6 1/8] powerpc/watchpoint: Fix quadword instruction handling on p10 predecessors

2020-09-01 Thread Ravi Bangoria
On p10 predecessors, a watchpoint with quadword access is compared at
quadword length. If the watch range is a doubleword or less and lies
in the first half of a quadword-aligned 16 bytes, and there is an
unaligned quadword access which touches only the 2nd half, the
handler should consider it extraneous and emulate/single-step it
before continuing.

Reported-by: Pedro Miraglia Franco de Carvalho 
Fixes: 74c6881019b7 ("powerpc/watchpoint: Prepare handler to handle more than 
one watchpoint")
Signed-off-by: Ravi Bangoria 
---
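A worked example of the constraint being widened (values assumed for
illustration): watching 8 bytes at 0x1000 on a pre-P10 CPU, with an
unaligned quadword access at 0x1008.

    hw_start_addr = ALIGN_DOWN(0x1000, HW_BREAKPOINT_SIZE_QUADWORD); /* 0x1000 */
    hw_end_addr   = ALIGN(0x1000 + 8, HW_BREAKPOINT_SIZE_QUADWORD);  /* 0x1010 */

    /*
     * ea (0x1008) < 0x1010 and ea + 16 > 0x1000, so the handler sees
     * the event, can classify it as extraneous and single-step past it
     * instead of looping or mis-signalling.
     */
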
 arch/powerpc/include/asm/hw_breakpoint.h |  1 +
 arch/powerpc/kernel/hw_breakpoint.c  | 12 ++--
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/hw_breakpoint.h 
b/arch/powerpc/include/asm/hw_breakpoint.h
index db206a7f38e2..9b68eafebf43 100644
--- a/arch/powerpc/include/asm/hw_breakpoint.h
+++ b/arch/powerpc/include/asm/hw_breakpoint.h
@@ -42,6 +42,7 @@ struct arch_hw_breakpoint {
 #else
 #define HW_BREAKPOINT_SIZE  0x8
 #endif
+#define HW_BREAKPOINT_SIZE_QUADWORD0x10
 
 #define DABR_MAX_LEN   8
 #define DAWR_MAX_LEN   512
diff --git a/arch/powerpc/kernel/hw_breakpoint.c 
b/arch/powerpc/kernel/hw_breakpoint.c
index 1f4a1efa0074..9f7df1c37233 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -520,9 +520,17 @@ static bool ea_hw_range_overlaps(unsigned long ea, int 
size,
 struct arch_hw_breakpoint *info)
 {
unsigned long hw_start_addr, hw_end_addr;
+   unsigned long align_size = HW_BREAKPOINT_SIZE;
 
-   hw_start_addr = ALIGN_DOWN(info->address, HW_BREAKPOINT_SIZE);
-   hw_end_addr = ALIGN(info->address + info->len, HW_BREAKPOINT_SIZE);
+   /*
+* On p10 predecessors, quadword is handled differently than
+* other instructions.
+*/
+   if (!cpu_has_feature(CPU_FTR_ARCH_31) && size == 16)
+   align_size = HW_BREAKPOINT_SIZE_QUADWORD;
+
+   hw_start_addr = ALIGN_DOWN(info->address, align_size);
+   hw_end_addr = ALIGN(info->address + info->len, align_size);
 
return ((ea < hw_end_addr) && (ea + size > hw_start_addr));
 }
-- 
2.26.2



[PATCH v6 0/8] powerpc/watchpoint: Bug fixes plus new feature flag

2020-09-01 Thread Ravi Bangoria
Patch #1 fixes issue for quadword instructions on p10 predecessors.
Patch #2 fixes issue for vector instructions.
Patch #3 fixes a bug about watchpoint not firing when created with
 ptrace PPC_PTRACE_SETHWDEBUG and CONFIG_HAVE_HW_BREAKPOINT=N.
 The fix uses HW_BRK_TYPE_PRIV_ALL for ptrace user which, I
 guess, should be fine because we don't leak any kernel
 addresses and PRIV_ALL will also help to cover scenarios when
 kernel accesses user memory.
Patch #4,#5 fixes infinite exception bug, again the bug happens only
 with CONFIG_HAVE_HW_BREAKPOINT=N.
Patch #6 fixes two places where we fail to set hw_len.
Patch #7 introduce new feature bit PPC_DEBUG_FEATURE_DATA_BP_ARCH_31
 which will be set when running on ISA 3.1 compliant machine.
Patch #8 finally adds selftest to test scenarios fixed by patch#2,#3
 and also moves MODE_EXACT tests outside of BP_RANGE condition.

Christophe, let me know if this series breaks something for 8xx.

v5: 
https://lore.kernel.org/r/20200825043617.1073634-1-ravi.bango...@linux.ibm.com

v5->v6:
 - Fix build failure reported by kernel test robot
 - patch #5. Use more compact if condition, suggested by Christophe


Ravi Bangoria (8):
  powerpc/watchpoint: Fix quadword instruction handling on p10
predecessors
  powerpc/watchpoint: Fix handling of vector instructions
  powerpc/watchpoint/ptrace: Fix SETHWDEBUG when
CONFIG_HAVE_HW_BREAKPOINT=N
  powerpc/watchpoint: Move DAWR detection logic outside of
hw_breakpoint.c
  powerpc/watchpoint: Fix exception handling for
CONFIG_HAVE_HW_BREAKPOINT=N
  powerpc/watchpoint: Add hw_len wherever missing
  powerpc/watchpoint/ptrace: Introduce PPC_DEBUG_FEATURE_DATA_BP_ARCH_31
  powerpc/watchpoint/selftests: Tests for kernel accessing user memory

 Documentation/powerpc/ptrace.rst  |   1 +
 arch/powerpc/include/asm/hw_breakpoint.h  |  12 ++
 arch/powerpc/include/uapi/asm/ptrace.h|   1 +
 arch/powerpc/kernel/Makefile  |   3 +-
 arch/powerpc/kernel/hw_breakpoint.c   | 149 +---
 .../kernel/hw_breakpoint_constraints.c| 162 ++
 arch/powerpc/kernel/process.c |  48 ++
 arch/powerpc/kernel/ptrace/ptrace-noadv.c |   9 +-
 arch/powerpc/xmon/xmon.c  |   1 +
 .../selftests/powerpc/ptrace/ptrace-hwbreak.c |  48 +-
 10 files changed, 282 insertions(+), 152 deletions(-)
 create mode 100644 arch/powerpc/kernel/hw_breakpoint_constraints.c

-- 
2.26.2



Re: [PATCH v5 4/8] powerpc/watchpoint: Move DAWR detection logic outside of hw_breakpoint.c

2020-08-25 Thread Ravi Bangoria

Hi Christophe,


+static int cache_op_size(void)
+{
+#ifdef __powerpc64__
+    return ppc64_caches.l1d.block_size;
+#else
+    return L1_CACHE_BYTES;
+#endif
+}


You've got l1_dcache_bytes() in arch/powerpc/include/asm/cache.h to do that.


+
+void wp_get_instr_detail(struct pt_regs *regs, struct ppc_inst *instr,
+ int *type, int *size, unsigned long *ea)
+{
+    struct instruction_op op;
+
+    if (__get_user_instr_inatomic(*instr, (void __user *)regs->nip))
+    return;
+
+    analyse_instr(&op, regs, *instr);
+    *type = GETTYPE(op.type);
+    *ea = op.ea;
+#ifdef __powerpc64__
+    if (!(regs->msr & MSR_64BIT))
+    *ea &= 0xUL;
+#endif


This #ifdef is unneeded, it should build fine on a 32 bits too.


This patch is just a code movement from one file to another.
I don't really change the logic. Would you mind if I do a
separate patch for these changes (not a part of this series)?

Thanks for review,
Ravi


Re: [PATCH v5 5/8] powerpc/watchpoint: Fix exception handling for CONFIG_HAVE_HW_BREAKPOINT=N

2020-08-25 Thread Ravi Bangoria

Hi Christophe,


diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c 
b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
index 57a0ab822334..866597b407bc 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
@@ -286,11 +286,16 @@ long ppc_del_hwdebug(struct task_struct *child, long data)
  }
  return ret;
  #else /* CONFIG_HAVE_HW_BREAKPOINT */
+    if (child->thread.hw_brk[data - 1].flags & HW_BRK_FLAG_DISABLED)


I think child->thread.hw_brk[data - 1].flags & HW_BRK_FLAG_DISABLED should be
wrapped in additional ()


Not sure I follow.




+    goto del;
+
  if (child->thread.hw_brk[data - 1].address == 0)
  return -ENOENT;


What about replacing the above if by:
 if (!(child->thread.hw_brk[data - 1].flags & HW_BRK_FLAG_DISABLED) &&
     child->thread.hw_brk[data - 1].address == 0)
     return -ENOENT;

okay.. that's more compact.

But more importantly, what I wanted to know is whether CONFIG_HAVE_HW_BREAKPOINT
is set or not in production/distro builds for 8xx. Because I see it's not set in
8xx defconfigs.

Thanks,
Ravi


[PATCH v5 5/8] powerpc/watchpoint: Fix exception handling for CONFIG_HAVE_HW_BREAKPOINT=N

2020-08-24 Thread Ravi Bangoria
On powerpc, ptrace watchpoints work in one-shot mode, i.e. the
kernel disables the event every time it fires and the user has to
re-enable it. Also, in case of a ptrace watchpoint, the kernel
notifies the ptrace user before executing the instruction.

With CONFIG_HAVE_HW_BREAKPOINT=N, the kernel fails to disable the
ptrace event and thus causes an infinite loop of exceptions. This
is especially harmful when the user watches data which is also
read/written by the kernel, eg syscall parameters. In such a case,
the infinite exceptions happen in kernel mode, causing a soft lockup.

Fixes: 9422de3e953d ("powerpc: Hardware breakpoints rewrite to handle non DABR 
breakpoint registers")
Reported-by: Pedro Miraglia Franco de Carvalho 
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/include/asm/hw_breakpoint.h  |  3 ++
 arch/powerpc/kernel/process.c | 48 +++
 arch/powerpc/kernel/ptrace/ptrace-noadv.c |  5 +++
 3 files changed, 56 insertions(+)

diff --git a/arch/powerpc/include/asm/hw_breakpoint.h 
b/arch/powerpc/include/asm/hw_breakpoint.h
index 2eca3dd54b55..c72263214d3f 100644
--- a/arch/powerpc/include/asm/hw_breakpoint.h
+++ b/arch/powerpc/include/asm/hw_breakpoint.h
@@ -18,6 +18,7 @@ struct arch_hw_breakpoint {
u16 type;
u16 len; /* length of the target data symbol */
u16 hw_len; /* length programmed in hw */
+   u8  flags;
 };
 
 /* Note: Don't change the first 6 bits below as they are in the same order
@@ -37,6 +38,8 @@ struct arch_hw_breakpoint {
 #define HW_BRK_TYPE_PRIV_ALL   (HW_BRK_TYPE_USER | HW_BRK_TYPE_KERNEL | \
 HW_BRK_TYPE_HYP)
 
+#define HW_BRK_FLAG_DISABLED   0x1
+
 /* Minimum granularity */
 #ifdef CONFIG_PPC_8xx
 #define HW_BREAKPOINT_SIZE  0x4
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 016bd831908e..160fbbf41d40 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -636,6 +636,44 @@ void do_send_trap(struct pt_regs *regs, unsigned long 
address,
(void __user *)address);
 }
 #else  /* !CONFIG_PPC_ADV_DEBUG_REGS */
+
+static void do_break_handler(struct pt_regs *regs)
+{
+   struct arch_hw_breakpoint null_brk = {0};
+   struct arch_hw_breakpoint *info;
+   struct ppc_inst instr = ppc_inst(0);
+   int type = 0;
+   int size = 0;
+   unsigned long ea;
+   int i;
+
+   /*
+* If the underlying hw supports only one watchpoint, we know it
+* caused the exception. 8xx also falls into this category.
+*/
+   if (nr_wp_slots() == 1) {
+   __set_breakpoint(0, &null_brk);
+   current->thread.hw_brk[0] = null_brk;
+   current->thread.hw_brk[0].flags |= HW_BRK_FLAG_DISABLED;
+   return;
+   }
+
+   /* Otherwise find out which DAWR caused the exception and disable it. */
+   wp_get_instr_detail(regs, &instr, &type, &size, &ea);
+
+   for (i = 0; i < nr_wp_slots(); i++) {
+   info = &current->thread.hw_brk[i];
+   if (!info->address)
+   continue;
+
+   if (wp_check_constraints(regs, instr, ea, type, size, info)) {
+   __set_breakpoint(i, &null_brk);
+   current->thread.hw_brk[i] = null_brk;
+   current->thread.hw_brk[i].flags |= HW_BRK_FLAG_DISABLED;
+   }
+   }
+}
+
 void do_break (struct pt_regs *regs, unsigned long address,
unsigned long error_code)
 {
@@ -647,6 +685,16 @@ void do_break (struct pt_regs *regs, unsigned long address,
if (debugger_break_match(regs))
return;
 
+   /*
+* We reach here only when watchpoint exception is generated by ptrace
+* event (or hw is buggy!). Now if CONFIG_HAVE_HW_BREAKPOINT is set,
+* watchpoint is already handled by hw_breakpoint_handler() so we don't
+* have to do anything. But when CONFIG_HAVE_HW_BREAKPOINT is not set,
+* we need to manually handle the watchpoint here.
+*/
+   if (!IS_ENABLED(CONFIG_HAVE_HW_BREAKPOINT))
+   do_break_handler(regs);
+
/* Deliver the signal to userspace */
force_sig_fault(SIGTRAP, TRAP_HWBKPT, (void __user *)address);
 }
diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c 
b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
index 57a0ab822334..866597b407bc 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
@@ -286,11 +286,16 @@ long ppc_del_hwdebug(struct task_struct *child, long data)
}
return ret;
 #else /* CONFIG_HAVE_HW_BREAKPOINT */
+   if (child->thread.hw_brk[data - 1].flags & HW_BRK_FLAG_DISABLED)
+   goto del;
+
if (child->thread.hw_brk[data - 1].address == 0)
return -ENOENT;
 
+del:
child->thread.hw_brk[data - 1].address = 0;
   

[PATCH v5 8/8] powerpc/watchpoint/selftests: Tests for kernel accessing user memory

2020-08-24 Thread Ravi Bangoria
Introduce tests to cover simple scenarios where the user is watching
memory which can be accessed by the kernel as well. We also support
_MODE_EXACT with the _SETHWDEBUG interface. Move those testcases
outside of the _BP_RANGE condition. This will help to test _MODE_EXACT
scenarios when CONFIG_HAVE_HW_BREAKPOINT is not set, eg:

  $ ./ptrace-hwbreak
  ...
  PTRACE_SET_DEBUGREG, Kernel Access Userspace, len: 8: Ok
  PPC_PTRACE_SETHWDEBUG, MODE_EXACT, WO, len: 1: Ok
  PPC_PTRACE_SETHWDEBUG, MODE_EXACT, RO, len: 1: Ok
  PPC_PTRACE_SETHWDEBUG, MODE_EXACT, RW, len: 1: Ok
  PPC_PTRACE_SETHWDEBUG, MODE_EXACT, Kernel Access Userspace, len: 1: Ok
  success: ptrace-hwbreak

Suggested-by: Pedro Miraglia Franco de Carvalho 
Signed-off-by: Ravi Bangoria 
---
 .../selftests/powerpc/ptrace/ptrace-hwbreak.c | 48 ++-
 1 file changed, 46 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c
index fc477dfe86a2..2e0d86e0687e 100644
--- a/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c
@@ -20,6 +20,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include "ptrace.h"
 
 #define SPRN_PVR   0x11F
@@ -44,6 +46,7 @@ struct gstruct {
 };
 static volatile struct gstruct gstruct __attribute__((aligned(512)));
 
+static volatile char cwd[PATH_MAX] __attribute__((aligned(8)));
 
 static void get_dbginfo(pid_t child_pid, struct ppc_debug_info *dbginfo)
 {
@@ -138,6 +141,9 @@ static void test_workload(void)
write_var(len);
}
 
+   /* PTRACE_SET_DEBUGREG, Kernel Access Userspace test */
+   syscall(__NR_getcwd, &cwd, PATH_MAX);
+
/* PPC_PTRACE_SETHWDEBUG, MODE_EXACT, WO test */
write_var(1);
 
@@ -150,6 +156,9 @@ static void test_workload(void)
else
read_var(1);
 
+   /* PPC_PTRACE_SETHWDEBUG, MODE_EXACT, Kernel Access Userspace test */
+   syscall(__NR_getcwd, &cwd, PATH_MAX);
+
/* PPC_PTRACE_SETHWDEBUG, MODE_RANGE, DW ALIGNED, WO test */
gstruct.a[rand() % A_LEN] = 'a';
 
@@ -293,6 +302,24 @@ static int test_set_debugreg(pid_t child_pid)
return 0;
 }
 
+static int test_set_debugreg_kernel_userspace(pid_t child_pid)
+{
+   unsigned long wp_addr = (unsigned long)cwd;
+   char *name = "PTRACE_SET_DEBUGREG";
+
+   /* PTRACE_SET_DEBUGREG, Kernel Access Userspace test */
+   wp_addr &= ~0x7UL;
+   wp_addr |= (1Ul << DABR_READ_SHIFT);
+   wp_addr |= (1UL << DABR_WRITE_SHIFT);
+   wp_addr |= (1UL << DABR_TRANSLATION_SHIFT);
+   ptrace_set_debugreg(child_pid, wp_addr);
+   ptrace(PTRACE_CONT, child_pid, NULL, 0);
+   check_success(child_pid, name, "Kernel Access Userspace", wp_addr, 8);
+
+   ptrace_set_debugreg(child_pid, 0);
+   return 0;
+}
+
 static void get_ppc_hw_breakpoint(struct ppc_hw_breakpoint *info, int type,
  unsigned long addr, int len)
 {
@@ -338,6 +365,22 @@ static void test_sethwdebug_exact(pid_t child_pid)
ptrace_delhwdebug(child_pid, wh);
 }
 
+static void test_sethwdebug_exact_kernel_userspace(pid_t child_pid)
+{
+   struct ppc_hw_breakpoint info;
+   unsigned long wp_addr = (unsigned long)&cwd;
+   char *name = "PPC_PTRACE_SETHWDEBUG, MODE_EXACT";
+   int len = 1; /* hardcoded in kernel */
+   int wh;
+
+   /* PPC_PTRACE_SETHWDEBUG, MODE_EXACT, Kernel Access Userspace test */
+   get_ppc_hw_breakpoint(&info, PPC_BREAKPOINT_TRIGGER_WRITE, wp_addr, 0);
+   wh = ptrace_sethwdebug(child_pid, &info);
+   ptrace(PTRACE_CONT, child_pid, NULL, 0);
+   check_success(child_pid, name, "Kernel Access Userspace", wp_addr, len);
+   ptrace_delhwdebug(child_pid, wh);
+}
+
 static void test_sethwdebug_range_aligned(pid_t child_pid)
 {
struct ppc_hw_breakpoint info;
@@ -452,9 +495,10 @@ static void
 run_tests(pid_t child_pid, struct ppc_debug_info *dbginfo, bool dawr)
 {
test_set_debugreg(child_pid);
+   test_set_debugreg_kernel_userspace(child_pid);
+   test_sethwdebug_exact(child_pid);
+   test_sethwdebug_exact_kernel_userspace(child_pid);
if (dbginfo->features & PPC_DEBUG_FEATURE_DATA_BP_RANGE) {
-   test_sethwdebug_exact(child_pid);
-
test_sethwdebug_range_aligned(child_pid);
if (dawr || is_8xx) {
test_sethwdebug_range_unaligned(child_pid);
-- 
2.26.2



[PATCH v5 7/8] powerpc/watchpoint/ptrace: Introduce PPC_DEBUG_FEATURE_DATA_BP_ARCH_31

2020-08-24 Thread Ravi Bangoria
PPC_DEBUG_FEATURE_DATA_BP_ARCH_31 can be used to determine whether
we are running on an ISA 3.1 compliant machine, which is needed to
determine DAR behaviour, the 512-byte boundary limit, etc. This was
requested by Pedro Miraglia Franco de Carvalho for extending
watchpoint features in gdb. Note that availability of 2nd DAWR is
independent of this flag and should be checked using
ppc_debug_info->num_data_bps.
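
For context, a minimal user-side sketch of how a debugger could test the new
bit; PPC_PTRACE_GETHWDBGINFO and struct ppc_debug_info are the existing uapi,
the helper itself is illustrative:

#include <sys/ptrace.h>
#include <sys/types.h>
#include <asm/ptrace.h>		/* struct ppc_debug_info */

/* Sketch: query debug features of a traced child. */
static int is_isa_3_1(pid_t child_pid)
{
	struct ppc_debug_info dbginfo;

	if (ptrace(PPC_PTRACE_GETHWDBGINFO, child_pid, 0, &dbginfo))
		return 0;
	/* 2nd DAWR availability is reported separately via num_data_bps. */
	return !!(dbginfo.features & PPC_DEBUG_FEATURE_DATA_BP_ARCH_31);
}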

Signed-off-by: Ravi Bangoria 
---
 Documentation/powerpc/ptrace.rst  | 1 +
 arch/powerpc/include/uapi/asm/ptrace.h| 1 +
 arch/powerpc/kernel/ptrace/ptrace-noadv.c | 2 ++
 3 files changed, 4 insertions(+)

diff --git a/Documentation/powerpc/ptrace.rst b/Documentation/powerpc/ptrace.rst
index 864d4b61..77725d69eb4a 100644
--- a/Documentation/powerpc/ptrace.rst
+++ b/Documentation/powerpc/ptrace.rst
@@ -46,6 +46,7 @@ features will have bits indicating whether there is support 
for::
   #define PPC_DEBUG_FEATURE_DATA_BP_RANGE  0x4
   #define PPC_DEBUG_FEATURE_DATA_BP_MASK   0x8
   #define PPC_DEBUG_FEATURE_DATA_BP_DAWR   0x10
+  #define PPC_DEBUG_FEATURE_DATA_BP_ARCH_31  0x20
 
 2. PTRACE_SETHWDEBUG
 
diff --git a/arch/powerpc/include/uapi/asm/ptrace.h 
b/arch/powerpc/include/uapi/asm/ptrace.h
index f5f1ccc740fc..7004cfea3f5f 100644
--- a/arch/powerpc/include/uapi/asm/ptrace.h
+++ b/arch/powerpc/include/uapi/asm/ptrace.h
@@ -222,6 +222,7 @@ struct ppc_debug_info {
 #define PPC_DEBUG_FEATURE_DATA_BP_RANGE    0x0004
 #define PPC_DEBUG_FEATURE_DATA_BP_MASK 0x0008
 #define PPC_DEBUG_FEATURE_DATA_BP_DAWR 0x0010
+#define PPC_DEBUG_FEATURE_DATA_BP_ARCH_31  0x0020
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c 
b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
index 081c39842d84..1d0235db3c1b 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
@@ -57,6 +57,8 @@ void ppc_gethwdinfo(struct ppc_debug_info *dbginfo)
} else {
dbginfo->features = 0;
}
+   if (cpu_has_feature(CPU_FTR_ARCH_31))
+   dbginfo->features |= PPC_DEBUG_FEATURE_DATA_BP_ARCH_31;
 }
 
 int ptrace_get_debugreg(struct task_struct *child, unsigned long addr,
-- 
2.26.2



[PATCH v5 6/8] powerpc/watchpoint: Add hw_len wherever missing

2020-08-24 Thread Ravi Bangoria
There are a couple of places where we set len but not hw_len. For
ptrace/perf watchpoints, when CONFIG_HAVE_HW_BREAKPOINT=Y, hw_len
is calculated and set internally while parsing the watchpoint.
But when CONFIG_HAVE_HW_BREAKPOINT=N, we need to set 'hw_len'
manually. Similarly for xmon, hw_len needs to be set directly.

Fixes: b57aeab811db ("powerpc/watchpoint: Fix length calculation for unaligned 
target")
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/kernel/ptrace/ptrace-noadv.c | 1 +
 arch/powerpc/xmon/xmon.c  | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c 
b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
index 866597b407bc..081c39842d84 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
@@ -219,6 +219,7 @@ long ppc_set_hwdebug(struct task_struct *child, struct 
ppc_hw_breakpoint *bp_inf
brk.address = ALIGN_DOWN(bp_info->addr, HW_BREAKPOINT_SIZE);
brk.type = HW_BRK_TYPE_TRANSLATE | HW_BRK_TYPE_PRIV_ALL;
brk.len = DABR_MAX_LEN;
+   brk.hw_len = DABR_MAX_LEN;
if (bp_info->trigger_type & PPC_BREAKPOINT_TRIGGER_READ)
brk.type |= HW_BRK_TYPE_READ;
if (bp_info->trigger_type & PPC_BREAKPOINT_TRIGGER_WRITE)
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index df7bca00f5ec..55c43a6c9111 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -969,6 +969,7 @@ static void insert_cpu_bpts(void)
brk.address = dabr[i].address;
brk.type = (dabr[i].enabled & HW_BRK_TYPE_DABR) | 
HW_BRK_TYPE_PRIV_ALL;
brk.len = 8;
+   brk.hw_len = 8;
	__set_breakpoint(i, &brk);
}
}
-- 
2.26.2



[PATCH v5 4/8] powerpc/watchpoint: Move DAWR detection logic outside of hw_breakpoint.c

2020-08-24 Thread Ravi Bangoria
Power10 hw has multiple DAWRs but doesn't tell which DAWR caused
the exception, so we have sw logic to detect that in hw_breakpoint.c.
But hw_breakpoint.c gets compiled only with CONFIG_HAVE_HW_BREAKPOINT=Y.
Move the DAWR detection logic outside of hw_breakpoint.c so that it can
be reused when CONFIG_HAVE_HW_BREAKPOINT is not set.

Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/include/asm/hw_breakpoint.h  |   8 +
 arch/powerpc/kernel/Makefile  |   3 +-
 arch/powerpc/kernel/hw_breakpoint.c   | 159 +
 .../kernel/hw_breakpoint_constraints.c| 162 ++
 4 files changed, 174 insertions(+), 158 deletions(-)
 create mode 100644 arch/powerpc/kernel/hw_breakpoint_constraints.c

diff --git a/arch/powerpc/include/asm/hw_breakpoint.h 
b/arch/powerpc/include/asm/hw_breakpoint.h
index da38e05e04d9..2eca3dd54b55 100644
--- a/arch/powerpc/include/asm/hw_breakpoint.h
+++ b/arch/powerpc/include/asm/hw_breakpoint.h
@@ -10,6 +10,7 @@
 #define _PPC_BOOK3S_64_HW_BREAKPOINT_H
 
 #include 
+#include 
 
 #ifdef __KERNEL__
 struct arch_hw_breakpoint {
@@ -52,6 +53,13 @@ static inline int nr_wp_slots(void)
return cpu_has_feature(CPU_FTR_DAWR1) ? 2 : 1;
 }
 
+bool wp_check_constraints(struct pt_regs *regs, struct ppc_inst instr,
+ unsigned long ea, int type, int size,
+ struct arch_hw_breakpoint *info);
+
+void wp_get_instr_detail(struct pt_regs *regs, struct ppc_inst *instr,
+int *type, int *size, unsigned long *ea);
+
 #ifdef CONFIG_HAVE_HW_BREAKPOINT
 #include 
 #include 
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index cbf41fb4ee89..a5550c2b24c4 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -45,7 +45,8 @@ obj-y := cputable.o syscalls.o \
   signal.o sysfs.o cacheinfo.o time.o \
   prom.o traps.o setup-common.o \
   udbg.o misc.o io.o misc_$(BITS).o \
-  of_platform.o prom_parse.o firmware.o
+  of_platform.o prom_parse.o firmware.o \
+  hw_breakpoint_constraints.o
 obj-y  += ptrace/
 obj-$(CONFIG_PPC64)+= setup_64.o \
   paca.o nvram_64.o note.o syscall_64.o
diff --git a/arch/powerpc/kernel/hw_breakpoint.c 
b/arch/powerpc/kernel/hw_breakpoint.c
index f6b24838ca3c..f4e8f21046f5 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -494,161 +494,6 @@ void thread_change_pc(struct task_struct *tsk, struct 
pt_regs *regs)
}
 }
 
-static bool dar_in_user_range(unsigned long dar, struct arch_hw_breakpoint 
*info)
-{
-   return ((info->address <= dar) && (dar - info->address < info->len));
-}
-
-static bool ea_user_range_overlaps(unsigned long ea, int size,
-  struct arch_hw_breakpoint *info)
-{
-   return ((ea < info->address + info->len) &&
-   (ea + size > info->address));
-}
-
-static bool dar_in_hw_range(unsigned long dar, struct arch_hw_breakpoint *info)
-{
-   unsigned long hw_start_addr, hw_end_addr;
-
-   hw_start_addr = ALIGN_DOWN(info->address, HW_BREAKPOINT_SIZE);
-   hw_end_addr = ALIGN(info->address + info->len, HW_BREAKPOINT_SIZE);
-
-   return ((hw_start_addr <= dar) && (hw_end_addr > dar));
-}
-
-static bool ea_hw_range_overlaps(unsigned long ea, int size,
-struct arch_hw_breakpoint *info)
-{
-   unsigned long hw_start_addr, hw_end_addr;
-   unsigned long align_size = HW_BREAKPOINT_SIZE;
-
-   /*
-* On p10 predecessors, quadword is handled differently than
-* other instructions.
-*/
-   if (!cpu_has_feature(CPU_FTR_ARCH_31) && size == 16)
-   align_size = HW_BREAKPOINT_SIZE_QUADWORD;
-
-   hw_start_addr = ALIGN_DOWN(info->address, align_size);
-   hw_end_addr = ALIGN(info->address + info->len, align_size);
-
-   return ((ea < hw_end_addr) && (ea + size > hw_start_addr));
-}
-
-/*
- * If hw has multiple DAWR registers, we also need to check all
- * dawrx constraint bits to confirm this is _really_ a valid event.
- * If type is UNKNOWN, but privilege level matches, consider it as
- * a positive match.
- */
-static bool check_dawrx_constraints(struct pt_regs *regs, int type,
-   struct arch_hw_breakpoint *info)
-{
-   if (OP_IS_LOAD(type) && !(info->type & HW_BRK_TYPE_READ))
-   return false;
-
-   /*
-* The Cache Management instructions other than dcbz never
-* cause a match. i.e. if type is CACHEOP, the instruction
-* is dcbz, and dcbz is tr

[PATCH v5 3/8] powerpc/watchpoint/ptrace: Fix SETHWDEBUG when CONFIG_HAVE_HW_BREAKPOINT=N

2020-08-24 Thread Ravi Bangoria
When the kernel is compiled with CONFIG_HAVE_HW_BREAKPOINT=N, the user
can still create a watchpoint using PPC_PTRACE_SETHWDEBUG, with limited
functionality. But such watchpoints never fire because of the missing
privilege settings. Fix that.

It's safe to set HW_BRK_TYPE_PRIV_ALL because we don't really leak
any kernel address in the signal info. Setting HW_BRK_TYPE_PRIV_ALL
will also help to find scenarios where the kernel corrupts user memory.

Reported-by: Pedro Miraglia Franco de Carvalho 
Suggested-by: Pedro Miraglia Franco de Carvalho 
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/kernel/ptrace/ptrace-noadv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c 
b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
index 697c7e4b5877..57a0ab822334 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
@@ -217,7 +217,7 @@ long ppc_set_hwdebug(struct task_struct *child, struct 
ppc_hw_breakpoint *bp_inf
return -EIO;
 
brk.address = ALIGN_DOWN(bp_info->addr, HW_BREAKPOINT_SIZE);
-   brk.type = HW_BRK_TYPE_TRANSLATE;
+   brk.type = HW_BRK_TYPE_TRANSLATE | HW_BRK_TYPE_PRIV_ALL;
brk.len = DABR_MAX_LEN;
if (bp_info->trigger_type & PPC_BREAKPOINT_TRIGGER_READ)
brk.type |= HW_BRK_TYPE_READ;
-- 
2.26.2



[PATCH v5 2/8] powerpc/watchpoint: Fix handling of vector instructions

2020-08-24 Thread Ravi Bangoria
Vector instructions are special because they are always aligned.
Thus an unaligned EA needs to be aligned down before comparing it
with the watch ranges; otherwise we might consider a valid event
as invalid.
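
A worked micro-example of the fix, with illustrative numbers:

#include <stdio.h>

/* Sketch: aligning a VMX access EA down before the range comparison. */
int main(void)
{
	unsigned long ea = 0x100cUL;	/* unaligned EA of a 16-byte VMX load */
	int size = 16;			/* vector accesses are quadword sized */

	ea &= ~(unsigned long)(size - 1);	/* -> 0x1000 */
	/* A watch range of [0x1000, 0x1008) overlaps the aligned access,
	 * so the event is now correctly treated as valid. */
	printf("aligned ea = 0x%lx\n", ea);
	return 0;
}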

Fixes: 74c6881019b7 ("powerpc/watchpoint: Prepare handler to handle more than 
one watchpoint")
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/kernel/hw_breakpoint.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kernel/hw_breakpoint.c 
b/arch/powerpc/kernel/hw_breakpoint.c
index 9f7df1c37233..f6b24838ca3c 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -644,6 +644,8 @@ static void get_instr_detail(struct pt_regs *regs, struct 
ppc_inst *instr,
if (*type == CACHEOP) {
*size = cache_op_size();
*ea &= ~(*size - 1);
+   } else if (*type == LOAD_VMX || *type == STORE_VMX) {
+   *ea &= ~(*size - 1);
}
 }
 
-- 
2.26.2



[PATCH v5 1/8] powerpc/watchpoint: Fix quadword instruction handling on p10 predecessors

2020-08-24 Thread Ravi Bangoria
On p10 predecessors, a watchpoint hit by a quadword access is compared
at quadword length. If the watch range is a doubleword or less and sits
in the first half of a quadword-aligned 16 bytes, and there is an
unaligned quadword access which touches only the 2nd half, the handler
should consider it extraneous and emulate/single-step it before
continuing.
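
As a hedged illustration with made-up numbers, mirroring the widened
comparison window that ea_hw_range_overlaps() gets in the diff below:

#include <stdio.h>

#define ALIGN_DOWN(x, a)	((x) & ~((unsigned long)(a) - 1))
#define ALIGN(x, a)		ALIGN_DOWN((x) + (a) - 1, (a))

int main(void)
{
	/* Watchpoint on the first doubleword of a 16-byte-aligned block. */
	unsigned long addr = 0x1000, len = 8;
	/* Pre-p10 hw compares quadword accesses at quadword granularity. */
	unsigned long align = 16;
	unsigned long hw_start = ALIGN_DOWN(addr, align);	/* 0x1000 */
	unsigned long hw_end = ALIGN(addr + len, align);	/* 0x1010 */
	/* An unaligned quadword access at 0x1008 misses the watched range
	 * but still falls inside the hw comparison window, so the handler
	 * must treat it as extraneous and single-step it. */
	unsigned long ea = 0x1008, size = 16;

	printf("hw match: %d, user match: %d\n",
	       ea < hw_end && ea + size > hw_start,	/* prints 1 */
	       ea < addr + len && ea + size > addr);	/* prints 0 */
	return 0;
}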

Reported-by: Pedro Miraglia Franco de Carvalho 
Fixes: 74c6881019b7 ("powerpc/watchpoint: Prepare handler to handle more than 
one watchpoint")
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/include/asm/hw_breakpoint.h |  3 ++-
 arch/powerpc/kernel/hw_breakpoint.c  | 12 ++--
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/hw_breakpoint.h 
b/arch/powerpc/include/asm/hw_breakpoint.h
index db206a7f38e2..da38e05e04d9 100644
--- a/arch/powerpc/include/asm/hw_breakpoint.h
+++ b/arch/powerpc/include/asm/hw_breakpoint.h
@@ -40,7 +40,8 @@ struct arch_hw_breakpoint {
 #ifdef CONFIG_PPC_8xx
 #define HW_BREAKPOINT_SIZE  0x4
 #else
-#define HW_BREAKPOINT_SIZE  0x8
+#define HW_BREAKPOINT_SIZE 0x8
+#define HW_BREAKPOINT_SIZE_QUADWORD    0x10
 #endif
 
 #define DABR_MAX_LEN   8
diff --git a/arch/powerpc/kernel/hw_breakpoint.c 
b/arch/powerpc/kernel/hw_breakpoint.c
index 1f4a1efa0074..9f7df1c37233 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -520,9 +520,17 @@ static bool ea_hw_range_overlaps(unsigned long ea, int 
size,
 struct arch_hw_breakpoint *info)
 {
unsigned long hw_start_addr, hw_end_addr;
+   unsigned long align_size = HW_BREAKPOINT_SIZE;
 
-   hw_start_addr = ALIGN_DOWN(info->address, HW_BREAKPOINT_SIZE);
-   hw_end_addr = ALIGN(info->address + info->len, HW_BREAKPOINT_SIZE);
+   /*
+    * On p10 predecessors, quadword is handled differently than
+* other instructions.
+*/
+   if (!cpu_has_feature(CPU_FTR_ARCH_31) && size == 16)
+   align_size = HW_BREAKPOINT_SIZE_QUADWORD;
+
+   hw_start_addr = ALIGN_DOWN(info->address, align_size);
+   hw_end_addr = ALIGN(info->address + info->len, align_size);
 
return ((ea < hw_end_addr) && (ea + size > hw_start_addr));
 }
-- 
2.26.2



[PATCH v5 0/8] powerpc/watchpoint: Bug fixes plus new feature flag

2020-08-24 Thread Ravi Bangoria
Patch #1 fixes an issue with quadword instructions on p10 predecessors.
Patch #2 fixes an issue with vector instructions.
Patch #3 fixes a bug about watchpoint not firing when created with
 ptrace PPC_PTRACE_SETHWDEBUG and CONFIG_HAVE_HW_BREAKPOINT=N.
 The fix uses HW_BRK_TYPE_PRIV_ALL for ptrace user which, I
 guess, should be fine because we don't leak any kernel
 addresses and PRIV_ALL will also help to cover scenarios when
 kernel accesses user memory.
Patch #4,#5 fixes infinite exception bug, again the bug happens only
 with CONFIG_HAVE_HW_BREAKPOINT=N.
Patch #6 fixes two places where we fail to set hw_len.
Patch #7 introduce new feature bit PPC_DEBUG_FEATURE_DATA_BP_ARCH_31
 which will be set when running on ISA 3.1 compliant machine.
Patch #8 finally adds selftest to test scenarios fixed by patch#2,#3
 and also moves MODE_EXACT tests outside of BP_RANGE condition.

Christophe, let me know if this series breaks something for 8xx.

v4: 
https://lore.kernel.org/r/20200817102330.777537-1-ravi.bango...@linux.ibm.com/

v4->v5:
 - Patch #1 and #2 are new. These bug happen irrespective of
   CONFIG_HAVE_HW_BREAKPOINT.
 - Patch #3 to #8 are carry forwarded from v4
 - Rebased to powerpc/next

Ravi Bangoria (8):
  powerpc/watchpoint: Fix quadword instruction handling on p10
predecessors
  powerpc/watchpoint: Fix handling of vector instructions
  powerpc/watchpoint/ptrace: Fix SETHWDEBUG when
CONFIG_HAVE_HW_BREAKPOINT=N
  powerpc/watchpoint: Move DAWR detection logic outside of
hw_breakpoint.c
  powerpc/watchpoint: Fix exception handling for
CONFIG_HAVE_HW_BREAKPOINT=N
  powerpc/watchpoint: Add hw_len wherever missing
  powerpc/watchpoint/ptrace: Introduce PPC_DEBUG_FEATURE_DATA_BP_ARCH_31
  powerpc/watchpoint/selftests: Tests for kernel accessing user memory

 Documentation/powerpc/ptrace.rst  |   1 +
 arch/powerpc/include/asm/hw_breakpoint.h  |  14 +-
 arch/powerpc/include/uapi/asm/ptrace.h|   1 +
 arch/powerpc/kernel/Makefile  |   3 +-
 arch/powerpc/kernel/hw_breakpoint.c   | 149 +---
 .../kernel/hw_breakpoint_constraints.c| 162 ++
 arch/powerpc/kernel/process.c |  48 ++
 arch/powerpc/kernel/ptrace/ptrace-noadv.c |  10 +-
 arch/powerpc/xmon/xmon.c  |   1 +
 .../selftests/powerpc/ptrace/ptrace-hwbreak.c |  48 +-
 10 files changed, 285 insertions(+), 152 deletions(-)
 create mode 100644 arch/powerpc/kernel/hw_breakpoint_constraints.c

-- 
2.26.2



[PATCH v4 6/6] powerpc/watchpoint/selftests: Tests for kernel accessing user memory

2020-08-17 Thread Ravi Bangoria
Introduce tests to cover simple scenarios where the user is watching
memory which can be accessed by the kernel as well. We also support
_MODE_EXACT with the _SETHWDEBUG interface. Move those testcases
outside of the _BP_RANGE condition. This will help to test _MODE_EXACT
scenarios when CONFIG_HAVE_HW_BREAKPOINT is not set, eg:

  $ ./ptrace-hwbreak
  ...
  PTRACE_SET_DEBUGREG, Kernel Access Userspace, len: 8: Ok
  PPC_PTRACE_SETHWDEBUG, MODE_EXACT, WO, len: 1: Ok
  PPC_PTRACE_SETHWDEBUG, MODE_EXACT, RO, len: 1: Ok
  PPC_PTRACE_SETHWDEBUG, MODE_EXACT, RW, len: 1: Ok
  PPC_PTRACE_SETHWDEBUG, MODE_EXACT, Kernel Access Userspace, len: 1: Ok
  success: ptrace-hwbreak

Suggested-by: Pedro Miraglia Franco de Carvalho 
Signed-off-by: Ravi Bangoria 
---
 .../selftests/powerpc/ptrace/ptrace-hwbreak.c | 48 ++-
 1 file changed, 46 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c
index fc477dfe86a2..2e0d86e0687e 100644
--- a/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c
@@ -20,6 +20,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include "ptrace.h"
 
 #define SPRN_PVR   0x11F
@@ -44,6 +46,7 @@ struct gstruct {
 };
 static volatile struct gstruct gstruct __attribute__((aligned(512)));
 
+static volatile char cwd[PATH_MAX] __attribute__((aligned(8)));
 
 static void get_dbginfo(pid_t child_pid, struct ppc_debug_info *dbginfo)
 {
@@ -138,6 +141,9 @@ static void test_workload(void)
write_var(len);
}
 
+   /* PTRACE_SET_DEBUGREG, Kernel Access Userspace test */
+   syscall(__NR_getcwd, &cwd, PATH_MAX);
+
/* PPC_PTRACE_SETHWDEBUG, MODE_EXACT, WO test */
write_var(1);
 
@@ -150,6 +156,9 @@ static void test_workload(void)
else
read_var(1);
 
+   /* PPC_PTRACE_SETHWDEBUG, MODE_EXACT, Kernel Access Userspace test */
+   syscall(__NR_getcwd, &cwd, PATH_MAX);
+
/* PPC_PTRACE_SETHWDEBUG, MODE_RANGE, DW ALIGNED, WO test */
gstruct.a[rand() % A_LEN] = 'a';
 
@@ -293,6 +302,24 @@ static int test_set_debugreg(pid_t child_pid)
return 0;
 }
 
+static int test_set_debugreg_kernel_userspace(pid_t child_pid)
+{
+   unsigned long wp_addr = (unsigned long)cwd;
+   char *name = "PTRACE_SET_DEBUGREG";
+
+   /* PTRACE_SET_DEBUGREG, Kernel Access Userspace test */
+   wp_addr &= ~0x7UL;
+   wp_addr |= (1Ul << DABR_READ_SHIFT);
+   wp_addr |= (1UL << DABR_WRITE_SHIFT);
+   wp_addr |= (1UL << DABR_TRANSLATION_SHIFT);
+   ptrace_set_debugreg(child_pid, wp_addr);
+   ptrace(PTRACE_CONT, child_pid, NULL, 0);
+   check_success(child_pid, name, "Kernel Access Userspace", wp_addr, 8);
+
+   ptrace_set_debugreg(child_pid, 0);
+   return 0;
+}
+
 static void get_ppc_hw_breakpoint(struct ppc_hw_breakpoint *info, int type,
  unsigned long addr, int len)
 {
@@ -338,6 +365,22 @@ static void test_sethwdebug_exact(pid_t child_pid)
ptrace_delhwdebug(child_pid, wh);
 }
 
+static void test_sethwdebug_exact_kernel_userspace(pid_t child_pid)
+{
+   struct ppc_hw_breakpoint info;
+   unsigned long wp_addr = (unsigned long)&cwd;
+   char *name = "PPC_PTRACE_SETHWDEBUG, MODE_EXACT";
+   int len = 1; /* hardcoded in kernel */
+   int wh;
+
+   /* PPC_PTRACE_SETHWDEBUG, MODE_EXACT, Kernel Access Userspace test */
+   get_ppc_hw_breakpoint(&info, PPC_BREAKPOINT_TRIGGER_WRITE, wp_addr, 0);
+   wh = ptrace_sethwdebug(child_pid, &info);
+   ptrace(PTRACE_CONT, child_pid, NULL, 0);
+   check_success(child_pid, name, "Kernel Access Userspace", wp_addr, len);
+   ptrace_delhwdebug(child_pid, wh);
+}
+
 static void test_sethwdebug_range_aligned(pid_t child_pid)
 {
struct ppc_hw_breakpoint info;
@@ -452,9 +495,10 @@ static void
 run_tests(pid_t child_pid, struct ppc_debug_info *dbginfo, bool dawr)
 {
test_set_debugreg(child_pid);
+   test_set_debugreg_kernel_userspace(child_pid);
+   test_sethwdebug_exact(child_pid);
+   test_sethwdebug_exact_kernel_userspace(child_pid);
if (dbginfo->features & PPC_DEBUG_FEATURE_DATA_BP_RANGE) {
-   test_sethwdebug_exact(child_pid);
-
test_sethwdebug_range_aligned(child_pid);
if (dawr || is_8xx) {
test_sethwdebug_range_unaligned(child_pid);
-- 
2.26.2



[PATCH v4 5/6] powerpc/watchpoint/ptrace: Introduce PPC_DEBUG_FEATURE_DATA_BP_ARCH_31

2020-08-17 Thread Ravi Bangoria
PPC_DEBUG_FEATURE_DATA_BP_ARCH_31 can be used to determine whether
we are running on an ISA 3.1 compliant machine, which is needed to
determine DAR behaviour, the 512-byte boundary limit, etc. This was
requested by Pedro Miraglia Franco de Carvalho for extending
watchpoint features in gdb. Note that availability of 2nd DAWR is
independent of this flag and should be checked using
ppc_debug_info->num_data_bps.

Signed-off-by: Ravi Bangoria 
---
 Documentation/powerpc/ptrace.rst  | 1 +
 arch/powerpc/include/uapi/asm/ptrace.h| 1 +
 arch/powerpc/kernel/ptrace/ptrace-noadv.c | 2 ++
 3 files changed, 4 insertions(+)

diff --git a/Documentation/powerpc/ptrace.rst b/Documentation/powerpc/ptrace.rst
index 864d4b61..77725d69eb4a 100644
--- a/Documentation/powerpc/ptrace.rst
+++ b/Documentation/powerpc/ptrace.rst
@@ -46,6 +46,7 @@ features will have bits indicating whether there is support 
for::
   #define PPC_DEBUG_FEATURE_DATA_BP_RANGE  0x4
   #define PPC_DEBUG_FEATURE_DATA_BP_MASK   0x8
   #define PPC_DEBUG_FEATURE_DATA_BP_DAWR   0x10
+  #define PPC_DEBUG_FEATURE_DATA_BP_ARCH_31  0x20
 
 2. PTRACE_SETHWDEBUG
 
diff --git a/arch/powerpc/include/uapi/asm/ptrace.h 
b/arch/powerpc/include/uapi/asm/ptrace.h
index f5f1ccc740fc..7004cfea3f5f 100644
--- a/arch/powerpc/include/uapi/asm/ptrace.h
+++ b/arch/powerpc/include/uapi/asm/ptrace.h
@@ -222,6 +222,7 @@ struct ppc_debug_info {
 #define PPC_DEBUG_FEATURE_DATA_BP_RANGE    0x0004
 #define PPC_DEBUG_FEATURE_DATA_BP_MASK 0x0008
 #define PPC_DEBUG_FEATURE_DATA_BP_DAWR 0x0010
+#define PPC_DEBUG_FEATURE_DATA_BP_ARCH_31  0x0020
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c 
b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
index 081c39842d84..1d0235db3c1b 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
@@ -57,6 +57,8 @@ void ppc_gethwdinfo(struct ppc_debug_info *dbginfo)
} else {
dbginfo->features = 0;
}
+   if (cpu_has_feature(CPU_FTR_ARCH_31))
+   dbginfo->features |= PPC_DEBUG_FEATURE_DATA_BP_ARCH_31;
 }
 
 int ptrace_get_debugreg(struct task_struct *child, unsigned long addr,
-- 
2.26.2



[PATCH v4 4/6] powerpc/watchpoint: Add hw_len wherever missing

2020-08-17 Thread Ravi Bangoria
There are a couple of places where we set len but not hw_len. For
ptrace/perf watchpoints, when CONFIG_HAVE_HW_BREAKPOINT=Y, hw_len
is calculated and set internally while parsing the watchpoint.
But when CONFIG_HAVE_HW_BREAKPOINT=N, we need to set 'hw_len'
manually. Similarly for xmon, hw_len needs to be set directly.

Fixes: b57aeab811db ("powerpc/watchpoint: Fix length calculation for unaligned 
target")
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/kernel/ptrace/ptrace-noadv.c | 1 +
 arch/powerpc/xmon/xmon.c  | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c 
b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
index 866597b407bc..081c39842d84 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
@@ -219,6 +219,7 @@ long ppc_set_hwdebug(struct task_struct *child, struct 
ppc_hw_breakpoint *bp_inf
brk.address = ALIGN_DOWN(bp_info->addr, HW_BREAKPOINT_SIZE);
brk.type = HW_BRK_TYPE_TRANSLATE | HW_BRK_TYPE_PRIV_ALL;
brk.len = DABR_MAX_LEN;
+   brk.hw_len = DABR_MAX_LEN;
if (bp_info->trigger_type & PPC_BREAKPOINT_TRIGGER_READ)
brk.type |= HW_BRK_TYPE_READ;
if (bp_info->trigger_type & PPC_BREAKPOINT_TRIGGER_WRITE)
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index df7bca00f5ec..55c43a6c9111 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -969,6 +969,7 @@ static void insert_cpu_bpts(void)
brk.address = dabr[i].address;
brk.type = (dabr[i].enabled & HW_BRK_TYPE_DABR) | 
HW_BRK_TYPE_PRIV_ALL;
brk.len = 8;
+   brk.hw_len = 8;
	__set_breakpoint(i, &brk);
}
}
-- 
2.26.2



[PATCH v4 3/6] powerpc/watchpoint: Fix exception handling for CONFIG_HAVE_HW_BREAKPOINT=N

2020-08-17 Thread Ravi Bangoria
On powerpc, ptrace watchpoints work in one-shot mode, i.e. the kernel
disables the event every time it fires and the user has to re-enable
it. Also, in case of a ptrace watchpoint, the kernel notifies the
ptrace user before executing the instruction.

With CONFIG_HAVE_HW_BREAKPOINT=N, the kernel fails to disable the
ptrace event and thus causes an infinite loop of exceptions. This is
especially harmful when the user watches data which is also
read/written by the kernel, eg syscall parameters. In such a case, the
infinite exceptions happen in kernel mode, which causes a soft-lockup.

Fixes: 9422de3e953d ("powerpc: Hardware breakpoints rewrite to handle non DABR 
breakpoint registers")
Reported-by: Pedro Miraglia Franco de Carvalho 
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/include/asm/hw_breakpoint.h  |  3 ++
 arch/powerpc/kernel/process.c | 48 +++
 arch/powerpc/kernel/ptrace/ptrace-noadv.c |  5 +++
 3 files changed, 56 insertions(+)

diff --git a/arch/powerpc/include/asm/hw_breakpoint.h 
b/arch/powerpc/include/asm/hw_breakpoint.h
index f71f08a7e2e0..90d5b3a9f433 100644
--- a/arch/powerpc/include/asm/hw_breakpoint.h
+++ b/arch/powerpc/include/asm/hw_breakpoint.h
@@ -18,6 +18,7 @@ struct arch_hw_breakpoint {
u16 type;
u16 len; /* length of the target data symbol */
u16 hw_len; /* length programmed in hw */
+   u8  flags;
 };
 
 /* Note: Don't change the first 6 bits below as they are in the same order
@@ -37,6 +38,8 @@ struct arch_hw_breakpoint {
 #define HW_BRK_TYPE_PRIV_ALL   (HW_BRK_TYPE_USER | HW_BRK_TYPE_KERNEL | \
 HW_BRK_TYPE_HYP)
 
+#define HW_BRK_FLAG_DISABLED   0x1
+
 /* Minimum granularity */
 #ifdef CONFIG_PPC_8xx
 #define HW_BREAKPOINT_SIZE  0x4
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index e4fcc817c46c..cab6febe6eb6 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -636,6 +636,44 @@ void do_send_trap(struct pt_regs *regs, unsigned long 
address,
(void __user *)address);
 }
 #else  /* !CONFIG_PPC_ADV_DEBUG_REGS */
+
+static void do_break_handler(struct pt_regs *regs)
+{
+   struct arch_hw_breakpoint null_brk = {0};
+   struct arch_hw_breakpoint *info;
+   struct ppc_inst instr = ppc_inst(0);
+   int type = 0;
+   int size = 0;
+   unsigned long ea;
+   int i;
+
+   /*
+* If the underlying hw supports only one watchpoint, we know it
+* caused the exception. 8xx also falls into this category.
+*/
+   if (nr_wp_slots() == 1) {
+   __set_breakpoint(0, &null_brk);
+   current->thread.hw_brk[0] = null_brk;
+   current->thread.hw_brk[0].flags |= HW_BRK_FLAG_DISABLED;
+   return;
+   }
+
+   /* Otherwise find out which DAWR caused the exception and disable it. */
+   wp_get_instr_detail(regs, &instr, &type, &size, &ea);
+
+   for (i = 0; i < nr_wp_slots(); i++) {
+   info = &current->thread.hw_brk[i];
+   if (!info->address)
+   continue;
+
+   if (wp_check_constraints(regs, instr, ea, type, size, info)) {
+   __set_breakpoint(i, &null_brk);
+   current->thread.hw_brk[i] = null_brk;
+   current->thread.hw_brk[i].flags |= HW_BRK_FLAG_DISABLED;
+   }
+   }
+}
+
 void do_break (struct pt_regs *regs, unsigned long address,
unsigned long error_code)
 {
@@ -647,6 +685,16 @@ void do_break (struct pt_regs *regs, unsigned long address,
if (debugger_break_match(regs))
return;
 
+   /*
+* We reach here only when watchpoint exception is generated by ptrace
+* event (or hw is buggy!). Now if CONFIG_HAVE_HW_BREAKPOINT is set,
+* watchpoint is already handled by hw_breakpoint_handler() so we don't
+* have to do anything. But when CONFIG_HAVE_HW_BREAKPOINT is not set,
+* we need to manually handle the watchpoint here.
+*/
+   if (!IS_ENABLED(CONFIG_HAVE_HW_BREAKPOINT))
+   do_break_handler(regs);
+
/* Deliver the signal to userspace */
force_sig_fault(SIGTRAP, TRAP_HWBKPT, (void __user *)address);
 }
diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c 
b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
index 57a0ab822334..866597b407bc 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
@@ -286,11 +286,16 @@ long ppc_del_hwdebug(struct task_struct *child, long data)
}
return ret;
 #else /* CONFIG_HAVE_HW_BREAKPOINT */
+   if (child->thread.hw_brk[data - 1].flags & HW_BRK_FLAG_DISABLED)
+   goto del;
+
if (child->thread.hw_brk[data - 1].address == 0)
return -ENOENT;
 
+del:
child->thread.hw_brk[data - 1].address = 0;
   

[PATCH v4 1/6] powerpc/watchpoint/ptrace: Fix SETHWDEBUG when CONFIG_HAVE_HW_BREAKPOINT=N

2020-08-17 Thread Ravi Bangoria
When the kernel is compiled with CONFIG_HAVE_HW_BREAKPOINT=N, the user
can still create a watchpoint using PPC_PTRACE_SETHWDEBUG, with limited
functionality. But such watchpoints never fire because of the missing
privilege settings. Fix that.

It's safe to set HW_BRK_TYPE_PRIV_ALL because we don't really leak
any kernel address in the signal info. Setting HW_BRK_TYPE_PRIV_ALL
will also help to find scenarios where the kernel corrupts user memory.

Reported-by: Pedro Miraglia Franco de Carvalho 
Suggested-by: Pedro Miraglia Franco de Carvalho 
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/kernel/ptrace/ptrace-noadv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c 
b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
index 697c7e4b5877..57a0ab822334 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
@@ -217,7 +217,7 @@ long ppc_set_hwdebug(struct task_struct *child, struct 
ppc_hw_breakpoint *bp_inf
return -EIO;
 
brk.address = ALIGN_DOWN(bp_info->addr, HW_BREAKPOINT_SIZE);
-   brk.type = HW_BRK_TYPE_TRANSLATE;
+   brk.type = HW_BRK_TYPE_TRANSLATE | HW_BRK_TYPE_PRIV_ALL;
brk.len = DABR_MAX_LEN;
if (bp_info->trigger_type & PPC_BREAKPOINT_TRIGGER_READ)
brk.type |= HW_BRK_TYPE_READ;
-- 
2.26.2



[PATCH v4 2/6] powerpc/watchpoint: Move DAWR detection logic outside of hw_breakpoint.c

2020-08-17 Thread Ravi Bangoria
Power10 hw has multiple DAWRs but doesn't tell which DAWR caused
the exception, so we have sw logic to detect that in hw_breakpoint.c.
But hw_breakpoint.c gets compiled only with CONFIG_HAVE_HW_BREAKPOINT=Y.
Move the DAWR detection logic outside of hw_breakpoint.c so that it can
be reused when CONFIG_HAVE_HW_BREAKPOINT is not set.

Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/include/asm/hw_breakpoint.h  |   8 +
 arch/powerpc/kernel/Makefile  |   3 +-
 arch/powerpc/kernel/hw_breakpoint.c   | 149 +
 .../kernel/hw_breakpoint_constraints.c| 152 ++
 4 files changed, 164 insertions(+), 148 deletions(-)
 create mode 100644 arch/powerpc/kernel/hw_breakpoint_constraints.c

diff --git a/arch/powerpc/include/asm/hw_breakpoint.h 
b/arch/powerpc/include/asm/hw_breakpoint.h
index db206a7f38e2..f71f08a7e2e0 100644
--- a/arch/powerpc/include/asm/hw_breakpoint.h
+++ b/arch/powerpc/include/asm/hw_breakpoint.h
@@ -10,6 +10,7 @@
 #define _PPC_BOOK3S_64_HW_BREAKPOINT_H
 
 #include 
+#include 
 
 #ifdef __KERNEL__
 struct arch_hw_breakpoint {
@@ -51,6 +52,13 @@ static inline int nr_wp_slots(void)
return cpu_has_feature(CPU_FTR_DAWR1) ? 2 : 1;
 }
 
+bool wp_check_constraints(struct pt_regs *regs, struct ppc_inst instr,
+ unsigned long ea, int type, int size,
+ struct arch_hw_breakpoint *info);
+
+void wp_get_instr_detail(struct pt_regs *regs, struct ppc_inst *instr,
+int *type, int *size, unsigned long *ea);
+
 #ifdef CONFIG_HAVE_HW_BREAKPOINT
 #include 
 #include 
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index d4d5946224f8..286f85a103de 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -45,7 +45,8 @@ obj-y := cputable.o syscalls.o \
   signal.o sysfs.o cacheinfo.o time.o \
   prom.o traps.o setup-common.o \
   udbg.o misc.o io.o misc_$(BITS).o \
-  of_platform.o prom_parse.o firmware.o
+  of_platform.o prom_parse.o firmware.o \
+  hw_breakpoint_constraints.o
 obj-y  += ptrace/
 obj-$(CONFIG_PPC64)+= setup_64.o \
   paca.o nvram_64.o note.o syscall_64.o
diff --git a/arch/powerpc/kernel/hw_breakpoint.c 
b/arch/powerpc/kernel/hw_breakpoint.c
index 1f4a1efa0074..f4e8f21046f5 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -494,151 +494,6 @@ void thread_change_pc(struct task_struct *tsk, struct 
pt_regs *regs)
}
 }
 
-static bool dar_in_user_range(unsigned long dar, struct arch_hw_breakpoint 
*info)
-{
-   return ((info->address <= dar) && (dar - info->address < info->len));
-}
-
-static bool ea_user_range_overlaps(unsigned long ea, int size,
-  struct arch_hw_breakpoint *info)
-{
-   return ((ea < info->address + info->len) &&
-   (ea + size > info->address));
-}
-
-static bool dar_in_hw_range(unsigned long dar, struct arch_hw_breakpoint *info)
-{
-   unsigned long hw_start_addr, hw_end_addr;
-
-   hw_start_addr = ALIGN_DOWN(info->address, HW_BREAKPOINT_SIZE);
-   hw_end_addr = ALIGN(info->address + info->len, HW_BREAKPOINT_SIZE);
-
-   return ((hw_start_addr <= dar) && (hw_end_addr > dar));
-}
-
-static bool ea_hw_range_overlaps(unsigned long ea, int size,
-struct arch_hw_breakpoint *info)
-{
-   unsigned long hw_start_addr, hw_end_addr;
-
-   hw_start_addr = ALIGN_DOWN(info->address, HW_BREAKPOINT_SIZE);
-   hw_end_addr = ALIGN(info->address + info->len, HW_BREAKPOINT_SIZE);
-
-   return ((ea < hw_end_addr) && (ea + size > hw_start_addr));
-}
-
-/*
- * If hw has multiple DAWR registers, we also need to check all
- * dawrx constraint bits to confirm this is _really_ a valid event.
- * If type is UNKNOWN, but privilege level matches, consider it as
- * a positive match.
- */
-static bool check_dawrx_constraints(struct pt_regs *regs, int type,
-   struct arch_hw_breakpoint *info)
-{
-   if (OP_IS_LOAD(type) && !(info->type & HW_BRK_TYPE_READ))
-   return false;
-
-   /*
-* The Cache Management instructions other than dcbz never
-* cause a match. i.e. if type is CACHEOP, the instruction
-* is dcbz, and dcbz is treated as Store.
-*/
-   if ((OP_IS_STORE(type) || type == CACHEOP) && !(info->type & 
HW_BRK_TYPE_WRITE))
-   return false;
-
-   if (is_kernel_addr(regs->nip) && !(info->type & HW_BRK_TYPE_KERNEL))
-   return false;

[PATCH v4 0/6] powerpc/watchpoint: Bug fixes plus new feature flag

2020-08-17 Thread Ravi Bangoria
Patch #1 fixes a bug about watchpoint not firing when created with
 ptrace PPC_PTRACE_SETHWDEBUG and CONFIG_HAVE_HW_BREAKPOINT=N.
 The fix uses HW_BRK_TYPE_PRIV_ALL for ptrace user which, I
 guess, should be fine because we don't leak any kernel
 addresses and PRIV_ALL will also help to cover scenarios when
 kernel accesses user memory.
Patch #2,#3 fixes infinite exception bug, again the bug happens only
 with CONFIG_HAVE_HW_BREAKPOINT=N.
Patch #4 fixes two places where we fail to set hw_len.
Patch #5 introduce new feature bit PPC_DEBUG_FEATURE_DATA_BP_ARCH_31
 which will be set when running on ISA 3.1 compliant machine.
Patch #6 finally adds selftest to test scenarios fixed by patch#2,#3
 and also moves MODE_EXACT tests outside of BP_RANGE condition.

Christophe, let me know if this series breaks something for 8xx.

v3: 
https://lore.kernel.org/r/20200805062750.290289-1-ravi.bango...@linux.ibm.com

v3->v4:
  - Patch #1. Allow HW_BRK_TYPE_PRIV_ALL instead of HW_BRK_TYPE_USER
  - Patch #2, #3, #4 and #6 are new.
  - Rebased to powerpc/next.

Ravi Bangoria (6):
  powerpc/watchpoint/ptrace: Fix SETHWDEBUG when
CONFIG_HAVE_HW_BREAKPOINT=N
  powerpc/watchpoint: Move DAWR detection logic outside of
hw_breakpoint.c
  powerpc/watchpoint: Fix exception handling for
CONFIG_HAVE_HW_BREAKPOINT=N
  powerpc/watchpoint: Add hw_len wherever missing
  powerpc/watchpoint/ptrace: Introduce PPC_DEBUG_FEATURE_DATA_BP_ARCH_31
  powerpc/watchpoint/selftests: Tests for kernel accessing user memory

 Documentation/powerpc/ptrace.rst  |   1 +
 arch/powerpc/include/asm/hw_breakpoint.h  |  11 ++
 arch/powerpc/include/uapi/asm/ptrace.h|   1 +
 arch/powerpc/kernel/Makefile  |   3 +-
 arch/powerpc/kernel/hw_breakpoint.c   | 149 +
 .../kernel/hw_breakpoint_constraints.c| 152 ++
 arch/powerpc/kernel/process.c |  48 ++
 arch/powerpc/kernel/ptrace/ptrace-noadv.c |  10 +-
 arch/powerpc/xmon/xmon.c  |   1 +
 .../selftests/powerpc/ptrace/ptrace-hwbreak.c |  48 +-
 10 files changed, 273 insertions(+), 151 deletions(-)
 create mode 100644 arch/powerpc/kernel/hw_breakpoint_constraints.c

-- 
2.26.2



[PATCH v3 2/2] powerpc/watchpoint/ptrace: Introduce PPC_DEBUG_FEATURE_DATA_BP_ARCH_31

2020-08-05 Thread Ravi Bangoria
PPC_DEBUG_FEATURE_DATA_BP_ARCH_31 can be used to determine whether
we are running on an ISA 3.1 compliant machine, which is needed to
determine DAR behaviour, the 512-byte boundary limit, etc. This was
requested by Pedro Miraglia Franco de Carvalho for extending
watchpoint features in gdb. Note that availability of 2nd DAWR is
independent of this flag and should be checked using
ppc_debug_info->num_data_bps.

Signed-off-by: Ravi Bangoria 
---
 Documentation/powerpc/ptrace.rst  | 1 +
 arch/powerpc/include/uapi/asm/ptrace.h| 1 +
 arch/powerpc/kernel/ptrace/ptrace-noadv.c | 2 ++
 3 files changed, 4 insertions(+)

diff --git a/Documentation/powerpc/ptrace.rst b/Documentation/powerpc/ptrace.rst
index 864d4b61..77725d69eb4a 100644
--- a/Documentation/powerpc/ptrace.rst
+++ b/Documentation/powerpc/ptrace.rst
@@ -46,6 +46,7 @@ features will have bits indicating whether there is support 
for::
   #define PPC_DEBUG_FEATURE_DATA_BP_RANGE  0x4
   #define PPC_DEBUG_FEATURE_DATA_BP_MASK   0x8
   #define PPC_DEBUG_FEATURE_DATA_BP_DAWR   0x10
+  #define PPC_DEBUG_FEATURE_DATA_BP_ARCH_31  0x20
 
 2. PTRACE_SETHWDEBUG
 
diff --git a/arch/powerpc/include/uapi/asm/ptrace.h 
b/arch/powerpc/include/uapi/asm/ptrace.h
index f5f1ccc740fc..7004cfea3f5f 100644
--- a/arch/powerpc/include/uapi/asm/ptrace.h
+++ b/arch/powerpc/include/uapi/asm/ptrace.h
@@ -222,6 +222,7 @@ struct ppc_debug_info {
 #define PPC_DEBUG_FEATURE_DATA_BP_RANGE    0x0004
 #define PPC_DEBUG_FEATURE_DATA_BP_MASK 0x0008
 #define PPC_DEBUG_FEATURE_DATA_BP_DAWR 0x0010
+#define PPC_DEBUG_FEATURE_DATA_BP_ARCH_31  0x0020
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c 
b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
index bf94ab837735..d5be73ab9e74 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
@@ -57,6 +57,8 @@ void ppc_gethwdinfo(struct ppc_debug_info *dbginfo)
} else {
dbginfo->features = 0;
}
+   if (cpu_has_feature(CPU_FTR_ARCH_31))
+   dbginfo->features |= PPC_DEBUG_FEATURE_DATA_BP_ARCH_31;
 }
 
 int ptrace_get_debugreg(struct task_struct *child, unsigned long addr,
-- 
2.26.2



[PATCH v3 1/2] powerpc/watchpoint/ptrace: Fix SETHWDEBUG when CONFIG_HAVE_HW_BREAKPOINT=N

2020-08-05 Thread Ravi Bangoria
When the kernel is compiled with CONFIG_HAVE_HW_BREAKPOINT=N, the user
can still create a watchpoint using PPC_PTRACE_SETHWDEBUG, with limited
functionality. But such watchpoints never fire because of the missing
privilege settings. Fix that.

Reported-by: Pedro Miraglia Franco de Carvalho 
Suggested-by: Pedro Miraglia Franco de Carvalho 
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/kernel/ptrace/ptrace-noadv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c 
b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
index 697c7e4b5877..bf94ab837735 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
@@ -217,7 +217,7 @@ long ppc_set_hwdebug(struct task_struct *child, struct 
ppc_hw_breakpoint *bp_inf
return -EIO;
 
brk.address = ALIGN_DOWN(bp_info->addr, HW_BREAKPOINT_SIZE);
-   brk.type = HW_BRK_TYPE_TRANSLATE;
+   brk.type = HW_BRK_TYPE_TRANSLATE | HW_BRK_TYPE_USER;
brk.len = DABR_MAX_LEN;
if (bp_info->trigger_type & PPC_BREAKPOINT_TRIGGER_READ)
brk.type |= HW_BRK_TYPE_READ;
-- 
2.26.2



[PATCH v3 0/2] powerpc/watchpoint/ptrace: Bug fix plus new feature flag

2020-08-05 Thread Ravi Bangoria
Patch #1 fixes a bug when a watchpoint is created with ptrace
PPC_PTRACE_SETHWDEBUG and CONFIG_HAVE_HW_BREAKPOINT=N.
Patch #2 introduces a new feature bit
PPC_DEBUG_FEATURE_DATA_BP_ARCH_31 which will be set when
running on an ISA 3.1 compliant machine.

v2: 
https://lore.kernel.org/r/20200723093330.306341-1-ravi.bango...@linux.ibm.com

v2->v3:
 - #1: is new
 - #2: Set PPC_DEBUG_FEATURE_DATA_BP_ARCH_31 unconditionally
   while running on p10.
 - Rebased to powerpc/next (cf1ae052e073)

Ravi Bangoria (2):
  powerpc/watchpoint/ptrace: Fix SETHWDEBUG when
CONFIG_HAVE_HW_BREAKPOINT=N
  powerpc/watchpoint/ptrace: Introduce PPC_DEBUG_FEATURE_DATA_BP_ARCH_31

 Documentation/powerpc/ptrace.rst  | 1 +
 arch/powerpc/include/uapi/asm/ptrace.h| 1 +
 arch/powerpc/kernel/ptrace/ptrace-noadv.c | 4 +++-
 3 files changed, 5 insertions(+), 1 deletion(-)

-- 
2.26.2



Re: [PATCH V4 0/4] powerpc/perf: Add support for perf extended regs in powerpc

2020-07-27 Thread Ravi Bangoria




On 7/26/20 8:15 PM, Athira Rajeev wrote:

This patch set adds support for the perf extended register capability in
powerpc. The capability flag PERF_PMU_CAP_EXTENDED_REGS is used to
indicate a PMU which supports extended registers. The generic code
defines the mask of extended registers as 0 for unsupported architectures.

Patches 1 and 2 are the kernel side changes needed to include
base support for extended regs in powerpc and in power10.
Patches 3 and 4 are the perf tools side changes needed to support the
extended registers.

Patch 1/4 defines the PERF_PMU_CAP_EXTENDED_REGS mask to output the
values of mmcr0, mmcr1, mmcr2 for POWER9, and defines `PERF_REG_EXTENDED_MASK`
at runtime, which contains the mask value of the supported registers under
extended regs.

Patch 2/4 adds the extended regs support for power10 and exposes the
MMCR3, SIER2, SIER3 registers as part of extended regs.

Patch 3/4 and 4/4 adds extended regs to sample_reg_mask in the tool
side to use with `-I?` option for power9 and power10 respectively.

Ravi Bangoria found an issue with `perf record -I` while testing the
changes. The same issue is currently being worked on here:
https://lkml.org/lkml/2020/7/19/413 and will be resolved once the fix
from Jin Yao is merged.

This patch series is based on powerpc/next


Apart from the issue with patch #1, the series LGTM..

Reviewed-and-Tested-by: Ravi Bangoria 



Re: [PATCH V4 1/4] powerpc/perf: Add support for outputting extended regs in perf intr_regs

2020-07-27 Thread Ravi Bangoria

Hi Athira,


+/* Function to return the extended register values */
+static u64 get_ext_regs_value(int idx)
+{
+   switch (idx) {
+   case PERF_REG_POWERPC_MMCR0:
+   return mfspr(SPRN_MMCR0);
+   case PERF_REG_POWERPC_MMCR1:
+   return mfspr(SPRN_MMCR1);
+   case PERF_REG_POWERPC_MMCR2:
+   return mfspr(SPRN_MMCR2);
+   default: return 0;
+   }
+}
+
  u64 perf_reg_value(struct pt_regs *regs, int idx)
  {
-   if (WARN_ON_ONCE(idx >= PERF_REG_POWERPC_MAX))
-   return 0;
+   u64 perf_reg_extended_max = 0;


This should be initialized to PERF_REG_POWERPC_MAX. Defaulting to 0 will
always trigger the WARN_ON_ONCE(idx >= perf_reg_extended_max) below on
non-p9/p10 systems.


+
+   if (cpu_has_feature(CPU_FTR_ARCH_300))
+   perf_reg_extended_max = PERF_REG_MAX_ISA_300;
  
  	if (idx == PERF_REG_POWERPC_SIER &&

   (IS_ENABLED(CONFIG_FSL_EMB_PERF_EVENT) ||
@@ -85,6 +103,16 @@ u64 perf_reg_value(struct pt_regs *regs, int idx)
IS_ENABLED(CONFIG_PPC32)))
return 0;
  
+	if (idx >= PERF_REG_POWERPC_MAX && idx < perf_reg_extended_max)

+   return get_ext_regs_value(idx);
+
+   /*
+* If the idx is referring to value beyond the
+* supported registers, return 0 with a warning
+*/
+   if (WARN_ON_ONCE(idx >= perf_reg_extended_max))
+   return 0;
+
return regs_get_register(regs, pt_regs_offset[idx]);
  }
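
Spelled out, the suggested fix is just the initializer; a rough sketch
against the quoted hunk (eliding the SIER special case), not a tested patch:

u64 perf_reg_value(struct pt_regs *regs, int idx)
{
	/* Default to the base register set so the bounds check below
	 * still fires on CPUs without ARCH_300 extended regs. */
	u64 perf_reg_extended_max = PERF_REG_POWERPC_MAX;

	if (cpu_has_feature(CPU_FTR_ARCH_300))
		perf_reg_extended_max = PERF_REG_MAX_ISA_300;

	if (idx >= PERF_REG_POWERPC_MAX && idx < perf_reg_extended_max)
		return get_ext_regs_value(idx);

	if (WARN_ON_ONCE(idx >= perf_reg_extended_max))
		return 0;

	return regs_get_register(regs, pt_regs_offset[idx]);
}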


Ravi


Re: [v3 12/15] powerpc/perf: Add support for outputting extended regs in perf intr_regs

2020-07-24 Thread Ravi Bangoria

Hi Athira,


+/* Function to return the extended register values */
+static u64 get_ext_regs_value(int idx)
+{
+   switch (idx) {
+   case PERF_REG_POWERPC_MMCR0:
+   return mfspr(SPRN_MMCR0);
+   case PERF_REG_POWERPC_MMCR1:
+   return mfspr(SPRN_MMCR1);
+   case PERF_REG_POWERPC_MMCR2:
+   return mfspr(SPRN_MMCR2);
+   default: return 0;
+   }
+}
+
  u64 perf_reg_value(struct pt_regs *regs, int idx)
  {
-   if (WARN_ON_ONCE(idx >= PERF_REG_POWERPC_MAX))
-   return 0;
+   u64 PERF_REG_EXTENDED_MAX;


PERF_REG_EXTENDED_MAX should be initialized, otherwise ...


+
+   if (cpu_has_feature(CPU_FTR_ARCH_300))
+   PERF_REG_EXTENDED_MAX = PERF_REG_MAX_ISA_300;
  
  	if (idx == PERF_REG_POWERPC_SIER &&

   (IS_ENABLED(CONFIG_FSL_EMB_PERF_EVENT) ||
@@ -85,6 +103,16 @@ u64 perf_reg_value(struct pt_regs *regs, int idx)
IS_ENABLED(CONFIG_PPC32)))
return 0;
  
+	if (idx >= PERF_REG_POWERPC_MAX && idx < PERF_REG_EXTENDED_MAX)

+   return get_ext_regs_value(idx);


On a non-p9/p10 machine, PERF_REG_EXTENDED_MAX may contain a random value,
which can let the user pass this if condition unintentionally.

Nit: PERF_REG_EXTENDED_MAX is a local variable, so it should be in lowercase.
Any specific reason to define it in capitals?

Ravi


Re: [v3 13/15] tools/perf: Add perf tools support for extended register capability in powerpc

2020-07-24 Thread Ravi Bangoria

Hi Athira,

On 7/17/20 8:08 PM, Athira Rajeev wrote:

From: Anju T Sudhakar 

Add extended regs to sample_reg_mask on the tool side for use
with the `-I?` option. The perf tools side uses the extended mask to
display the platform-supported register names (with the -I? option) to
the user and also sends this mask to the kernel to capture the extended
registers in each sample. Hence the mask value is decided based on the
processor version.

Currently the definitions for `mfspr` and `SPRN_PVR` are part of
`arch/powerpc/util/header.c`. Move them to a header file so that
they can be re-used in other source files as well.


It seems this patch has a regression.

Without this patch:

  $ sudo ./perf record -I
  ^C[ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.458 MB perf.data (318 samples) ]

With this patch:

  $ sudo ./perf record -I
  Error:
  dummy:HG: PMU Hardware doesn't support sampling/overflow-interrupts. Try 
'perf stat'

Ravi


Re: [PATCH 0/7] powerpc/watchpoint: 2nd DAWR kvm enablement + selftests

2020-07-23 Thread Ravi Bangoria




On 7/23/20 3:50 PM, Ravi Bangoria wrote:

Patch #1, #2 and #3 enables p10 2nd DAWR feature for Book3S kvm guest. DAWR
is a hypervisor resource and thus H_SET_MODE hcall is used to set/unset it.
A new case H_SET_MODE_RESOURCE_SET_DAWR1 is introduced in H_SET_MODE hcall
for setting/unsetting 2nd DAWR. Also, new capability KVM_CAP_PPC_DAWR1 has
been added to query 2nd DAWR support via kvm ioctl.

This feature also needs to be enabled in Qemu to really use it. I'll reply
link to qemu patches once I post them in qemu-devel mailing list.


Qemu patches: 
https://lore.kernel.org/kvm/20200723104220.314671-1-ravi.bango...@linux.ibm.com
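
For reference, a minimal userspace sketch of probing the new capability;
KVM_CHECK_EXTENSION is the standard ioctl, KVM_CAP_PPC_DAWR1 is the
capability added by this series:

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

int main(void)
{
	int kvm = open("/dev/kvm", O_RDWR);

	if (kvm < 0)
		return 1;
	/* Returns > 0 when the host kvm supports a second DAWR. */
	printf("KVM_CAP_PPC_DAWR1: %d\n",
	       ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_PPC_DAWR1));
	return 0;
}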


[PATCH 7/7] powerpc/selftests: Add selftest to test concurrent perf/ptrace events

2020-07-23 Thread Ravi Bangoria
ptrace and perf watchpoints can't co-exist if their address ranges
overlap. See commit 29da4f91c0c1 ("powerpc/watchpoint: Don't allow
concurrent perf and ptrace events") for more detail. Add selftest
for the same.

Sample o/p:
  # ./ptrace-perf-hwbreak
  test: ptrace-perf-hwbreak
  tags: git_version:powerpc-5.8-7-118-g937fa174a15d-dirty
  perf cpu event -> ptrace thread event (Overlapping): Ok
  perf cpu event -> ptrace thread event (Non-overlapping): Ok
  perf thread event -> ptrace same thread event (Overlapping): Ok
  perf thread event -> ptrace same thread event (Non-overlapping): Ok
  perf thread event -> ptrace other thread event: Ok
  ptrace thread event -> perf kernel event: Ok
  ptrace thread event -> perf same thread event (Overlapping): Ok
  ptrace thread event -> perf same thread event (Non-overlapping): Ok
  ptrace thread event -> perf other thread event: Ok
  ptrace thread event -> perf cpu event (Overlapping): Ok
  ptrace thread event -> perf cpu event (Non-overlapping): Ok
  ptrace thread event -> perf same thread & cpu event (Overlapping): Ok
  ptrace thread event -> perf same thread & cpu event (Non-overlapping): Ok
  ptrace thread event -> perf other thread & cpu event: Ok
  success: ptrace-perf-hwbreak

Signed-off-by: Ravi Bangoria 
---
 .../selftests/powerpc/ptrace/.gitignore   |   1 +
 .../testing/selftests/powerpc/ptrace/Makefile |   2 +-
 .../powerpc/ptrace/ptrace-perf-hwbreak.c  | 659 ++
 3 files changed, 661 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-perf-hwbreak.c

diff --git a/tools/testing/selftests/powerpc/ptrace/.gitignore 
b/tools/testing/selftests/powerpc/ptrace/.gitignore
index 0e96150b7c7e..eb75e5360e31 100644
--- a/tools/testing/selftests/powerpc/ptrace/.gitignore
+++ b/tools/testing/selftests/powerpc/ptrace/.gitignore
@@ -14,3 +14,4 @@ perf-hwbreak
 core-pkey
 ptrace-pkey
 ptrace-syscall
+ptrace-perf-hwbreak
diff --git a/tools/testing/selftests/powerpc/ptrace/Makefile 
b/tools/testing/selftests/powerpc/ptrace/Makefile
index 8d3f006c98cc..a500639da97a 100644
--- a/tools/testing/selftests/powerpc/ptrace/Makefile
+++ b/tools/testing/selftests/powerpc/ptrace/Makefile
@@ -2,7 +2,7 @@
 TEST_GEN_PROGS := ptrace-gpr ptrace-tm-gpr ptrace-tm-spd-gpr \
   ptrace-tar ptrace-tm-tar ptrace-tm-spd-tar ptrace-vsx 
ptrace-tm-vsx \
   ptrace-tm-spd-vsx ptrace-tm-spr ptrace-hwbreak ptrace-pkey 
core-pkey \
-  perf-hwbreak ptrace-syscall
+  perf-hwbreak ptrace-syscall ptrace-perf-hwbreak
 
 top_srcdir = ../../../../..
 include ../../lib.mk
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-perf-hwbreak.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-perf-hwbreak.c
new file mode 100644
index ..6b8804a4942e
--- /dev/null
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-perf-hwbreak.c
@@ -0,0 +1,659 @@
+// SPDX-License-Identifier: GPL-2.0+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "ptrace.h"
+
+char data[16];
+
+/* Overlapping address range */
+volatile __u64 *ptrace_data1 = (__u64 *)&data[0];
+volatile __u64 *perf_data1 = (__u64 *)&data[4];
+
+/* Non-overlapping address range */
+volatile __u64 *ptrace_data2 = (__u64 *)&data[0];
+volatile __u64 *perf_data2 = (__u64 *)&data[8];
+
+static unsigned long pid_max_addr(void)
+{
+   FILE *fp;
+   char *line, *c;
+   char addr[100];
+   size_t len = 0;
+
+   fp = fopen("/proc/kallsyms", "r");
+   if (!fp) {
+   printf("Failed to read /proc/kallsyms. Exiting..\n");
+   exit(EXIT_FAILURE);
+   }
+
+   while (getline(&line, &len, fp) != -1) {
+   if (!strstr(line, "pid_max") || strstr(line, "pid_max_max") ||
+   strstr(line, "pid_max_min"))
+   continue;
+
+   strncpy(addr, line, len < 100 ? len : 100);
+   c = strchr(addr, ' ');
+   *c = '\0';
+   return strtoul(addr, &c, 16);
+   }
+   fclose(fp);
+   printf("Could not find pix_max. Exiting..\n");
+   exit(EXIT_FAILURE);
+   return -1;
+}
+
+static void perf_user_event_attr_set(struct perf_event_attr *attr, __u64 addr, __u64 len)
+{
+   memset(attr, 0, sizeof(struct perf_event_attr));
+   attr->type   = PERF_TYPE_BREAKPOINT;
+   attr->size   = sizeof(struct perf_event_attr);
+   attr->bp_type= HW_BREAKPOINT_R;
+   attr->bp_addr= addr;
+   attr->bp_len = len;
+   attr->exclude_kernel = 1;
+   attr->exclude_hv = 1;
+}
+
+static void perf_kernel_event_attr_set(struct perf_event_attr *attr)
+{
+   memset(attr, 0, sizeof(struct perf_event_attr));
+   attr->type   = PERF_

[PATCH 6/7] powerpc/selftests/perf-hwbreak: Add testcases for 2nd DAWR

2020-07-23 Thread Ravi Bangoria
Extend the perf-hwbreak.c selftest to test multiple DAWRs. Also add a
testcase for the removal of the 512-byte boundary restriction.

Sample o/p:
  # ./perf-hwbreak
  ...
  TESTED: Process specific, Two events, diff addr
  TESTED: Process specific, Two events, same addr
  TESTED: Process specific, Two events, diff addr, one is RO, other is WO
  TESTED: Process specific, Two events, same addr, one is RO, other is WO
  TESTED: Systemwide, Two events, diff addr
  TESTED: Systemwide, Two events, same addr
  TESTED: Systemwide, Two events, diff addr, one is RO, other is WO
  TESTED: Systemwide, Two events, same addr, one is RO, other is WO
  TESTED: Process specific, 512 bytes, unaligned
  success: perf_hwbreak
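
The "512 bytes, unaligned" case relies on ISA 3.1 lifting the requirement
that a DAWR range fit within a single 512-byte region. A rough sketch of
such an event (hypothetical values; error handling omitted):

  struct perf_event_attr attr;

  memset(&attr, 0, sizeof(attr));
  attr.type    = PERF_TYPE_BREAKPOINT;
  attr.size    = sizeof(attr);
  attr.bp_type = HW_BREAKPOINT_RW;
  attr.bp_addr = (__u64)&c[8];  /* range crosses a 512-byte boundary */
  attr.bp_len  = 512;
  fd = syscall(__NR_perf_event_open, &attr, getpid(), -1, -1, 0);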

Signed-off-by: Ravi Bangoria 
---
 .../selftests/powerpc/ptrace/perf-hwbreak.c   | 568 +-
 1 file changed, 567 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/powerpc/ptrace/perf-hwbreak.c b/tools/testing/selftests/powerpc/ptrace/perf-hwbreak.c
index bde475341c8a..5df08738884d 100644
--- a/tools/testing/selftests/powerpc/ptrace/perf-hwbreak.c
+++ b/tools/testing/selftests/powerpc/ptrace/perf-hwbreak.c
@@ -21,8 +21,13 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
+#include 
+#include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -34,6 +39,12 @@
 
 #define DAWR_LENGTH_MAX ((0x3f + 1) * 8)
 
+int nprocs;
+
+static volatile int a = 10;
+static volatile int b = 10;
+static volatile char c[512 + 8] __attribute__((aligned(512)));
+
 static void perf_event_attr_set(struct perf_event_attr *attr,
__u32 type, __u64 addr, __u64 len,
bool exclude_user)
@@ -68,6 +79,76 @@ static int perf_process_event_open(__u32 type, __u64 addr, __u64 len)
 return syscall(__NR_perf_event_open, &attr, getpid(), -1, -1, 0);
 }
 
+static int perf_cpu_event_open(long cpu, __u32 type, __u64 addr, __u64 len)
+{
+   struct perf_event_attr attr;
+
+   perf_event_attr_set(&attr, type, addr, len, 0);
+   return syscall(__NR_perf_event_open, &attr, -1, cpu, -1, 0);
+}
+
+static void close_fds(int *fd, int n)
+{
+   int i;
+
+   for (i = 0; i < n; i++)
+   close(fd[i]);
+}
+
+static unsigned long read_fds(int *fd, int n)
+{
+   int i;
+   unsigned long c = 0;
+   unsigned long count = 0;
+   size_t res;
+
+   for (i = 0; i < n; i++) {
+   res = read(fd[i], &c, sizeof(c));
+   assert(res == sizeof(unsigned long long));
+   count += c;
+   }
+   return count;
+}
+
+static void reset_fds(int *fd, int n)
+{
+   int i;
+
+   for (i = 0; i < n; i++)
+   ioctl(fd[i], PERF_EVENT_IOC_RESET);
+}
+
+static void enable_fds(int *fd, int n)
+{
+   int i;
+
+   for (i = 0; i < n; i++)
+   ioctl(fd[i], PERF_EVENT_IOC_ENABLE);
+}
+
+static void disable_fds(int *fd, int n)
+{
+   int i;
+
+   for (i = 0; i < n; i++)
+   ioctl(fd[i], PERF_EVENT_IOC_DISABLE);
+}
+
+static int perf_systemwide_event_open(int *fd, __u32 type, __u64 addr, __u64 len)
+{
+   int i = 0;
+
+   /* Assume online processors are 0 to nprocs-1 for simplicity */
+   for (i = 0; i < nprocs; i++) {
+   fd[i] = perf_cpu_event_open(i, type, addr, len);
+   if (fd[i] < 0) {
+   close_fds(fd, i);
+   return fd[i];
+   }
+   }
+   return 0;
+}
+
 static inline bool breakpoint_test(int len)
 {
int fd;
@@ -261,11 +342,483 @@ static int runtest_dar_outside(void)
return fail;
 }
 
+static void multi_dawr_workload(void)
+{
+   a += 10;
+   b += 10;
+   c[512 + 1] += 'a';
+}
+
+static int test_process_multi_diff_addr(void)
+{
+   unsigned long long breaks1 = 0, breaks2 = 0;
+   int fd1, fd2;
+   char *desc = "Process specific, Two events, diff addr";
+   size_t res;
+
+   fd1 = perf_process_event_open(HW_BREAKPOINT_RW, (__u64)&a, (__u64)sizeof(a));
+   if (fd1 < 0) {
+   perror("perf_process_event_open");
+   exit(EXIT_FAILURE);
+   }
+
+   fd2 = perf_process_event_open(HW_BREAKPOINT_RW, (__u64)&b, (__u64)sizeof(b));
+   if (fd2 < 0) {
+   close(fd1);
+   perror("perf_process_event_open");
+   exit(EXIT_FAILURE);
+   }
+
+   ioctl(fd1, PERF_EVENT_IOC_RESET);
+   ioctl(fd2, PERF_EVENT_IOC_RESET);
+   ioctl(fd1, PERF_EVENT_IOC_ENABLE);
+   ioctl(fd2, PERF_EVENT_IOC_ENABLE);
+   multi_dawr_workload();
+   ioctl(fd1, PERF_EVENT_IOC_DISABLE);
+   ioctl(fd2, PERF_EVENT_IOC_DISABLE);
+
+   res = read(fd1, &breaks1, sizeof(breaks1));
+   assert(res == sizeof(unsigned long long));
+   res = read(fd2, &breaks2, sizeof(breaks2));
+   assert(res == sizeof(unsigned long long));
+
+   close(fd1);
+   close(fd2);
+
+   if (breaks1 != 2 || breaks2 != 2

[PATCH 5/7] powerpc/selftests/perf-hwbreak: Coalesce event creation code

2020-07-23 Thread Ravi Bangoria
The perf-hwbreak selftest opens hw-breakpoint events at several places,
repeating the same setup code each time. Coalesce that code into helper
functions.

Signed-off-by: Ravi Bangoria 
---
 .../selftests/powerpc/ptrace/perf-hwbreak.c   | 78 +--
 1 file changed, 38 insertions(+), 40 deletions(-)

diff --git a/tools/testing/selftests/powerpc/ptrace/perf-hwbreak.c b/tools/testing/selftests/powerpc/ptrace/perf-hwbreak.c
index c1f324afdbf3..bde475341c8a 100644
--- a/tools/testing/selftests/powerpc/ptrace/perf-hwbreak.c
+++ b/tools/testing/selftests/powerpc/ptrace/perf-hwbreak.c
@@ -34,28 +34,46 @@
 
 #define DAWR_LENGTH_MAX ((0x3f + 1) * 8)
 
-static inline int sys_perf_event_open(struct perf_event_attr *attr, pid_t pid,
- int cpu, int group_fd,
- unsigned long flags)
+static void perf_event_attr_set(struct perf_event_attr *attr,
+   __u32 type, __u64 addr, __u64 len,
+   bool exclude_user)
 {
-   attr->size = sizeof(*attr);
-   return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
+   memset(attr, 0, sizeof(struct perf_event_attr));
+   attr->type   = PERF_TYPE_BREAKPOINT;
+   attr->size   = sizeof(struct perf_event_attr);
+   attr->bp_type= type;
+   attr->bp_addr= addr;
+   attr->bp_len = len;
+   attr->exclude_kernel = 1;
+   attr->exclude_hv = 1;
+   attr->exclude_guest  = 1;
+   attr->exclude_user   = exclude_user;
+   attr->disabled   = 1;
 }
 
-static inline bool breakpoint_test(int len)
+static int
+perf_process_event_open_exclude_user(__u32 type, __u64 addr, __u64 len, bool exclude_user)
 {
struct perf_event_attr attr;
+
+   perf_event_attr_set(&attr, type, addr, len, exclude_user);
+   return syscall(__NR_perf_event_open, &attr, getpid(), -1, -1, 0);
+}
+
+static int perf_process_event_open(__u32 type, __u64 addr, __u64 len)
+{
+   struct perf_event_attr attr;
+
+   perf_event_attr_set(&attr, type, addr, len, 0);
+   return syscall(__NR_perf_event_open, &attr, getpid(), -1, -1, 0);
+}
+
+static inline bool breakpoint_test(int len)
+{
int fd;
 
-   /* setup counters */
-   memset(&attr, 0, sizeof(attr));
-   attr.disabled = 1;
-   attr.type = PERF_TYPE_BREAKPOINT;
-   attr.bp_type = HW_BREAKPOINT_R;
/* bp_addr can point anywhere but needs to be aligned */
-   attr.bp_addr = (__u64)() & 0xfffffffffffff800;
-   attr.bp_len = len;
-   fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+   fd = perf_process_event_open(HW_BREAKPOINT_R, (__u64)() & 0xfffffffffffff800, len);
if (fd < 0)
return false;
close(fd);
@@ -75,7 +93,6 @@ static inline bool dawr_supported(void)
 static int runtestsingle(int readwriteflag, int exclude_user, int arraytest)
 {
int i,j;
-   struct perf_event_attr attr;
size_t res;
unsigned long long breaks, needed;
int readint;
@@ -94,19 +111,11 @@ static int runtestsingle(int readwriteflag, int exclude_user, int arraytest)
if (arraytest)
ptr = [0];
 
-   /* setup counters */
-   memset(&attr, 0, sizeof(attr));
-   attr.disabled = 1;
-   attr.type = PERF_TYPE_BREAKPOINT;
-   attr.bp_type = readwriteflag;
-   attr.bp_addr = (__u64)ptr;
-   attr.bp_len = sizeof(int);
-   if (arraytest)
-   attr.bp_len = DAWR_LENGTH_MAX;
-   attr.exclude_user = exclude_user;
-   break_fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+   break_fd = perf_process_event_open_exclude_user(readwriteflag, (__u64)ptr,
+   arraytest ? DAWR_LENGTH_MAX : sizeof(int),
+   exclude_user);
if (break_fd < 0) {
-   perror("sys_perf_event_open");
+   perror("perf_process_event_open_exclude_user");
exit(1);
}
 
@@ -153,7 +162,6 @@ static int runtest_dar_outside(void)
void *target;
volatile __u16 temp16;
volatile __u64 temp64;
-   struct perf_event_attr attr;
int break_fd;
unsigned long long breaks;
int fail = 0;
@@ -165,21 +173,11 @@ static int runtest_dar_outside(void)
exit(EXIT_FAILURE);
}
 
-   /* setup counters */
-   memset(&attr, 0, sizeof(attr));
-   attr.disabled = 1;
-   attr.type = PERF_TYPE_BREAKPOINT;
-   attr.exclude_kernel = 1;
-   attr.exclude_hv = 1;
-   attr.exclude_guest = 1;
-   attr.bp_type = HW_BREAKPOINT_RW;
/* watch middle half of target array */
-   attr.bp_addr = (__u64)(target + 2);
-   attr.bp_len = 4;
-   break_fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+   break_fd = perf_process_event_open(HW_BREAKPOINT_RW, (__u64)(target + 2), 4);
if (break_fd < 0) {
   

[PATCH 4/7] powerpc/selftests/ptrace-hwbreak: Add testcases for 2nd DAWR

2020-07-23 Thread Ravi Bangoria
Add selftests to test multiple active DAWRs with the ptrace interface.

Sample o/p:
  $ ./ptrace-hwbreak
  ...
  PPC_PTRACE_SETHWDEBUG 2, MODE_RANGE, DW ALIGNED, WO, len: 6: Ok
  PPC_PTRACE_SETHWDEBUG 2, MODE_RANGE, DW UNALIGNED, RO, len: 6: Ok
  PPC_PTRACE_SETHWDEBUG 2, MODE_RANGE, DAWR Overlap, WO, len: 6: Ok
  PPC_PTRACE_SETHWDEBUG 2, MODE_RANGE, DAWR Overlap, RO, len: 6: Ok
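
The new tests install two watchpoints through PPC_PTRACE_SETHWDEBUG at the
same time. A minimal sketch of one such request (hypothetical values; the
second watchpoint is installed the same way):

  struct ppc_hw_breakpoint info;
  int wh;

  memset(&info, 0, sizeof(info));
  info.version        = 1;
  info.trigger_type   = PPC_BREAKPOINT_TRIGGER_WRITE;
  info.addr_mode      = PPC_BREAKPOINT_MODE_RANGE_INCLUSIVE;
  info.condition_mode = PPC_BREAKPOINT_CONDITION_NONE;
  info.addr           = (__u64)&gstruct.a;
  info.addr2          = (__u64)&gstruct.a + 6;  /* len: 6, as above */
  wh = ptrace(PPC_PTRACE_SETHWDEBUG, child_pid, 0, &info);
  /* ... and later: ptrace(PPC_PTRACE_DELHWDEBUG, child_pid, 0, wh); */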

Signed-off-by: Ravi Bangoria 
---
 .../selftests/powerpc/ptrace/ptrace-hwbreak.c | 79 +++
 1 file changed, 79 insertions(+)

diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c b/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c
index fc477dfe86a2..65781f4035c1 100644
--- a/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c
@@ -185,6 +185,18 @@ static void test_workload(void)
big_var[rand() % DAWR_MAX_LEN] = 'a';
else
cvar = big_var[rand() % DAWR_MAX_LEN];
+
+   /* PPC_PTRACE_SETHWDEBUG 2, MODE_RANGE, DW ALIGNED, WO test */
+   gstruct.a[rand() % A_LEN] = 'a';
+
+   /* PPC_PTRACE_SETHWDEBUG 2, MODE_RANGE, DW UNALIGNED, RO test */
+   cvar = gstruct.b[rand() % B_LEN];
+
+   /* PPC_PTRACE_SETHWDEBUG 2, MODE_RANGE, DAWR Overlap, WO test */
+   gstruct.a[rand() % A_LEN] = 'a';
+
+   /* PPC_PTRACE_SETHWDEBUG 2, MODE_RANGE, DAWR Overlap, RO test */
+   cvar = gstruct.a[rand() % A_LEN];
 }
 
 static void check_success(pid_t child_pid, const char *name, const char *type,
@@ -374,6 +386,69 @@ static void test_sethwdebug_range_aligned(pid_t child_pid)
ptrace_delhwdebug(child_pid, wh);
 }
 
+static void test_multi_sethwdebug_range(pid_t child_pid)
+{
+   struct ppc_hw_breakpoint info1, info2;
+   unsigned long wp_addr1, wp_addr2;
+   char *name1 = "PPC_PTRACE_SETHWDEBUG 2, MODE_RANGE, DW ALIGNED";
+   char *name2 = "PPC_PTRACE_SETHWDEBUG 2, MODE_RANGE, DW UNALIGNED";
+   int len1, len2;
+   int wh1, wh2;
+
+   wp_addr1 = (unsigned long)&gstruct.a;
+   wp_addr2 = (unsigned long)&gstruct.b;
+   len1 = A_LEN;
+   len2 = B_LEN;
+   get_ppc_hw_breakpoint(&info1, PPC_BREAKPOINT_TRIGGER_WRITE, wp_addr1, len1);
+   get_ppc_hw_breakpoint(&info2, PPC_BREAKPOINT_TRIGGER_READ, wp_addr2, len2);
+
+   /* PPC_PTRACE_SETHWDEBUG 2, MODE_RANGE, DW ALIGNED, WO test */
+   wh1 = ptrace_sethwdebug(child_pid, &info1);
+
+   /* PPC_PTRACE_SETHWDEBUG 2, MODE_RANGE, DW UNALIGNED, RO test */
+   wh2 = ptrace_sethwdebug(child_pid, &info2);
+
+   ptrace(PTRACE_CONT, child_pid, NULL, 0);
+   check_success(child_pid, name1, "WO", wp_addr1, len1);
+
+   ptrace(PTRACE_CONT, child_pid, NULL, 0);
+   check_success(child_pid, name2, "RO", wp_addr2, len2);
+
+   ptrace_delhwdebug(child_pid, wh1);
+   ptrace_delhwdebug(child_pid, wh2);
+}
+
+static void test_multi_sethwdebug_range_dawr_overlap(pid_t child_pid)
+{
+   struct ppc_hw_breakpoint info1, info2;
+   unsigned long wp_addr1, wp_addr2;
+   char *name = "PPC_PTRACE_SETHWDEBUG 2, MODE_RANGE, DAWR Overlap";
+   int len1, len2;
+   int wh1, wh2;
+
+   wp_addr1 = (unsigned long)&gstruct.a;
+   wp_addr2 = (unsigned long)&gstruct.a;
+   len1 = A_LEN;
+   len2 = A_LEN;
+   get_ppc_hw_breakpoint(&info1, PPC_BREAKPOINT_TRIGGER_WRITE, wp_addr1, len1);
+   get_ppc_hw_breakpoint(&info2, PPC_BREAKPOINT_TRIGGER_READ, wp_addr2, len2);
+
+   /* PPC_PTRACE_SETHWDEBUG 2, MODE_RANGE, DAWR Overlap, WO test */
+   wh1 = ptrace_sethwdebug(child_pid, &info1);
+
+   /* PPC_PTRACE_SETHWDEBUG 2, MODE_RANGE, DAWR Overlap, RO test */
+   wh2 = ptrace_sethwdebug(child_pid, &info2);
+
+   ptrace(PTRACE_CONT, child_pid, NULL, 0);
+   check_success(child_pid, name, "WO", wp_addr1, len1);
+
+   ptrace(PTRACE_CONT, child_pid, NULL, 0);
+   check_success(child_pid, name, "RO", wp_addr2, len2);
+
+   ptrace_delhwdebug(child_pid, wh1);
+   ptrace_delhwdebug(child_pid, wh2);
+}
+
 static void test_sethwdebug_range_unaligned(pid_t child_pid)
 {
struct ppc_hw_breakpoint info;
@@ -460,6 +535,10 @@ run_tests(pid_t child_pid, struct ppc_debug_info *dbginfo, bool dawr)
test_sethwdebug_range_unaligned(child_pid);
test_sethwdebug_range_unaligned_dar(child_pid);
test_sethwdebug_dawr_max_range(child_pid);
+   if (dbginfo->num_data_bps > 1) {
+   test_multi_sethwdebug_range(child_pid);
+   test_multi_sethwdebug_range_dawr_overlap(child_pid);
+   }
}
}
 }
-- 
2.26.2



[PATCH 1/7] powerpc/watchpoint/kvm: Rename current DAWR macros and variables

2020-07-23 Thread Ravi Bangoria
Power10 introduces a second DAWR. Use the real register names from the
ISA (with suffix 0) for the existing macros and variables used by kvm.

Signed-off-by: Ravi Bangoria 
---
 Documentation/virt/kvm/api.rst|  4 +--
 arch/powerpc/include/asm/kvm_host.h   |  4 +--
 arch/powerpc/include/uapi/asm/kvm.h   |  4 +--
 arch/powerpc/kernel/asm-offsets.c |  4 +--
 arch/powerpc/kvm/book3s_hv.c  | 32 +++
 arch/powerpc/kvm/book3s_hv_nested.c   |  8 +++---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   | 20 +++---
 tools/arch/powerpc/include/uapi/asm/kvm.h |  4 +--
 8 files changed, 40 insertions(+), 40 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 426f94582b7a..4dc18fe6a2bf 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -2219,8 +2219,8 @@ registers, find a list below:
   PPC KVM_REG_PPC_BESCR   64
   PPC KVM_REG_PPC_TAR 64
   PPC KVM_REG_PPC_DPDES   64
-  PPC KVM_REG_PPC_DAWR64
-  PPC KVM_REG_PPC_DAWRX   64
+  PPC KVM_REG_PPC_DAWR0   64
+  PPC KVM_REG_PPC_DAWRX0  64
   PPC KVM_REG_PPC_CIABR   64
   PPC KVM_REG_PPC_IC  64
   PPC KVM_REG_PPC_VTB 64
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 7e2d061d0445..9aa3854f0e1e 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -582,8 +582,8 @@ struct kvm_vcpu_arch {
u32 ctrl;
u32 dabrx;
ulong dabr;
-   ulong dawr;
-   ulong dawrx;
+   ulong dawr0;
+   ulong dawrx0;
ulong ciabr;
ulong cfar;
ulong ppr;
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index 264e266a85bf..38d61b73f5ed 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -608,8 +608,8 @@ struct kvm_ppc_cpu_char {
 #define KVM_REG_PPC_BESCR  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa7)
 #define KVM_REG_PPC_TAR(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa8)
 #define KVM_REG_PPC_DPDES  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xa9)
-#define KVM_REG_PPC_DAWR   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xaa)
-#define KVM_REG_PPC_DAWRX  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xab)
+#define KVM_REG_PPC_DAWR0  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xaa)
+#define KVM_REG_PPC_DAWRX0 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xab)
 #define KVM_REG_PPC_CIABR  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xac)
 #define KVM_REG_PPC_IC (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xad)
 #define KVM_REG_PPC_VTB(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xae)
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 6657dc6b2336..e76bffe348e1 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -547,8 +547,8 @@ int main(void)
OFFSET(VCPU_CTRL, kvm_vcpu, arch.ctrl);
OFFSET(VCPU_DABR, kvm_vcpu, arch.dabr);
OFFSET(VCPU_DABRX, kvm_vcpu, arch.dabrx);
-   OFFSET(VCPU_DAWR, kvm_vcpu, arch.dawr);
-   OFFSET(VCPU_DAWRX, kvm_vcpu, arch.dawrx);
+   OFFSET(VCPU_DAWR0, kvm_vcpu, arch.dawr0);
+   OFFSET(VCPU_DAWRX0, kvm_vcpu, arch.dawrx0);
OFFSET(VCPU_CIABR, kvm_vcpu, arch.ciabr);
OFFSET(VCPU_HFLAGS, kvm_vcpu, arch.hflags);
OFFSET(VCPU_DEC, kvm_vcpu, arch.dec);
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 89afcc5f60ca..28200e4f5d27 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -778,8 +778,8 @@ static int kvmppc_h_set_mode(struct kvm_vcpu *vcpu, unsigned long mflags,
return H_UNSUPPORTED_FLAG_START;
if (value2 & DABRX_HYP)
return H_P4;
-   vcpu->arch.dawr  = value1;
-   vcpu->arch.dawrx = value2;
+   vcpu->arch.dawr0  = value1;
+   vcpu->arch.dawrx0 = value2;
return H_SUCCESS;
case H_SET_MODE_RESOURCE_ADDR_TRANS_MODE:
/* KVM does not support mflags=2 (AIL=2) */
@@ -1724,11 +1724,11 @@ static int kvmppc_get_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
case KVM_REG_PPC_VTB:
*val = get_reg_val(id, vcpu->arch.vcore->vtb);
break;
-   case KVM_REG_PPC_DAWR:
-   *val = get_reg_val(id, vcpu->arch.dawr);
+   case KVM_REG_PPC_DAWR0:
+   *val = get_reg_val(id, vcpu->arch.dawr0);
break;
-   case KVM_REG_PPC_DAWRX:
-   *val = get_reg_val(id, vcpu->arch.dawrx);
+   case KVM_REG_PPC_DAWRX0:
+   *val = get_reg_val(id, vcpu->arch.dawrx0);
break;
case KVM_REG_PPC_CIABR:
*val = get_reg_val(

[PATCH 3/7] powerpc/watchpoint/kvm: Introduce new capability for 2nd DAWR

2020-07-23 Thread Ravi Bangoria
Introduce KVM_CAP_PPC_DAWR1, which can be used by Qemu to query whether
kvm supports the 2nd DAWR.
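
A sketch of the intended usage from userspace (roughly what Qemu would do;
assumes an open /dev/kvm fd, error handling omitted):

  int ret = ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_PPC_DAWR1);
  if (ret > 0)
          /* 2nd DAWR supported; safe to expose it to the guest */;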

Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/kvm/powerpc.c | 3 +++
 include/uapi/linux/kvm.h   | 1 +
 2 files changed, 4 insertions(+)

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index dd7d141e33e8..f38380fd1fe9 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -676,6 +676,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
!kvmppc_hv_ops->enable_svm(NULL);
break;
 #endif
+   case KVM_CAP_PPC_DAWR1:
+   r = cpu_has_feature(CPU_FTR_DAWR1);
+   break;
default:
r = 0;
break;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 4fdf30316582..2c3713d6526a 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1031,6 +1031,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_PPC_SECURE_GUEST 181
 #define KVM_CAP_HALT_POLL 182
 #define KVM_CAP_ASYNC_PF_INT 183
+#define KVM_CAP_PPC_DAWR1 184
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.26.2



[PATCH 2/7] powerpc/watchpoint/kvm: Add infrastructure to support 2nd DAWR

2020-07-23 Thread Ravi Bangoria
kvm code assumes a single DAWR everywhere. Add code to support a 2nd DAWR.
DAWR is a hypervisor resource, so the H_SET_MODE hcall is used to set/unset
it. Introduce a new case, H_SET_MODE_RESOURCE_SET_DAWR1, for the 2nd DAWR.
Also, kvm will support the 2nd DAWR only if CPU_FTR_DAWR1 is set.
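
From the guest side, programming the 2nd DAWR then looks much like the
existing DAWR0 path, just with the new resource id. A sketch with
hypothetical value names:

  /* H_SET_MODE(mflags, resource, value1 = DAWR1, value2 = DAWRX1) */
  rc = plpar_hcall_norets(H_SET_MODE, 0, H_SET_MODE_RESOURCE_SET_DAWR1,
                          dawr1_val, dawrx1_val);
  if (rc != H_SUCCESS)
          /* e.g. H_P2 on hypervisors without 2nd DAWR support */;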

Signed-off-by: Ravi Bangoria 
---
 Documentation/virt/kvm/api.rst|  2 ++
 arch/powerpc/include/asm/hvcall.h |  2 ++
 arch/powerpc/include/asm/kvm_host.h   |  2 ++
 arch/powerpc/include/uapi/asm/kvm.h   |  4 +++
 arch/powerpc/kernel/asm-offsets.c |  2 ++
 arch/powerpc/kvm/book3s_hv.c  | 41 +++
 arch/powerpc/kvm/book3s_hv_nested.c   |  7 
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   | 23 +
 tools/arch/powerpc/include/uapi/asm/kvm.h |  4 +++
 9 files changed, 87 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 4dc18fe6a2bf..7b1d16c2ad24 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -2242,6 +2242,8 @@ registers, find a list below:
   PPC KVM_REG_PPC_PSSCR   64
   PPC KVM_REG_PPC_DEC_EXPIRY  64
   PPC KVM_REG_PPC_PTCR64
+  PPC KVM_REG_PPC_DAWR1   64
+  PPC KVM_REG_PPC_DAWRX1  64
   PPC KVM_REG_PPC_TM_GPR0 64
   ...
   PPC KVM_REG_PPC_TM_GPR3164
diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index 33793444144c..03f401d7be41 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -538,6 +538,8 @@ struct hv_guest_state {
s64 tb_offset;
u64 dawr0;
u64 dawrx0;
+   u64 dawr1;
+   u64 dawrx1;
u64 ciabr;
u64 hdec_expiry;
u64 purr;
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 9aa3854f0e1e..bda839edd5fe 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -584,6 +584,8 @@ struct kvm_vcpu_arch {
ulong dabr;
ulong dawr0;
ulong dawrx0;
+   ulong dawr1;
+   ulong dawrx1;
ulong ciabr;
ulong cfar;
ulong ppr;
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index 38d61b73f5ed..c5c0f128b46f 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -640,6 +640,10 @@ struct kvm_ppc_cpu_char {
 #define KVM_REG_PPC_ONLINE (KVM_REG_PPC | KVM_REG_SIZE_U32 | 0xbf)
 #define KVM_REG_PPC_PTCR   (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xc0)
 
+/* POWER10 registers. */
+#define KVM_REG_PPC_DAWR1  (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xc1)
+#define KVM_REG_PPC_DAWRX1 (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0xc2)
+
 /* Transactional Memory checkpointed state:
  * This is all GPRs, all VSX regs and a subset of SPRs
  */
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index e76bffe348e1..ef2c0f3f5a7b 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -549,6 +549,8 @@ int main(void)
OFFSET(VCPU_DABRX, kvm_vcpu, arch.dabrx);
OFFSET(VCPU_DAWR0, kvm_vcpu, arch.dawr0);
OFFSET(VCPU_DAWRX0, kvm_vcpu, arch.dawrx0);
+   OFFSET(VCPU_DAWR1, kvm_vcpu, arch.dawr1);
+   OFFSET(VCPU_DAWRX1, kvm_vcpu, arch.dawrx1);
OFFSET(VCPU_CIABR, kvm_vcpu, arch.ciabr);
OFFSET(VCPU_HFLAGS, kvm_vcpu, arch.hflags);
OFFSET(VCPU_DEC, kvm_vcpu, arch.dec);
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 28200e4f5d27..24575520b2ea 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -781,6 +781,20 @@ static int kvmppc_h_set_mode(struct kvm_vcpu *vcpu, unsigned long mflags,
vcpu->arch.dawr0  = value1;
vcpu->arch.dawrx0 = value2;
return H_SUCCESS;
+   case H_SET_MODE_RESOURCE_SET_DAWR1:
+   if (!kvmppc_power8_compatible(vcpu))
+   return H_P2;
+   if (!ppc_breakpoint_available())
+   return H_P2;
+   if (!cpu_has_feature(CPU_FTR_DAWR1))
+   return H_P2;
+   if (mflags)
+   return H_UNSUPPORTED_FLAG_START;
+   if (value2 & DABRX_HYP)
+   return H_P4;
+   vcpu->arch.dawr1  = value1;
+   vcpu->arch.dawrx1 = value2;
+   return H_SUCCESS;
case H_SET_MODE_RESOURCE_ADDR_TRANS_MODE:
/* KVM does not support mflags=2 (AIL=2) */
if (mflags != 0 && mflags != 3)
@@ -1730,6 +1744,12 @@ static int kvmppc_get_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
case KVM_REG_PPC_DAWRX0:
*val = get_reg_val(id, vcpu->arch.dawrx0);
break;
+   case KVM_REG_PPC_DAWR1:
+ 

[PATCH 0/7] powerpc/watchpoint: 2nd DAWR kvm enablement + selftests

2020-07-23 Thread Ravi Bangoria
Patches #1, #2 and #3 enable the p10 2nd DAWR feature for Book3S kvm guests.
DAWR is a hypervisor resource, so the H_SET_MODE hcall is used to set/unset
it. A new case, H_SET_MODE_RESOURCE_SET_DAWR1, is introduced in the
H_SET_MODE hcall for setting/unsetting the 2nd DAWR. Also, a new capability,
KVM_CAP_PPC_DAWR1, has been added to query 2nd DAWR support via kvm ioctl.

This feature also needs to be enabled in Qemu to actually use it. I'll reply
with a link to the qemu patches once I post them to the qemu-devel mailing list.

Patch #4, #5, #6 and #7 adds selftests to test 2nd DAWR.

Dependency:
  1: p10 kvm base enablement
 
https://lore.kernel.org/linuxppc-dev/20200602055325.6102-1-alist...@popple.id.au

  2: 2nd DAWR powervm/baremetal enablement
 
https://lore.kernel.org/linuxppc-dev/20200723090813.303838-1-ravi.bango...@linux.ibm.com

  3: ptrace PPC_DEBUG_FEATURE_DATA_BP_DAWR_ARCH_31 flag
 
https://lore.kernel.org/linuxppc-dev/20200723093330.306341-1-ravi.bango...@linux.ibm.com

Patches in this series apply cleanly on top of powerpc/next (9a77c4a0a125)
plus the above dependency patches.

Ravi Bangoria (7):
  powerpc/watchpoint/kvm: Rename current DAWR macros and variables
  powerpc/watchpoint/kvm: Add infrastructure to support 2nd DAWR
  powerpc/watchpoint/kvm: Introduce new capability for 2nd DAWR
  powerpc/selftests/ptrace-hwbreak: Add testcases for 2nd DAWR
  powerpc/selftests/perf-hwbreak: Coalesce event creation code
  powerpc/selftests/perf-hwbreak: Add testcases for 2nd DAWR
  powerpc/selftests: Add selftest to test concurrent perf/ptrace events

 Documentation/virt/kvm/api.rst|   6 +-
 arch/powerpc/include/asm/hvcall.h |   2 +
 arch/powerpc/include/asm/kvm_host.h   |   6 +-
 arch/powerpc/include/uapi/asm/kvm.h   |   8 +-
 arch/powerpc/kernel/asm-offsets.c |   6 +-
 arch/powerpc/kvm/book3s_hv.c  |  73 +-
 arch/powerpc/kvm/book3s_hv_nested.c   |  15 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   |  43 +-
 arch/powerpc/kvm/powerpc.c|   3 +
 include/uapi/linux/kvm.h  |   1 +
 tools/arch/powerpc/include/uapi/asm/kvm.h |   8 +-
 .../selftests/powerpc/ptrace/.gitignore   |   1 +
 .../testing/selftests/powerpc/ptrace/Makefile |   2 +-
 .../selftests/powerpc/ptrace/perf-hwbreak.c   | 646 +++--
 .../selftests/powerpc/ptrace/ptrace-hwbreak.c |  79 +++
 .../powerpc/ptrace/ptrace-perf-hwbreak.c  | 659 ++
 16 files changed, 1476 insertions(+), 82 deletions(-)
 create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-perf-hwbreak.c

-- 
2.26.2



[PATCH v2] powerpc/watchpoint/ptrace: Introduce PPC_DEBUG_FEATURE_DATA_BP_DAWR_ARCH_31

2020-07-23 Thread Ravi Bangoria
PPC_DEBUG_FEATURE_DATA_BP_DAWR_ARCH_31 can be used to determine
whether we are running on an ISA 3.1 compliant machine, which is
needed to determine DAR behaviour, the 512-byte boundary limit etc.
This was requested by Pedro Miraglia Franco de Carvalho for
extending watchpoint features in gdb. Note that availability of
the 2nd DAWR is independent of this flag and should be checked using
ppc_debug_info->num_data_bps.
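
A debugger would consume the new flag roughly as follows (sketch; error
handling omitted):

  struct ppc_debug_info dbginfo;

  ptrace(PPC_PTRACE_GETHWDBGINFO, child_pid, NULL, &dbginfo);
  if (dbginfo.features & PPC_DEBUG_FEATURE_DATA_BP_DAWR_ARCH_31)
          /* ISA 3.1 DAWR: new DAR behaviour, no 512 byte limit */;
  if (dbginfo.num_data_bps > 1)
          /* 2nd DAWR available, independent of the flag above */;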

Signed-off-by: Ravi Bangoria 
---
v1->v2:
 - Mention new flag in Documentaion/ as well.

 Documentation/powerpc/ptrace.rst  | 1 +
 arch/powerpc/include/uapi/asm/ptrace.h| 1 +
 arch/powerpc/kernel/ptrace/ptrace-noadv.c | 5 -
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/Documentation/powerpc/ptrace.rst b/Documentation/powerpc/ptrace.rst
index 864d4b61..4d42290248cb 100644
--- a/Documentation/powerpc/ptrace.rst
+++ b/Documentation/powerpc/ptrace.rst
@@ -46,6 +46,7 @@ features will have bits indicating whether there is support for::
   #define PPC_DEBUG_FEATURE_DATA_BP_RANGE  0x4
   #define PPC_DEBUG_FEATURE_DATA_BP_MASK   0x8
   #define PPC_DEBUG_FEATURE_DATA_BP_DAWR   0x10
+  #define PPC_DEBUG_FEATURE_DATA_BP_DAWR_ARCH_31   0x20
 
 2. PTRACE_SETHWDEBUG
 
diff --git a/arch/powerpc/include/uapi/asm/ptrace.h b/arch/powerpc/include/uapi/asm/ptrace.h
index f5f1ccc740fc..0a87bcd4300a 100644
--- a/arch/powerpc/include/uapi/asm/ptrace.h
+++ b/arch/powerpc/include/uapi/asm/ptrace.h
@@ -222,6 +222,7 @@ struct ppc_debug_info {
 #define PPC_DEBUG_FEATURE_DATA_BP_RANGE	0x0000000000000004
 #define PPC_DEBUG_FEATURE_DATA_BP_MASK		0x0000000000000008
 #define PPC_DEBUG_FEATURE_DATA_BP_DAWR		0x0000000000000010
+#define PPC_DEBUG_FEATURE_DATA_BP_DAWR_ARCH_31	0x0000000000000020
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/powerpc/kernel/ptrace/ptrace-noadv.c b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
index 697c7e4b5877..b2de874d650b 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-noadv.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-noadv.c
@@ -52,8 +52,11 @@ void ppc_gethwdinfo(struct ppc_debug_info *dbginfo)
dbginfo->sizeof_condition = 0;
if (IS_ENABLED(CONFIG_HAVE_HW_BREAKPOINT)) {
dbginfo->features = PPC_DEBUG_FEATURE_DATA_BP_RANGE;
-   if (dawr_enabled())
+   if (dawr_enabled()) {
dbginfo->features |= PPC_DEBUG_FEATURE_DATA_BP_DAWR;
+   if (cpu_has_feature(CPU_FTR_ARCH_31))
+   dbginfo->features |= PPC_DEBUG_FEATURE_DATA_BP_DAWR_ARCH_31;
+   }
} else {
dbginfo->features = 0;
}
-- 
2.26.2


