Re: [PATCH v2 1/3] powerpc/eeh: Ignore error handlers in eeh_pe_reset_and_recover()

2016-04-25 Thread David Gibson
On Fri, Apr 22, 2016 at 11:28:02PM +1000, Gavin Shan wrote:
> The function eeh_pe_reset_and_recover() is used to recover EEH
> error when the passthrough device are transferred to guest and
> backwards, meaning the device's driver is vfio-pci or none.
> When the driver is vfio-pci that provides error_detected() error
> handler only, the handler simply stops the guest and it's not
> expected behaviour. On the other hand, no error handlers will
> be called if we don't have a bound driver.
> 
> This ignores all error handlers provided by device driver in
> eeh_pe_reset_and_recover() to avoid the exceptional behaviour.
> 
> Fixes: 5cfb20b9 ("powerpc/eeh: Emulate EEH recovery for VFIO devices")
> Cc: sta...@vger.kernel.org #v3.18+
> Signed-off-by: Gavin Shan 
> Reviewed-by: Russell Currey 
> ---
>  arch/powerpc/kernel/eeh_driver.c | 11 +--
>  1 file changed, 1 insertion(+), 10 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/eeh_driver.c 
> b/arch/powerpc/kernel/eeh_driver.c
> index fb6207d..1c7d703 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -552,7 +552,7 @@ static int eeh_clear_pe_frozen_state(struct eeh_pe *pe,
>  
>  int eeh_pe_reset_and_recover(struct eeh_pe *pe)
>  {
> - int result, ret;
> + int ret;
>  
>   /* Bail if the PE is being recovered */
>   if (pe->state & EEH_PE_RECOVERING)
> @@ -564,9 +564,6 @@ int eeh_pe_reset_and_recover(struct eeh_pe *pe)
>   /* Save states */
>   eeh_pe_dev_traverse(pe, eeh_dev_save_state, NULL);
>  
> - /* Report error */
> - eeh_pe_dev_traverse(pe, eeh_report_error, );

Ok, so after chatting to Gavin, I've made sense of this.  The basic
thing here is that eeh_pe_reset_and_recover() should be discarding any
errors from before the reset, not reporting them - the whole point is
that we know things have gone bad, and we want to clear back to a good
state.

>   /* Issue reset */
>   ret = eeh_reset_pe(pe);
>   if (ret) {
> @@ -581,15 +578,9 @@ int eeh_pe_reset_and_recover(struct eeh_pe *pe)
>   return ret;
>   }
>  
> - /* Notify completion of reset */
> - eeh_pe_dev_traverse(pe, eeh_report_reset, );

However, it's not clear if removing the report of a reset makes sense.
There are no current users of reset notification IIUC, but if we're
going to remove the reset reporting, we should put that in a separate
patch with its own justification, and remove the other caller as well.

>   /* Restore device state */
>   eeh_pe_dev_traverse(pe, eeh_dev_restore_state, NULL);
>  
> - /* Resume */
> - eeh_pe_dev_traverse(pe, eeh_report_resume, NULL);

And I'm not sure if it makes sense to remove the resume notification either.

>   /* Clear recovery mode */
>   eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
>  

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC v6 04/10] PCI: Add support for enforcing all MMIO BARs to be page aligned

2016-04-25 Thread Alexey Kardashevskiy

On 04/18/2016 08:56 PM, Yongji Xie wrote:

When vfio passthrough a PCI device of which MMIO BARs are
smaller than PAGE_SIZE, guest will not handle the mmio
accesses to the BARs which leads to mmio emulations in host.

This is because vfio will not allow to passthrough one BAR's
mmio page which may be shared with other BARs. Otherwise,
there will be a backdoor that guest can use to access BARs
of other guest.

To solve this issue, this patch modifies resource_alignment
to support syntax where multiple devices get the same
alignment. So we can use something like
"pci=resource_alignment=*:*:*.*:noresize" to enforce the
alignment of all MMIO BARs to be at least PAGE_SIZE so that
one BAR's mmio page would not be shared with other BARs.

And we also define a macro PCIBIOS_MIN_ALIGNMENT to enable this
automatically on PPC64 platform which can easily hit this issue
because its PAGE_SIZE is 64KB.
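
The page-sharing problem described above can be sketched in user-space C.
This is only an illustration: align_up() and regions_share_page() are
hypothetical helpers, not functions from drivers/pci/pci.c, and the
addresses used with them are made up. Two small BARs packed next to each
other land in the same 64KB page; aligning each BAR start to the page
size separates them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* True if two non-empty regions touch at least one common page. */
static bool regions_share_page(uint64_t a_start, uint64_t a_size,
                               uint64_t b_start, uint64_t b_size,
                               uint64_t page_size)
{
    assert(a_size > 0 && b_size > 0 && page_size > 0);

    uint64_t a_first = a_start / page_size;
    uint64_t a_last = (a_start + a_size - 1) / page_size;
    uint64_t b_first = b_start / page_size;
    uint64_t b_last = (b_start + b_size - 1) / page_size;

    /* The page ranges [a_first, a_last] and [b_first, b_last] overlap. */
    return a_first <= b_last && b_first <= a_last;
}
```

With a 64KB page, two 4KB BARs at 0x1000 and 0x2000 share page 0, while
moving the second one to align_up(0x2000, 0x10000) puts it in a page of
its own.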

Signed-off-by: Yongji Xie 
---
  Documentation/kernel-parameters.txt |2 ++
  arch/powerpc/include/asm/pci.h  |2 ++
  drivers/pci/pci.c   |   64 +--
  3 files changed, 57 insertions(+), 11 deletions(-)

diff --git a/Documentation/kernel-parameters.txt 
b/Documentation/kernel-parameters.txt
index d8b29ab..542be4a 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2918,6 +2918,8 @@ bytes respectively. Such letter suffixes can also be 
entirely omitted.
aligned memory resources.
If  is not specified,
PAGE_SIZE is used as alignment.
+   , ,  and  can be set to
+   "*" which means match all values.
PCI-PCI bridge can be specified, if resource
windows need to be expanded.
noresize: Don't change the resources' sizes when
diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h
index 6f8065a..78f230f 100644
--- a/arch/powerpc/include/asm/pci.h
+++ b/arch/powerpc/include/asm/pci.h
@@ -30,6 +30,8 @@
  #define PCIBIOS_MIN_IO0x1000
  #define PCIBIOS_MIN_MEM   0x1000

+#define PCIBIOS_MIN_ALIGNMENT  PAGE_SIZE
+
  struct pci_dev;

  /* Values for the `which' argument to sys_pciconfig_iobase syscall.  */
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 7564ccc..0381c28 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4605,7 +4605,12 @@ static resource_size_t 
pci_specified_resource_alignment(struct pci_dev *dev,
int seg, bus, slot, func, align_order, count;
resource_size_t align = 0;
char *p;
+   bool invalid = false;

+#ifdef PCIBIOS_MIN_ALIGNMENT
+   align = PCIBIOS_MIN_ALIGNMENT;
+   *resize = false;
+#endif
spin_lock(_alignment_lock);
p = resource_alignment_param;
while (*p) {
@@ -4622,16 +4627,49 @@ static resource_size_t 
pci_specified_resource_alignment(struct pci_dev *dev,
} else {
align_order = -1;
}
-   if (sscanf(p, "%x:%x:%x.%x%n",
-   , , , , ) != 4) {



I'd replace the above lines with:

char segstr[5] = "*", busstr[3] = "*";
char slotstr[3] = "*", funstr[2] = "*";

if (sscanf(p, "%4[^:]:%2[^:]:%2[^.].%1s%n",
, , , , ) != 4) {


and add some wrapper like:

static bool glob_match_hex(char const *pat, int val)
{
char valstr[5]; /* 5 should be enough for PCI */
snprintf(valstr, sizeof(valstr) - 1, "%4x", val);
return glob_match(pat, valstr);
}

and then use glob_match_hex() (or make a wrapper like above on top of 
fnmatch()), this would enable better mask handling.


If anyone finds this useful (which I am not sure about).
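
A minimal user-space sketch of the idea above, assuming a specifier of
the form seg:bus:slot.func where each field is either a hex number or
"*". The names field_matches() and bdf_spec_matches() are hypothetical,
and the matching is deliberately simpler than a real glob: only a lone
"*" is treated as a wildcard, everything else must be a clean hex value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: does one field pattern ("*" or hex) match val? */
static bool field_matches(const char *pat, unsigned int val)
{
    char *end;
    unsigned long want;

    assert(pat != NULL);
    if (strcmp(pat, "*") == 0)
        return true;            /* wildcard matches every value */

    want = strtoul(pat, &end, 16);
    if (end == pat || *end != '\0')
        return false;           /* not a clean hex number */
    return want == val;
}

/*
 * Hypothetical helper: match "seg:bus:slot.func" against a device
 * address. Returns false if the specifier is malformed.
 */
static bool bdf_spec_matches(const char *spec, unsigned int seg,
                             unsigned int bus, unsigned int slot,
                             unsigned int func)
{
    char segstr[5], busstr[3], slotstr[3], funcstr[2];
    int count = 0;              /* characters consumed, set by %n */

    assert(spec != NULL);
    if (sscanf(spec, "%4[^:]:%2[^:]:%2[^.].%1s%n",
               segstr, busstr, slotstr, funcstr, &count) != 4)
        return false;

    return field_matches(segstr, seg) && field_matches(busstr, bus) &&
           field_matches(slotstr, slot) && field_matches(funcstr, func);
}
```

Parsing every field as a short string first keeps the wildcard and the
numeric case on one code path, which is the main attraction of the
scanset approach over chained "%x" conversions.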





+   if (p[0] == '*' && p[1] == ':') {
+   seg = -1;
+   count = 1;
+   } else if (sscanf(p, "%x%n", , ) != 1 ||
+   p[count] != ':') {
+   invalid = true;
+   break;
+   }
+   p += count + 1;
+   if (*p == '*') {
+   bus = -1;
+   count = 1;
+   } else if (sscanf(p, "%x%n", , ) != 1) {
+   invalid = true;
+   break;
+   }
+   p += count;
+   if (*p == '.') {
+   slot = bus;
+   bus = seg;
seg = 0;
-   if (sscanf(p, "%x:%x.%x%n",
-   , , , ) != 3) {
-   /* Invalid format */
-   printk(KERN_ERR "PCI: Can't parse resource_alignment 
parameter: %s\n",
-   

Re: [PATCH] powerpc/eeh: fix misleading indentation

2016-04-25 Thread Russell Currey
On Tue, 2016-04-26 at 15:02 +1000, Andrew Donnellan wrote:
> Found by smatch.
> 
> Signed-off-by: Andrew Donnellan 

Acked-by: Russell Currey 

[PATCH] powerpc/eeh: fix misleading indentation

2016-04-25 Thread Andrew Donnellan
Found by smatch.

Signed-off-by: Andrew Donnellan 
---
 arch/powerpc/kernel/eeh_pe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
index eea48d8..f0520da 100644
--- a/arch/powerpc/kernel/eeh_pe.c
+++ b/arch/powerpc/kernel/eeh_pe.c
@@ -249,7 +249,7 @@ static void *__eeh_pe_get(void *data, void *flag)
} else {
if (edev->pe_config_addr &&
(edev->pe_config_addr == pe->addr))
-   return pe;
+   return pe;
}
 
/* Try BDF address */
-- 
Andrew Donnellan  OzLabs, ADL Canberra
andrew.donnel...@au1.ibm.com  IBM Australia Limited


[PATCH] powerpc: disable sparse for lib/xor_vmx.c

2016-04-25 Thread Daniel Axtens
Sparse doesn't seem to be passing -maltivec around properly, leading
to lots of errors:

.../include/altivec.h:34:2: error: Use the "-maltivec" flag to enable PowerPC 
AltiVec support
arch/powerpc/lib/xor_vmx.c:27:16: error: Expected ; at end of declaration
arch/powerpc/lib/xor_vmx.c:27:16: error: got signed
arch/powerpc/lib/xor_vmx.c:60:9: error: No right hand side of '*'-expression
arch/powerpc/lib/xor_vmx.c:60:9: error: Expected ; at end of statement
arch/powerpc/lib/xor_vmx.c:60:9: error: got v1_in
...
arch/powerpc/lib/xor_vmx.c:87:9: error: too many errors

Disable sparse checking for xor_vmx.c.

Signed-off-by: Daniel Axtens 
---
 arch/powerpc/lib/xor_vmx.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/powerpc/lib/xor_vmx.c b/arch/powerpc/lib/xor_vmx.c
index 07f49f1568e5..eccf37db9512 100644
--- a/arch/powerpc/lib/xor_vmx.c
+++ b/arch/powerpc/lib/xor_vmx.c
@@ -17,6 +17,16 @@
  *
  * Author: Anton Blanchard 
  */
+
+/*
+ * Sparse (as at v0.5.0) gets very, very confused by this file.
+ * Just disable it.
+ */
+#ifdef __CHECKER__
+#undef __CHECKER__
+#warning "Sparse checking disabled for this file"
+#endif
+
 #include 
 
 #include 
-- 
2.5.0


Re: [PATCH V2 55/68] powerpc/mm: VMALLOC abstraction

2016-04-25 Thread Balbir Singh


On 09/04/16 16:13, Aneesh Kumar K.V wrote:
> Signed-off-by: Aneesh Kumar K.V 
> ---
>  arch/powerpc/include/asm/book3s/64/hash.h| 14 +++---
>  arch/powerpc/include/asm/book3s/64/pgtable.h | 15 ---
>  arch/powerpc/include/asm/book3s/64/radix.h   | 21 +
>  arch/powerpc/kernel/pci_64.c |  3 ++-
>  arch/powerpc/mm/hash_utils_64.c  |  8 
>  arch/powerpc/mm/pgtable-radix.c  |  7 +++
>  arch/powerpc/mm/pgtable_64.c | 13 +++--
>  arch/powerpc/mm/slb_low.S|  2 +-
>  8 files changed, 69 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/hash.h 
> b/arch/powerpc/include/asm/book3s/64/hash.h
> index 43bd7d15f41e..9da410ea7e1a 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash.h
> @@ -45,17 +45,17 @@
>  /*
>   * Define the address range of the kernel non-linear virtual area
>   */
> -#define KERN_VIRT_START ASM_CONST(0xD000)
> -#define KERN_VIRT_SIZE   ASM_CONST(0x1000)
> +#define H_KERN_VIRT_START ASM_CONST(0xD000)
> +#define H_KERN_VIRT_SIZE ASM_CONST(0x1000)
>  
>  /*
>   * The vmalloc space starts at the beginning of that region, and
>   * occupies half of it on hash CPUs and a quarter of it on Book3E
>   * (we keep a quarter for the virtual memmap)
>   */
> -#define VMALLOC_STARTKERN_VIRT_START
> -#define VMALLOC_SIZE (KERN_VIRT_SIZE >> 1)
> -#define VMALLOC_END  (VMALLOC_START + VMALLOC_SIZE)
> +#define H_VMALLOC_START  H_KERN_VIRT_START
> +#define H_VMALLOC_SIZE   (H_KERN_VIRT_SIZE >> 1)
> +#define H_VMALLOC_END(H_VMALLOC_START + H_VMALLOC_SIZE)
>  
>  /*
>   * Region IDs
> @@ -64,7 +64,7 @@
>  #define REGION_MASK  (0xfUL << REGION_SHIFT)
>  #define REGION_ID(ea)(((unsigned long)(ea)) >> REGION_SHIFT)
>  
> -#define VMALLOC_REGION_ID(REGION_ID(VMALLOC_START))
> +#define VMALLOC_REGION_ID(REGION_ID(H_VMALLOC_START))
>  #define KERNEL_REGION_ID (REGION_ID(PAGE_OFFSET))
>  #define VMEMMAP_REGION_ID(0xfUL) /* Server only */
>  #define USER_REGION_ID   (0UL)
> @@ -73,7 +73,7 @@
>   * Defines the address of the vmemap area, in its own region on
>   * hash table CPUs.
>   */
> -#define VMEMMAP_BASE (VMEMMAP_REGION_ID << REGION_SHIFT)
> +#define H_VMEMMAP_BASE   (VMEMMAP_REGION_ID << REGION_SHIFT)
>  
>  #ifdef CONFIG_PPC_MM_SLICES
>  #define HAVE_ARCH_UNMAPPED_AREA
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
> b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index b8ee70458bae..87519ad1c5dc 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -208,6 +208,18 @@ extern unsigned long __pgd_val_bits;
>  #define PUD_MASKED_BITS  0xc0ffUL
>  /* Bits to mask out from a PGD to get to the PUD page */
>  #define PGD_MASKED_BITS  0xc0ffUL
> +
> +extern unsigned long __vmalloc_start;
> +extern unsigned long __vmalloc_end;
> +#define VMALLOC_START__vmalloc_start
> +#define VMALLOC_END  __vmalloc_end
> +
> +extern unsigned long __kernel_virt_start;
> +extern unsigned long __kernel_virt_size;
> +#define KERN_VIRT_START __kernel_virt_start
> +#define KERN_VIRT_SIZE  __kernel_virt_size
> +extern struct page *vmemmap;
> +extern unsigned long ioremap_bot;
>  #endif /* __ASSEMBLY__ */
>  
>  #include 
> @@ -220,7 +232,6 @@ extern unsigned long __pgd_val_bits;
>  #endif
>  
>  #include 
> -
>  /*
>   * The second half of the kernel virtual space is used for IO mappings,
>   * it's itself carved into the PIO region (ISA and PHB IO space) and
> @@ -239,8 +250,6 @@ extern unsigned long __pgd_val_bits;
>  #define IOREMAP_BASE (PHB_IO_END)
>  #define IOREMAP_END  (KERN_VIRT_START + KERN_VIRT_SIZE)
>  
> -#define vmemmap  ((struct page *)VMEMMAP_BASE)
> -
>  /* Advertise special mapping type for AGP */
>  #define HAVE_PAGE_AGP
>  
> diff --git a/arch/powerpc/include/asm/book3s/64/radix.h 
> b/arch/powerpc/include/asm/book3s/64/radix.h
> index 040c4a56d07b..d0449c0f2166 100644
> --- a/arch/powerpc/include/asm/book3s/64/radix.h
> +++ b/arch/powerpc/include/asm/book3s/64/radix.h
> @@ -30,6 +30,27 @@
>  #define R_PGTABLE_EADDR_SIZE (R_PTE_INDEX_SIZE + R_PMD_INDEX_SIZE +  \
> R_PUD_INDEX_SIZE + R_PGD_INDEX_SIZE + PAGE_SHIFT)
>  #define R_PGTABLE_RANGE (ASM_CONST(1) << R_PGTABLE_EADDR_SIZE)
> +/*
> + * We support 52 bit address space, Use top bit for kernel
> + * virtual mapping. Also make sure kernel fit in the top
> + * quadrant.
> + */
> +#define R_KERN_VIRT_START ASM_CONST(0xc008)
> +#define R_KERN_VIRT_SIZE  ASM_CONST(0x0008)
> +
> +/*
> + * The vmalloc space starts at the beginning of that region, and
> + * occupies a quarter of it 

RE: [v8, 6/7] MAINTAINERS: add entry for Freescale SoC specific driver

2016-04-25 Thread Yangbo Lu
Hi Scott and Leo,


> -Original Message-
> From: linux-mmc-ow...@vger.kernel.org [mailto:linux-mmc-
> ow...@vger.kernel.org] On Behalf Of Scott Wood
> Sent: Saturday, April 23, 2016 7:23 AM
> To: Yangbo Lu; linux-...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org;
> linux-arm-ker...@lists.infradead.org; linux-ker...@vger.kernel.org
> Cc: ulf.hans...@linaro.org; Zhao Qiang; Russell King; Bhupesh Sharma;
> Scott Wood; Claudiu Manoil; Kumar Gala; Yang-Leo Li; Xiaobo Xie; Michael
> Ellerman
> Subject: Re: [v8, 6/7] MAINTAINERS: add entry for Freescale SoC specific
> driver
> 
> On Fri, 2016-04-22 at 14:27 +0800, Yangbo Lu wrote:
> > Add maintainer entry for Freescale SoC specific driver including the
> > QE library and the GUTS driver. Also add entry for GUTS driver and add
> > maintainer for QE library.
> >
> > Signed-off-by: Yangbo Lu 
> > ---
> > Changes for v8:
> > - Added this patch
> > ---
> >  MAINTAINERS | 16 +++-
> >  1 file changed, 15 insertions(+), 1 deletion(-)
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS index 1d5b4be..d20aeb6 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -4622,13 +4622,27 @@ F:  drivers/net/ethernet/freescale/fec_ptp.c
> >  F: drivers/net/ethernet/freescale/fec.h
> >  F: Documentation/devicetree/bindings/net/fsl-fec.txt
> >
> > +FREESCALE SOC SPECIFIC DRIVER
> 
> FREESCALE SOC DRIVERS

[Lu Yangbo-B47093] Ok, will change it to 'FREESCALE SOC DRIVERS'.

> 
> > +M: Scott Wood 
> 
> Please CC me at this address, not the NXP address that you sent this to...

[Lu Yangbo-B47093] Sorry for using the wrong address. I will use the one you mentioned.

> 
> > +L: linuxppc-dev@lists.ozlabs.org
> > +S: Maintained
> > +F: drivers/soc/fsl/
> > +F: include/linux/fsl/
> 
> This directory will contain drivers that work on PPC and ARM... I'm not
> sure what to put here in terms of mailing lists (we could put both, but
> people probably shouldn't spam both lists if only one arch is relevant to
> the patch), and whom to make pull requests to.

[Lu Yangbo-B47093] But sooner or later we need to do this for files in fsl/ ...
:)

Hi Leo, do you have any idea?


> 
> > +FREESCALE GUTS DRIVER
> > +M: Yangbo Lu 
> > +L: linuxppc-dev@lists.ozlabs.org
> > +S: Maintained
> > +F: drivers/soc/fsl/guts.c
> 
> What about the header?
> 
> Does guts really need a separate maintainer from drivers/soc/fsl?

[Lu Yangbo-B47093] I was hesitating to add the maintainer...
And I was not so familiar with it. Let me leave it to the fsl/ directory and 
remove this...

> 
> -Scott
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
> the body of a message to majord...@vger.kernel.org More majordomo info at
> http://vger.kernel.org/majordomo-info.html

Re: [PATCH V2 54/68] powerpc/mm: Update pte filter for radix

2016-04-25 Thread Balbir Singh


On 09/04/16 16:13, Aneesh Kumar K.V wrote:
> ---
>  arch/powerpc/mm/pgtable.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
> index 115a0a19d5a2..0a9658fbf8b9 100644
> --- a/arch/powerpc/mm/pgtable.c
> +++ b/arch/powerpc/mm/pgtable.c
> @@ -82,6 +82,9 @@ static struct page *maybe_pte_to_page(pte_t pte)
>  
>  static pte_t set_pte_filter(pte_t pte)
>  {
> + if (radix_enabled())
> + return pte;
> +
>   pte = __pte(pte_val(pte) & ~_PAGE_HPTEFLAGS);
>   if (pte_looks_normal(pte) && !(cpu_has_feature(CPU_FTR_COHERENT_ICACHE) 
> ||
>  cpu_has_feature(CPU_FTR_NOEXECUTE))) {
> 

Acked-by: Balbir Singh 

Re: [PATCH V2 53/68] powerpc/mm: Add radix pgalloc details

2016-04-25 Thread Balbir Singh


On 09/04/16 16:13, Aneesh Kumar K.V wrote:
> Signed-off-by: Aneesh Kumar K.V 
> ---
>  arch/powerpc/include/asm/book3s/64/pgalloc.h | 34 
> 
>  arch/powerpc/include/asm/book3s/64/pgtable.h | 10 ++--
>  arch/powerpc/mm/hash_utils_64.c  |  7 ++
>  arch/powerpc/mm/pgtable-radix.c  |  5 +++-
>  arch/powerpc/mm/pgtable_64.c |  6 +
>  5 files changed, 55 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h 
> b/arch/powerpc/include/asm/book3s/64/pgalloc.h
> index faad1319ba26..a282674b2378 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
> @@ -50,19 +50,45 @@ extern void pgtable_free_tlb(struct mmu_gather *tlb, void 
> *table, int shift);
>  extern void __tlb_remove_table(void *_table);
>  #endif
>  
> +static inline pgd_t *rpgd_alloc(struct mm_struct *mm)
> +{
> +#ifdef CONFIG_PPC_64K_PAGES
> + return (pgd_t *)__get_free_page(PGALLOC_GFP);
> +#else
> + struct page *page;
> + page = alloc_pages(PGALLOC_GFP, 4);

We need a 2^4 * 4k = 64k allocation for the PGD even with 4k pages? Why? One
concern is that PGALLOC_GFP does not use CMA, if I read the PGALLOC_GFP flags
correctly.
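
For reference, the arithmetic behind that question, as a small
stand-alone sketch. order_to_bytes() is a hypothetical helper, not a
kernel function: an order-N allocation covers 2^N pages, so order 4 with
4K pages and order 0 with 64K pages both come to 64K.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative only: bytes covered by an allocation of 2^order pages,
 * where each page is 2^page_shift bytes.
 */
static size_t order_to_bytes(unsigned int order, unsigned int page_shift)
{
    assert(order + page_shift < sizeof(size_t) * 8);
    return (size_t)1 << (order + page_shift);
}
```

So both branches of rpgd_alloc() in the quoted patch end up with the
same 64K PGD, regardless of the base page size.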

> + if (!page)
> + return NULL;
> + return (pgd_t *) page_address(page);
> +#endif
> +}
> +
> +static inline void rpgd_free(struct mm_struct *mm, pgd_t *pgd)
> +{
> +#ifdef CONFIG_PPC_64K_PAGES
> + free_page((unsigned long)pgd);
> +#else
> + free_pages((unsigned long)pgd, 4);
> +#endif
> +}
> +
>  static inline pgd_t *pgd_alloc(struct mm_struct *mm)
>  {
> + if (radix_enabled())
> + return rpgd_alloc(mm);
>   return kmem_cache_alloc(PGT_CACHE(PGD_INDEX_SIZE), GFP_KERNEL);
>  }
>  
>  static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
>  {
> + if (radix_enabled())
> + return rpgd_free(mm, pgd);
>   kmem_cache_free(PGT_CACHE(PGD_INDEX_SIZE), pgd);
>  }
>  
>  static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
>  {
> - pgd_set(pgd, __pgtable_ptr_val(pud));
> + pgd_set(pgd, __pgtable_ptr_val(pud) | PGD_VAL_BITS);
>  }
>  
>  static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
> @@ -78,7 +104,7 @@ static inline void pud_free(struct mm_struct *mm, pud_t 
> *pud)
>  
>  static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
>  {
> - pud_set(pud, __pgtable_ptr_val(pmd));
> + pud_set(pud, __pgtable_ptr_val(pmd) | PUD_VAL_BITS);
>  }
>  
>  static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
> @@ -107,13 +133,13 @@ static inline void __pmd_free_tlb(struct mmu_gather 
> *tlb, pmd_t *pmd,
>  static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
>  pte_t *pte)
>  {
> - pmd_set(pmd, __pgtable_ptr_val(pte));
> + pmd_set(pmd, __pgtable_ptr_val(pte) | PMD_VAL_BITS);
>  }
>  
>  static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
>   pgtable_t pte_page)
>  {
> - pmd_set(pmd, __pgtable_ptr_val(pte_page));
> + pmd_set(pmd, __pgtable_ptr_val(pte_page) | PMD_VAL_BITS);
>  }
>  
>  static inline pgtable_t pmd_pgtable(pmd_t pmd)
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
> b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index c16037116625..b8ee70458bae 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -170,11 +170,17 @@ extern unsigned long __pgd_table_size;
>  #define PMD_TABLE_SIZE   __pmd_table_size
>  #define PUD_TABLE_SIZE   __pud_table_size
>  #define PGD_TABLE_SIZE   __pgd_table_size
> +
> +extern unsigned long __pmd_val_bits;
> +extern unsigned long __pud_val_bits;
> +extern unsigned long __pgd_val_bits;
> +#define PMD_VAL_BITS __pmd_val_bits
> +#define PUD_VAL_BITS __pud_val_bits
> +#define PGD_VAL_BITS __pgd_val_bits
>  /*
>   * Pgtable size used by swapper, init in asm code
> - * We will switch this later to radix PGD
>   */
> -#define MAX_PGD_TABLE_SIZE (sizeof(pgd_t) << H_PGD_INDEX_SIZE)
> +#define MAX_PGD_TABLE_SIZE (sizeof(pgd_t) << R_PGD_INDEX_SIZE)
>  
>  #define PTRS_PER_PTE (1 << PTE_INDEX_SIZE)
>  #define PTRS_PER_PMD (1 << PMD_INDEX_SIZE)
> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index be5d123b3f61..aef691b75784 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -878,6 +878,13 @@ void __init hlearly_init_mmu(void)
>   __pmd_table_size = H_PMD_TABLE_SIZE;
>   __pud_table_size = H_PUD_TABLE_SIZE;
>   __pgd_table_size = H_PGD_TABLE_SIZE;
> + /*
> +  * 4k use hugepd format, so for hash set then to
^ them
> +  * zero
> +  */

The comment is not very clear

> + __pmd_val_bits = 0;
> +   

Re: [PATCH V2 52/68] powerpc/mm: make 4k and 64k use pte_t for pgtable_t

2016-04-25 Thread Balbir Singh


On 09/04/16 16:13, Aneesh Kumar K.V wrote:
> pgtable_page_dtor for nohash is now moved to pte_fragment_free_mm()
> 
> Signed-off-by: Aneesh Kumar K.V 

This needs a better changelog

> ---
>  arch/powerpc/include/asm/book3s/64/pgalloc.h | 147 
> +++
>  arch/powerpc/include/asm/nohash/64/pgalloc.h |  38 +--
>  arch/powerpc/include/asm/page.h  |  10 +-
>  arch/powerpc/mm/pgtable_64.c |   2 +-
>  4 files changed, 52 insertions(+), 145 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h 
> b/arch/powerpc/include/asm/book3s/64/pgalloc.h
> index 37283e3d8e56..faad1319ba26 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
> @@ -41,6 +41,15 @@ extern struct kmem_cache *pgtable_cache[];
>   pgtable_cache[(shift) - 1]; \
>   })
>  
> +#define PGALLOC_GFP GFP_KERNEL | __GFP_NOTRACK | __GFP_REPEAT | __GFP_ZERO
> +
> +extern pte_t *pte_fragment_alloc(struct mm_struct *, unsigned long, int);
> +extern void pte_fragment_free(unsigned long *, int);
> +extern void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift);
> +#ifdef CONFIG_SMP
> +extern void __tlb_remove_table(void *_table);
> +#endif
> +
>  static inline pgd_t *pgd_alloc(struct mm_struct *mm)
>  {
>   return kmem_cache_alloc(PGT_CACHE(PGD_INDEX_SIZE), GFP_KERNEL);
> @@ -72,29 +81,47 @@ static inline void pud_populate(struct mm_struct *mm, 
> pud_t *pud, pmd_t *pmd)
>   pud_set(pud, __pgtable_ptr_val(pmd));
>  }
>  
> +static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
> +  unsigned long address)
> +{
> +pgtable_free_tlb(tlb, pud, PUD_INDEX_SIZE);
> +}
> +
> +static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
> +{
> + return kmem_cache_alloc(PGT_CACHE(PMD_CACHE_INDEX),
> + GFP_KERNEL|__GFP_REPEAT);

PGALLOC_GFP?

> +}
> +
> +static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
> +{
> + kmem_cache_free(PGT_CACHE(PMD_CACHE_INDEX), pmd);
> +}
> +
> +static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd,
> +  unsigned long address)
> +{
> +return pgtable_free_tlb(tlb, pmd, PMD_CACHE_INDEX);
> +}
> +
>  static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
>  pte_t *pte)
>  {
>   pmd_set(pmd, __pgtable_ptr_val(pte));
>  }
> -/*
> - * FIXME!!
> - * Between 4K and 64K pages, we differ in what is stored in pmd. ie.
> - * typedef pte_t *pgtable_t; -> 64K
> - * typedef struct page *pgtable_t; -> 4k
> - */
> -#ifdef CONFIG_PPC_4K_PAGES
> +
>  static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
>   pgtable_t pte_page)
>  {
> - pmd_set(pmd, __pgtable_ptr_val(page_address(pte_page)));
> + pmd_set(pmd, __pgtable_ptr_val(pte_page));
>  }
>  
>  static inline pgtable_t pmd_pgtable(pmd_t pmd)
>  {
> - return pmd_page(pmd);
> + return (pgtable_t)pmd_page_vaddr(pmd);
>  }
>  
> +#ifdef CONFIG_PPC_4K_PAGES
>  static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
> unsigned long address)
>  {
> @@ -115,83 +142,10 @@ static inline pgtable_t pte_alloc_one(struct mm_struct 
> *mm,
>   __free_page(page);
>   return NULL;
>   }
> - return page;
> -}
> -
> -static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
> -{
> - free_page((unsigned long)pte);
> -}
> -
> -static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
> -{
> - pgtable_page_dtor(ptepage);
> - __free_page(ptepage);
> -}
> -
> -static inline void pgtable_free(void *table, unsigned index_size)
> -{
> - if (!index_size)
> - free_page((unsigned long)table);
> - else {
> - BUG_ON(index_size > MAX_PGTABLE_INDEX_SIZE);
> - kmem_cache_free(PGT_CACHE(index_size), table);
> - }
> -}
> -
> -#ifdef CONFIG_SMP
> -static inline void pgtable_free_tlb(struct mmu_gather *tlb,
> - void *table, int shift)
> -{
> - unsigned long pgf = (unsigned long)table;
> - BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
> - pgf |= shift;
> - tlb_remove_table(tlb, (void *)pgf);
> -}
> -
> -static inline void __tlb_remove_table(void *_table)
> -{
> - void *table = (void *)((unsigned long)_table & ~MAX_PGTABLE_INDEX_SIZE);
> - unsigned shift = (unsigned long)_table & MAX_PGTABLE_INDEX_SIZE;
> -
> - pgtable_free(table, shift);
> -}
> -#else /* !CONFIG_SMP */
> -static inline void pgtable_free_tlb(struct mmu_gather *tlb,
> - void *table, int shift)
> -{
> - pgtable_free(table, shift);
> -}
> -#endif /* CONFIG_SMP */
> -
> -static inline void __pte_free_tlb(struct mmu_gather *tlb, 

RE: [v8, 1/7] Documentation: DT: update Freescale DCFG compatible

2016-04-25 Thread Yangbo Lu
Hi Mark,


> -Original Message-
> From: Mark Rutland [mailto:mark.rutl...@arm.com]
> Sent: Friday, April 22, 2016 9:12 PM
> To: Yangbo Lu
> Cc: linux-...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org;
> devicet...@vger.kernel.org; linux-arm-ker...@lists.infradead.org; linux-
> ker...@vger.kernel.org; linux-...@vger.kernel.org; linux-
> i...@vger.kernel.org; io...@lists.linux-foundation.org;
> net...@vger.kernel.org; ulf.hans...@linaro.org; Scott Wood; Rob Herring;
> Russell King; Jochen Friedrich; Joerg Roedel; Claudiu Manoil; Bhupesh
> Sharma; Zhao Qiang; Kumar Gala; Santosh Shilimkar; Yang-Leo Li; Xiaobo
> Xie
> Subject: Re: [v8, 1/7] Documentation: DT: update Freescale DCFG
> compatible
> 
> On Fri, Apr 22, 2016 at 02:27:38PM +0800, Yangbo Lu wrote:
> > Update Freescale DCFG compatible with 'fsl,-dcfg' instead of
> > 'fsl,ls1021a-dcfg' to include more chips.
> >
> > Signed-off-by: Yangbo Lu 
> > ---
> > Changes for v8:
> > - Added this patch
> > ---
> >  Documentation/devicetree/bindings/arm/fsl.txt | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/Documentation/devicetree/bindings/arm/fsl.txt
> > b/Documentation/devicetree/bindings/arm/fsl.txt
> > index 752a685..1d5f512 100644
> > --- a/Documentation/devicetree/bindings/arm/fsl.txt
> > +++ b/Documentation/devicetree/bindings/arm/fsl.txt
> > @@ -119,7 +119,7 @@ Freescale DCFG
> >  configuration and status for the device. Such as setting the
> > secondary  core start address and release the secondary core from
> holdoff and startup.
> >Required properties:
> > -  - compatible: should be "fsl,ls1021a-dcfg"
> > +  - compatible: should be "fsl,-dcfg"
> 
> Please list specific values expected for , while just saying 
> may be more generic, it makes it practically impossible to search for the
> correct binding given a compatible string, and it's vague as to exactly
> what  should be.

[Lu Yangbo-B47093] Thanks for your comment. I will list the possible chips.

> 
> Thanks,
> Mark.
> 
> 
> 
> >- reg : should contain base address and length of DCFG
> > memory-mapped registers
> >
> >  Example:
> > --
> > 2.1.0.27.g96db324
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe devicetree"
> > in the body of a message to majord...@vger.kernel.org More majordomo
> > info at  http://vger.kernel.org/majordomo-info.html
> >

Re: [PATCH kernel 2/2] powerpc/powernv/ioda2: Delay PE disposal

2016-04-25 Thread Alexey Kardashevskiy

On 04/21/2016 01:20 PM, Alexey Kardashevskiy wrote:

On 04/21/2016 10:21 AM, Gavin Shan wrote:

On Fri, Apr 08, 2016 at 04:36:44PM +1000, Alexey Kardashevskiy wrote:

When SRIOV is disabled, the existing code presumes there is no
virtual function (VF) in use and destroys all associated PEs.
However it is possible to get into the situation when the user
activated SRIOV disabling while a VF is still in use via VFIO.
For example, unbinding a physical function (PF) while there is a guest
running with a VF passed through via VFIO will trigger the bug.

This defines an IODA2-specific IOMMU group release() callback.
This moves all the disposal code from pnv_ioda_release_vf_PE() to this
new callback so the cleanup happens when the last user of an IOMMU
group released the reference.

As pnv_pci_ioda2_release_dma_pe() was reduced to just calling
iommu_group_put(), this merges pnv_pci_ioda2_release_dma_pe()
into pnv_ioda_release_vf_PE().



Sorry, I don't understand how it works. When PF's driver disables
IOV capability, the VF cannot work. The guest is unlikely to know
that and still continue accessing the VF's resources (e.g. config
space and MMIO registers). It would cause EEH errors.


The host disables IOV which removes VF devices which unbinds vfio_pci
driver and does all the cleanup, eventually we get to QEMU's
vfio_req_notifier_handler() and PCI hot unplug is initiated and the device
disappears from the guest.

If the guest cannot do PCI hotunplug, then EEH will make host stop it anyway.

Here we do not really care what happens to the guest (it can detect EEH or
hotunplug or simply crash), we need to make sure that the _host_ does not
crash in any case because the root user did something weird.
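
The deferral described above can be modelled with a plain reference
count. This is a toy sketch, not the kernel's IOMMU group API: toy_group,
toy_pe and the toy_* functions are hypothetical names. The point it
shows is that the teardown runs from the release callback, so it happens
only when the last holder (for example a VFIO user) drops its reference,
not when the host side drops its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an IOMMU group; not the kernel's struct iommu_group. */
struct toy_group {
    int refs;                       /* outstanding references */
    void (*release)(void *data);    /* runs when the last reference goes */
    void *data;
};

struct toy_pe {
    bool torn_down;                 /* set once PE resources are disposed */
};

/* Release callback: this is where the PE disposal would live. */
static void toy_pe_release(void *data)
{
    struct toy_pe *pe = data;

    assert(pe != NULL);
    pe->torn_down = true;
}

static void toy_group_get(struct toy_group *g)
{
    g->refs++;
}

static void toy_group_put(struct toy_group *g)
{
    assert(g->refs > 0);
    if (--g->refs == 0 && g->release)
        g->release(g->data);
}
```

If the host holds one reference and a VFIO user takes a second, the
host's put leaves the PE intact; the PE is disposed of only on the
second put.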



Ping?







Signed-off-by: Alexey Kardashevskiy 
---
arch/powerpc/platforms/powernv/pci-ioda.c | 33
+--
1 file changed, 14 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c
b/arch/powerpc/platforms/powernv/pci-ioda.c
index ce9f2bf..8108c54 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1333,27 +1333,25 @@ static void pnv_pci_ioda2_set_bypass(struct
pnv_ioda_pe *pe, bool enable);
static void pnv_pci_ioda2_group_release(void *iommu_data)
{
struct iommu_table_group *table_group = iommu_data;
+struct pnv_ioda_pe *pe = container_of(table_group,
+struct pnv_ioda_pe, table_group);
+struct pci_controller *hose = pci_bus_to_host(pe->parent_dev->bus);


pe->parent_dev would be NULL for non-VF PEs, and it's protected by
CONFIG_PCI_IOV in pci.h.



Yeah, I'll fix it.




+struct pnv_phb *phb = hose->private_data;
+struct iommu_table *tbl = pe->table_group.tables[0];
+int64_t rc;

-table_group->group = NULL;
-}
-
-static void pnv_pci_ioda2_release_dma_pe(struct pci_dev *dev, struct
pnv_ioda_pe *pe)
-{
-struct iommu_table*tbl;
-int64_t   rc;
-
-tbl = pe->table_group.tables[0];
rc = pnv_pci_ioda2_unset_window(>table_group, 0);
if (rc)
pe_warn(pe, "OPAL error %ld release DMA window\n", rc);

pnv_pci_ioda2_set_bypass(pe, false);
-if (pe->table_group.group) {
-iommu_group_put(pe->table_group.group);
-BUG_ON(pe->table_group.group);
-}
+
+BUG_ON(!tbl);
pnv_pci_ioda2_table_free_pages(tbl);
-iommu_free_table(tbl, of_node_full_name(dev->dev.of_node));
+iommu_free_table(tbl, of_node_full_name(pe->parent_dev->dev.of_node));
+
+pnv_ioda_deconfigure_pe(phb, pe);
+pnv_ioda_free_pe(phb, pe->pe_number);
}


This is not quite correct. A PE comprises DMA, MMIO, mapping info, etc.,
and this function disposes of all of them once DMA has finished its job.
I haven't figured out a better way to represent all of them and their
relationship. I guess it's worth having something in the long term, though
it's not trivial work.



Sorry, I am missing your point here. I am not changing the resource
deallocation here, I am just doing it slightly later and all I wonder at
the moment is if there are races - like having 2 scripts - one doing unbind
PF and another doing bind PF - will this crash the host in theory?






static void pnv_ioda_release_vf_PE(struct pci_dev *pdev)
@@ -1376,16 +1374,13 @@ static void pnv_ioda_release_vf_PE(struct
pci_dev *pdev)
if (pe->parent_dev != pdev)
continue;

-pnv_pci_ioda2_release_dma_pe(pdev, pe);
-
/* Remove from list */
mutex_lock(>ioda.pe_list_mutex);
list_del(>list);
mutex_unlock(>ioda.pe_list_mutex);

-pnv_ioda_deconfigure_pe(phb, pe);
-
-pnv_ioda_free_pe(phb, pe->pe_number);
+if (pe->table_group.group)
+iommu_group_put(pe->table_group.group);
}
}

--
2.5.0.rc3









--
Alexey
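
For illustration only (not part of the patch): the idea of deferring PE
teardown to the group's release() callback can be sketched in plain
userspace C. All names here (struct group, group_get(), group_put(),
pe_init()) are hypothetical stand-ins, not the kernel's iommu_group API,
and the sketch ignores locking and atomic refcounting.

```c
#include <assert.h>

/* Hypothetical stand-in for an IOMMU group: a refcount plus a release hook. */
struct group {
	int refs;                       /* number of users holding the group */
	void (*release)(void *data);    /* runs once, when refs drops to zero */
	void *data;                     /* passed back to release() */
};

/* Hypothetical stand-in for a PE whose resources must outlive all users. */
struct pe {
	int configured;                 /* 1 while DMA windows etc. are set up */
	struct group grp;
};

/* Take one more reference on the group. */
static void group_get(struct group *g)
{
	g->refs++;
}

/* Drop one reference; tear down only when the last user goes away. */
static void group_put(struct group *g)
{
	assert(g->refs > 0);
	if (--g->refs == 0 && g->release)
		g->release(g->data);
}

/* Release callback: the only place where the PE is torn down. */
static void pe_group_release(void *data)
{
	struct pe *pe = data;

	pe->configured = 0;
}

/* Set up a PE; the creator holds the first reference. */
static void pe_init(struct pe *pe)
{
	pe->configured = 1;
	pe->grp.refs = 1;
	pe->grp.release = pe_group_release;
	pe->grp.data = pe;
}
```

With this shape, the SRIOV-disable path only drops its own reference; if
a VFIO user still holds one, the PE stays configured until that user
drops it too.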
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH] powerpc: Add support for userspace P9 copy paste

2016-04-25 Thread Chris Smart

The copy paste facility introduced in POWER9 provides an optimised
mechanism for a userspace application to copy a cacheline. This is
provided by a pair of instructions, copy and paste, while a third,
cp_abort (copy paste abort), provides a clean up of the state in case of
a failure.

The copy instruction will read a 128 byte cacheline and store it in an
internal buffer. The subsequent paste instruction will store this
internal buffer to memory and set a CR field if the paste succeeds.

Since the state of the copy paste buffer is internal (and not
architecturally visible), in the unlikely event of a context switch, the
state cannot be stored and the paste should therefore fail.

The cp_abort instruction exists to fail and clean up any such
interrupted copy paste sequence and is to be called by the kernel as
part of the context switch. Doing so prevents data from a preceding copy
in one process leaking into the paste of another.

This code enables use of the cp_abort instruction if a supported
processor is detected.

NOTE: this is for userspace only, not in kernel, and does not deal
with KVM guests.

Patch created with much assistance from Michael Neuling


Signed-off-by: Chris Smart 
---

Note: A follow-up patch is expected soon with a working self-test.

arch/powerpc/include/asm/ppc-opcode.h | 2 ++
arch/powerpc/kernel/entry_64.S| 9 +
2 files changed, 11 insertions(+)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
b/arch/powerpc/include/asm/ppc-opcode.h
index 7ab04fc59e24..1d035c1cc889 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -131,6 +131,7 @@
/* sorted alphabetically */
#define PPC_INST_BHRBE  0x7c00025c
#define PPC_INST_CLRBHRB0x7c00035c
+#define PPC_INST_CP_ABORT  0x7c00068c
#define PPC_INST_DCBA   0x7c0005ec
#define PPC_INST_DCBA_MASK  0xfc0007fe
#define PPC_INST_DCBAL  0x7c2005ec
@@ -285,6 +286,7 @@
#endif

/* Deal with instructions that older assemblers aren't aware of */
+#define PPC_CP_ABORT       stringify_in_c(.long PPC_INST_CP_ABORT)
#define PPC_DCBAL(a, b) stringify_in_c(.long PPC_INST_DCBAL | \
__PPC_RA(a) | __PPC_RB(b))
#define PPC_DCBZL(a, b) stringify_in_c(.long PPC_INST_DCBZL | \
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 8b9d68676d2b..ab1457c3f1d1 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -36,6 +36,7 @@
#include 
#include 
#include 
+#include 

/*
 * System calls.
@@ -508,6 +509,14 @@ BEGIN_FTR_SECTION
ldarx   r6,0,r1
END_FTR_SECTION_IFSET(CPU_FTR_STCX_CHECKS_ADDRESS)

+BEGIN_FTR_SECTION
+/*
+ * A cp_abort (copy paste abort) here ensures that when context switching, a
+ * copy from one process can't leak into the paste of another.
+ */
+PPC_CP_ABORT
+END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
+
#ifdef CONFIG_PPC_BOOK3S
/* Cancel all explict user streams as they will have no use after context
 * switch and will stop the HW from creating streams itself
--
2.5.5


--
 _
°v°
/(_)\
^ ^
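
For illustration only (not part of the patch): a small userspace model
of why the context switch path issues cp_abort. The struct and the
model_*() functions are hypothetical; they only mimic the behaviour
described in the changelog (copy latches a 128 byte line, paste reports
success or failure, cp_abort discards any in-flight copy). The detail
that a successful paste consumes the buffer is an assumption of this
model, not something stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define CACHELINE 128

/* Hypothetical model of the internal copy/paste buffer; not real hardware state. */
struct cp_buffer {
	unsigned char data[CACHELINE];
	bool valid;     /* true between a copy and the matching paste */
};

/* Model of "copy": latch one cacheline into the internal buffer. */
static void model_copy(struct cp_buffer *b, const unsigned char *src)
{
	memcpy(b->data, src, CACHELINE);
	b->valid = true;
}

/* Model of "paste": store the buffer; the return value plays the CR field. */
static bool model_paste(struct cp_buffer *b, unsigned char *dst)
{
	if (!b->valid)
		return false;
	memcpy(dst, b->data, CACHELINE);
	b->valid = false;
	return true;
}

/* Model of "cp_abort": discard any in-flight copy. */
static void model_cp_abort(struct cp_buffer *b)
{
	memset(b->data, 0, CACHELINE);
	b->valid = false;
}

/* Model of the context switch path: always abort before switching. */
static void model_context_switch(struct cp_buffer *b)
{
	model_cp_abort(b);
}
```

In this model, a paste that follows a context switch fails and writes
nothing, so a copy made by one process cannot leak into another.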

Re: [PATCH V3 1/2] cpufreq: qoriq: Remove __exit macro from .exit callback

2016-04-25 Thread Rafael J. Wysocki
On Tuesday, April 19, 2016 02:41:51 PM Viresh Kumar wrote:
> On 19-04-16, 17:00, Jia Hongtao wrote:
> > .exit callback (qoriq_cpufreq_cpu_exit()) is also used during suspend.
> > So __exit macro should be removed or the function will be discarded.
> > 
> > Signed-off-by: Jia Hongtao 
> > ---
> >  drivers/cpufreq/qoriq-cpufreq.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> Acked-by: Viresh Kumar 

Applied, thanks!


Re: [PATCH 2/2] cpufreq: qoriq: Don't show cooling device messages if THERMAL_OF undefined

2016-04-25 Thread Rafael J. Wysocki
On Monday, April 18, 2016 04:03:45 PM Viresh Kumar wrote:
> On 18-04-16, 15:59, Jia Hongtao wrote:
> > When THERMAL_OF is undefined the cooling device messages should not be
> > shown. -ENOSYS is returned from of_cpufreq_cooling_register() when
> > THERMAL_OF is undefined.
> > 
> > Signed-off-by: Jia Hongtao 
> > ---
> >  drivers/cpufreq/qoriq-cpufreq.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/cpufreq/qoriq-cpufreq.c 
> > b/drivers/cpufreq/qoriq-cpufreq.c
> > index 1c2fdc1..ff8da83 100644
> > --- a/drivers/cpufreq/qoriq-cpufreq.c
> > +++ b/drivers/cpufreq/qoriq-cpufreq.c
> > @@ -340,8 +340,8 @@ static void qoriq_cpufreq_ready(struct cpufreq_policy 
> > *policy)
> > cpud->cdev = of_cpufreq_cooling_register(np,
> >  policy->related_cpus);
> >  
> > -   if (IS_ERR(cpud->cdev)) {
> > -   pr_err("Failed to register cooling device cpu%d: %ld\n",
> > +   if (IS_ERR(cpud->cdev) && PTR_ERR(cpud->cdev) != -ENOSYS) {
> > +   pr_err("cpu%d is not running as cooling device: %ld\n",
> > policy->cpu, PTR_ERR(cpud->cdev));
> >  
> > cpud->cdev = NULL;
> 
> Acked-by: Viresh Kumar 

Applied, thanks!
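
For illustration only (not part of the patch): the patched condition
relies on the kernel's error-pointer convention. The helpers below are
userspace look-alikes written for this note (err_ptr(), ptr_err(),
is_err() and should_report() are hypothetical names, not the kernel
macros), assuming the usual scheme where the top 4095 address values
encode a negative errno.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Decode the errno value carried by an error pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the range reserved for error codes. */
static inline bool is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Mirrors the patched test: report real errors, stay quiet on -ENOSYS. */
static bool should_report(const void *cdev)
{
	return is_err(cdev) && ptr_err(cdev) != -ENOSYS;
}
```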


[PATCH v2] ftrace: Match dot symbols when searching functions on ppc64

2016-04-25 Thread Thiago Jung Bauermann
In the ppc64 big endian ABI, function symbols point to function
descriptors. The symbols which point to the function entry points
have a dot in front of the function name. Consequently, when the
ftrace filter mechanism searches for the symbol corresponding to
an entry point address, it gets the dot symbol.

As a result, ftrace filter users have to be aware of this ABI detail on
ppc64 and prepend a dot to the function name when setting the filter.

The perf probe command insulates the user from this by ignoring the dot
in front of the symbol name when matching function names to symbols,
but the sysfs interface does not. This patch makes the ftrace filter
mechanism do the same when searching symbols.

Fixes the following failure in ftracetest's kprobe_ftrace.tc:

  .../kprobe_ftrace.tc: line 9: echo: write error: Invalid argument

That failure is on this line of kprobe_ftrace.tc:

  echo _do_fork > set_ftrace_filter

This is because there's no _do_fork entry in the functions list:

  # cat available_filter_functions | grep _do_fork
  ._do_fork

This change introduces no regressions on the perf and ftracetest
testsuite results.

Cc: Steven Rostedt 
Cc: Ingo Molnar 
Cc: Michael Ellerman 
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Thiago Jung Bauermann 
---

Notes:
Changes from v1 to v2:
- Use __weak mechanism instead of #ifdef.
- Return modified pointer instead of changing it in the argument.

 arch/powerpc/kernel/ftrace.c | 10 ++
 kernel/trace/ftrace.c| 12 
 2 files changed, 22 insertions(+)

diff --git a/arch/powerpc/kernel/ftrace.c b/arch/powerpc/kernel/ftrace.c
index 9dac18dabd03..1123a4d8d8dd 100644
--- a/arch/powerpc/kernel/ftrace.c
+++ b/arch/powerpc/kernel/ftrace.c
@@ -607,3 +607,13 @@ unsigned long __init arch_syscall_addr(int nr)
return sys_call_table[nr*2];
 }
 #endif /* CONFIG_FTRACE_SYSCALLS && CONFIG_PPC64 */
+
+#if defined(CONFIG_PPC64) && (!defined(_CALL_ELF) || _CALL_ELF != 2)
+char *arch_ftrace_match_adjust(char *str, const char *search)
+{
+   if (str[0] == '.' && search[0] != '.')
+   return str + 1;
+   else
+   return str;
+}
+#endif /* defined(CONFIG_PPC64) && (!defined(_CALL_ELF) || _CALL_ELF != 2) */
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index b1870fbd2b67..a28322e3fed3 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -3444,11 +3444,23 @@ struct ftrace_glob {
int type;
 };
 
+/*
+ * If symbols in an architecture don't correspond exactly to the user-visible
+ * name of what they represent, it is possible to define this function to
+ * perform the necessary adjustments.
+*/
+char * __weak arch_ftrace_match_adjust(char *str, const char *search)
+{
+   return str;
+}
+
 static int ftrace_match(char *str, struct ftrace_glob *g)
 {
int matched = 0;
int slen;
 
+   str = arch_ftrace_match_adjust(str, g->search);
+
switch (g->type) {
case MATCH_FULL:
if (strcmp(str, g->search) == 0)
-- 
1.9.1
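
For illustration only (not part of the patch): the effect of the
adjustment on a full-match comparison can be tried in userspace.
match_adjust() copies the logic of the ppc64 override above but works on
const strings; match_full() is a hypothetical, simplified stand-in for
the MATCH_FULL case of ftrace_match().

```c
#include <assert.h>
#include <string.h>

/* Same logic as the ppc64 override in the patch, on const strings. */
static const char *match_adjust(const char *str, const char *search)
{
	/* Skip the symbol's leading dot unless the user typed one too. */
	if (str[0] == '.' && search[0] != '.')
		return str + 1;
	return str;
}

/* Simplified MATCH_FULL comparison; returns 1 on a match, 0 otherwise. */
static int match_full(const char *symbol, const char *search)
{
	return strcmp(match_adjust(symbol, search), search) == 0;
}
```

A filter of "_do_fork" then matches the "._do_fork" symbol, while a
filter that already carries the dot keeps matching as before.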


[PATCH v2 1/1] ASoC: fsl_ssi: add CCSR_SSI_SOR to volatile register list

2016-04-25 Thread Caleb Crome
The CCSR_SSI_SOR is a register that clears the TX and/or the RX fifo
on the i.MX SSI port.  The fsl_ssi_trigger writes this register in
order to clear the fifo at trigger time.

However, since the CCSR_SSI_SOR register is not in the volatile list,
the caching mechanism prevented the register write in the trigger
function.  This caused the fifo to not be cleared (because the value
was unchanged from the last time the register was written), which in
turn caused the channels in both TDM and simple I2S mode to slip and
end up in the wrong time slots on SSI restart.

This has gone unnoticed for so long because with simple stereo mode,
the consequence is that left and right are swapped, which isn't that
noticeable.  However, it's catastrophic in some systems that
require the channels to be in the right slots.

Signed-off-by: Caleb Crome 
Suggested-by: Arnaud Mouiche 

---
 sound/soc/fsl/fsl_ssi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
index 216e3cb..2f3bf9c 100644
--- a/sound/soc/fsl/fsl_ssi.c
+++ b/sound/soc/fsl/fsl_ssi.c
@@ -151,6 +151,7 @@ static bool fsl_ssi_volatile_reg(struct device *dev, 
unsigned int reg)
case CCSR_SSI_SACDAT:
case CCSR_SSI_SATAG:
case CCSR_SSI_SACCST:
+   case CCSR_SSI_SOR:
return true;
default:
return false;
-- 
1.9.1
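
For illustration only (not part of the patch): a simplified model of an
update-if-changed register cache shows why the second FIFO clear never
reached the hardware. This is not the regmap API; struct reg_cache,
cached_update() and REG_SOR are hypothetical names, and the model simply
treats volatile registers as never cached.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_REGS 4
#define REG_SOR  2   /* hypothetical index standing in for CCSR_SSI_SOR */

/* Hypothetical register cache with a per-register volatile flag. */
struct reg_cache {
	unsigned int cached[NUM_REGS];   /* last value written per register */
	bool valid[NUM_REGS];            /* cache entry holds a value */
	bool is_volatile[NUM_REGS];      /* never trust the cache for these */
	unsigned int hw_writes;          /* writes that reached "hardware" */
};

/* Returns true if the write reached the (simulated) hardware. */
static bool cached_update(struct reg_cache *c, unsigned int reg,
			  unsigned int val)
{
	/* Non-volatile and unchanged: the write is skipped. */
	if (!c->is_volatile[reg] && c->valid[reg] && c->cached[reg] == val)
		return false;

	c->hw_writes++;
	if (!c->is_volatile[reg]) {
		c->cached[reg] = val;
		c->valid[reg] = true;
	}
	return true;
}
```

Writing the same clear value twice to a non-volatile register hits the
hardware once; marking the register volatile makes every write go
through.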


Re: [PATCH 1/1] ASoC: fsl_ssi: add CCSR_SSI_SOR to volatile register list

2016-04-25 Thread Mark Brown
On Mon, Apr 25, 2016 at 10:50:24AM -0700, Caleb Crome wrote:

> Due to caching, SOR wasn't written when it should have been.  This
> patch simply adds SOR to the volatile list.

Could you expand on when it wasn't written and why it needed to be
please?



Re: [PATCH 1/1] ASoC: fsl_ssi: add CCSR_SSI_SOR to volatile register list

2016-04-25 Thread Caleb Crome
On Mon, Apr 25, 2016 at 11:06 AM, Mark Brown  wrote:
> On Mon, Apr 25, 2016 at 10:50:24AM -0700, Caleb Crome wrote:
>
>> Due to caching, SOR wasn't written when it should have been.  This
>> patch simply adds SOR to the volatile list.
>
> Could you expand on when it wasn't written and why it needed to be
> please?

Yes, sorry.

The CCSR_SSI_SOR is a register that clears the TX and/or the RX fifo
on the i.MX6 SSI port.  The fsl_ssi_trigger writes this register in
order to clear the fifo at trigger time.

However, since the CCSR_SSI_SOR register is not in the volatile list,
the caching mechanism prevented the register write in the trigger
function.  This caused the fifo to not be cleared (because the value
was unchanged from the last time the register was written), which in
turn caused the channels in both TDM and simple I2S mode to slip and
end up in the wrong time slots on SSI restart.

Adding CCSR_SSI_SOR to the volatile list, along with Arnaud's
patches that I just tested (and sent tested-by slugs for), fixes most of
the problems with the SSI port drivers for multi-channel operation (there
is one more to come that I think really fixes the last bit).

Most people never noticed the problem because with simple stereo mode,
the consequence is that left and right are swapped, which isn't that
noticeable.

I can re-submit the patch if you like with this more descriptive comment.

Thanks,
 -Caleb

[PATCH 1/1] ASoC: fsl_ssi: add CCSR_SSI_SOR to volatile register list

2016-04-25 Thread Caleb Crome
Due to caching, SOR wasn't written when it should have been.  This
patch simply adds SOR to the volatile list.

Signed-off-by: Caleb Crome 
---
 sound/soc/fsl/fsl_ssi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
index 216e3cb..2f3bf9c 100644
--- a/sound/soc/fsl/fsl_ssi.c
+++ b/sound/soc/fsl/fsl_ssi.c
@@ -151,6 +151,7 @@ static bool fsl_ssi_volatile_reg(struct device *dev, 
unsigned int reg)
case CCSR_SSI_SACDAT:
case CCSR_SSI_SATAG:
case CCSR_SSI_SACCST:
+   case CCSR_SSI_SOR:
return true;
default:
return false;
-- 
1.9.1


Re: [PATCH 06/14] dmaengine: bcm2835: DT spelling s/interrupts-names/interrupt-names/

2016-04-25 Thread Geert Uytterhoeven
Hi Vinod,

On Mon, Apr 25, 2016 at 5:26 PM, Vinod Koul  wrote:
> On Wed, Apr 20, 2016 at 05:32:11PM +0200, Geert Uytterhoeven wrote:
>> Signed-off-by: Geert Uytterhoeven 
>> ---
>>  Documentation/devicetree/bindings/sound/davinci-mcbsp.txt | 2 +-
>
> This change does not apply for me, can you please split it up and send the
> sound ones thru sound tree

Sorry, I seem to have mixed two unrelated drivers.
Will split and resend...

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

[PATCH V2] powerpc/ptrace: Fix out of bounds array access warning

2016-04-25 Thread Khem Raj
gcc-6 correctly warns about an out-of-bounds access:

arch/powerpc/kernel/ptrace.c:407:24: warning: index 32 denotes an offset 
greater than size of 'u64[32][1] {aka long long unsigned int[32][1]}' 
[-Warray-bounds]
offsetof(struct thread_fp_state, fpr[32][0]));
^

Check the end of the array instead of the beginning of the next element
to fix this.

Signed-off-by: Khem Raj 
Cc: Kees Cook 
Cc: Michael Ellerman 
Cc: Segher Boessenkool 
---
Changes from v1 to v2:
- Check for fpr[32] instead of fpr[31][1]

 arch/powerpc/kernel/ptrace.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index 30a03c0..060b140 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -377,7 +377,7 @@ static int fpr_get(struct task_struct *target, const struct 
user_regset *regset,
 
 #else
BUILD_BUG_ON(offsetof(struct thread_fp_state, fpscr) !=
-offsetof(struct thread_fp_state, fpr[32][0]));
+offsetof(struct thread_fp_state, fpr[32]));
 
return user_regset_copyout(, , , ,
   >thread.fp_state, 0, -1);
@@ -405,7 +405,7 @@ static int fpr_set(struct task_struct *target, const struct 
user_regset *regset,
return 0;
 #else
BUILD_BUG_ON(offsetof(struct thread_fp_state, fpscr) !=
-offsetof(struct thread_fp_state, fpr[32][0]));
+offsetof(struct thread_fp_state, fpr[32]));
 
return user_regset_copyin(, , , ,
  >thread.fp_state, 0, -1);
-- 
2.8.0
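
For illustration only (not part of the patch): the layout property the
BUILD_BUG_ON checks can be reproduced in userspace. struct fp_state is a
hypothetical reduced layout, and instead of the patch's fpr[32] index
this sketch uses an equivalent start-plus-size formulation that never
indexes at or past the array bound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced layout: 32 one-slot FPRs followed directly by fpscr. */
struct fp_state {
	uint64_t fpr[32][1];
	uint64_t fpscr;
};

/* Offset of the first byte past the fpr array, without indexing past it. */
static size_t fpr_end_offset(void)
{
	return offsetof(struct fp_state, fpr) +
	       sizeof(((struct fp_state *)0)->fpr);
}

/* Compile-time form of the same check, comparable to the BUILD_BUG_ON. */
_Static_assert(offsetof(struct fp_state, fpscr) ==
	       offsetof(struct fp_state, fpr) +
	       sizeof(((struct fp_state *)0)->fpr),
	       "fpscr must directly follow fpr");
```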


Re: [PATCH V3] powerpc: Implement {cmp}xchg for u8 and u16

2016-04-25 Thread Peter Zijlstra
On Mon, Apr 25, 2016 at 06:10:51PM +0800, Pan Xinhui wrote:
> > So I'm not actually _that_ familiar with the PPC LL/SC implementation;
> > but there are things a CPU can do to optimize these loops.
> > 
> > For example, a CPU might choose to not release the exclusive hold of the
> > line for a number of cycles, except when it passes SC or an interrupt
> > happens. This way there's a smaller chance the SC fails and inhibits
> > forward progress.

> I am not sure if there is such hardware optimization.

So I think the hardware must do _something_, otherwise competing cores
doing load-exclusive could live-lock a system, each one endlessly
breaking the exclusive ownership of the other and the store-conditional
always failing.

Of course, there are such implementations, and they tend to have to put
in explicit backoff loops; however, IIRC, PPC doesn't need that. (See
ARC for an example that needs to do this.)

Re: [PATCH 06/14] dmaengine: bcm2835: DT spelling s/interrupts-names/interrupt-names/

2016-04-25 Thread Vinod Koul
On Wed, Apr 20, 2016 at 05:32:11PM +0200, Geert Uytterhoeven wrote:
> Signed-off-by: Geert Uytterhoeven 
> ---
>  Documentation/devicetree/bindings/sound/davinci-mcbsp.txt | 2 +-

This change does not apply for me, can you please split it up and send the
sound ones thru sound tree

-- 
~Vinod

>  drivers/dma/bcm2835-dma.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/sound/davinci-mcbsp.txt 
> b/Documentation/devicetree/bindings/sound/davinci-mcbsp.txt
> index 55b53e1fd72c9d6e..e0b6165c9cfcec19 100644
> --- a/Documentation/devicetree/bindings/sound/davinci-mcbsp.txt
> +++ b/Documentation/devicetree/bindings/sound/davinci-mcbsp.txt
> @@ -43,7 +43,7 @@ mcbsp0: mcbsp@1d1 {
>   <0x0031 0x1000>;
>   reg-names = "mpu", "dat";
>   interrupts = <97 98>;
> - interrupts-names = "rx", "tx";
> + interrupt-names = "rx", "tx";
>   dmas = < 3 1
>2 1>;
>   dma-names = "tx", "rx";
> diff --git a/drivers/dma/bcm2835-dma.c b/drivers/dma/bcm2835-dma.c
> index 974015193b93cdb3..d724393e904e9a41 100644
> --- a/drivers/dma/bcm2835-dma.c
> +++ b/drivers/dma/bcm2835-dma.c
> @@ -974,7 +974,7 @@ static int bcm2835_dma_probe(struct platform_device *pdev)
>  
>   /* legacy device tree case handling */
>   dev_warn_once(>dev,
> -   "missing interrupts-names property in device tree 
> - legacy interpretation is used");
> +   "missing interrupt-names property in device tree 
> - legacy interpretation is used");
>   /*
>* in case of channel >= 11
>* use the 11th interrupt and that is shared
> -- 
> 1.9.1
> 

[PATCH v2] livepatch: Add some basic LivePatch documentation

2016-04-25 Thread Petr Mladek
LivePatch framework deserves some documentation, definitely.
This is an attempt to provide some basic info. I hope that
it will be useful for both LivePatch producers and also
potential developers of the framework itself.

Signed-off-by: Petr Mladek 
---

This version incorporates feedback from all people who
commented on v1. Thanks a lot for it.

Sometimes I copy the suggested text. Sometimes,
I used my own invention. The text has grown from 277 to
400 lines. I wish I had a lighter pen. Anyway, please
see what I hammered together.

Changes against v1:

+ switched the order of the section 4 and 5
+ tiny changes in sections 1,2,6
+ heavily updated sections 3,4,5,7

 Documentation/livepatch/livepatch.txt | 400 ++
 MAINTAINERS   |   1 +
 2 files changed, 401 insertions(+)
 create mode 100644 Documentation/livepatch/livepatch.txt

diff --git a/Documentation/livepatch/livepatch.txt 
b/Documentation/livepatch/livepatch.txt
new file mode 100644
index ..7c4777e3170c
--- /dev/null
+++ b/Documentation/livepatch/livepatch.txt
@@ -0,0 +1,400 @@
+=========
+Livepatch
+=========
+
+This document outlines basic information about kernel livepatching.
+
+Table of Contents:
+
+1. Motivation
+2. Kprobes, Ftrace, Livepatching
+3. Consistency model
+4. Livepatch module
+   4.1. New functions
+   4.2. Metadata
+   4.3. Livepatch module handling
+5. Livepatch life-cycle
+   5.1. Registration
+   5.2. Enabling
+   5.3. Disabling
+   5.4. Unregistration
+6. Sysfs
+7. Limitations
+
+
+1. Motivation
+=============
+
+There are situations when people are really reluctant to reboot a system.
+It might be because the computer is in the middle of a complex scientific
+computation. Or the system is busy handling customer requests in the high
+season.
+
+On the other hand, people also want to keep the system stable and secure.
+This is where the livepatch infrastructure comes in handy. It allows selected
+function calls to be redirected to a fixed implementation without
+requiring a system reboot.
+
+
+2. Kprobes, Ftrace, Livepatching
+================================
+
+There are multiple mechanisms in the Linux kernel that are directly related
+to redirection of code execution; namely: kernel probes, function tracing,
+and livepatching:
+
+  + The kernel probes are the most generic way. The code can be redirected
+    by putting an interrupt instruction instead of any instruction.
+
+  + The function tracer calls the code from a predefined location that is
+    close to the function entry. The location is generated by the compiler,
+    see the -pg gcc option.
+
+  + Livepatching typically needs to redirect the code at the very beginning
+    of the function entry, before the function parameters or the stack
+    are modified in any way.
+
+All three approaches need to modify the existing code at runtime. Therefore
+they need to be aware of each other and not step on each other's toes.
+Most of these problems are solved by using the dynamic ftrace framework as
+a base. A Kprobe is registered as a ftrace handler when the function entry
+is probed, see CONFIG_KPROBES_ON_FTRACE. Also an alternative function from
+a live patch is called with help of a custom ftrace handler. But there are
+some limitations, see below.
+
+
+3. Consistency model
+====================
+
+Functions are there for a reason. They take some input parameters, get or
+release locks, read, process, and even write some data in a defined way,
+and have return values. In other words, each function has a defined semantic.
+
+Many fixes do not change the semantic of the modified functions. For
+example, they add a NULL pointer or a boundary check, fix a race by adding
+a missing memory barrier, or add some locking around a critical section.
+Most of these changes are self-contained and the function presents itself
+the same way to the rest of the system. In this case, the functions might
+be updated independently one by one.
+
+But there are more complex fixes. For example, a patch might change
+the ordering of locking in several functions at the same time. Or a patch
+might exchange the meaning of some temporary structures and update
+all the relevant functions. In this case, the affected unit
+(thread, whole kernel) needs to start using all new versions of
+the functions at the same time. Also the switch must happen only
+when it is safe to do so, e.g. when the affected locks are released
+or no data are stored in the modified structures at the moment.
+
+The theory about how to apply functions in a safe way is rather complex.
+The aim is to define a so-called consistency model. This means defining
+the conditions under which the new implementation can be used so that the
+system stays consistent. The theory is not yet finished. See the discussion at
+http://thread.gmane.org/gmane.linux.kernel/1823033/focus=1828189
+
+The current consistency model is very simple. It guarantees that either
+the old or the new function is called. But 

Re: [PATCH 12/14] phy: phy-stih41x-usb: DT spelling s/#phy-cell/#phy-cells/

2016-04-25 Thread Rob Herring
On Wed, Apr 20, 2016 at 05:32:17PM +0200, Geert Uytterhoeven wrote:
> Signed-off-by: Geert Uytterhoeven 

Applied, thanks.

Rob

> ---
>  Documentation/devicetree/bindings/phy/phy-stih41x-usb.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/Documentation/devicetree/bindings/phy/phy-stih41x-usb.txt 
> b/Documentation/devicetree/bindings/phy/phy-stih41x-usb.txt
> index 00944a05ee6b4998..744b4809542edd3b 100644
> --- a/Documentation/devicetree/bindings/phy/phy-stih41x-usb.txt
> +++ b/Documentation/devicetree/bindings/phy/phy-stih41x-usb.txt
> @@ -17,7 +17,7 @@ Example:
>  
>  usb2_phy: usb2phy@0 {
>   compatible  = "st,stih416-usb-phy";
> - #phy-cell   = <0>;
> + #phy-cells  = <0>;
>   st,syscfg   = <_rear>;
>   clocks  = <_sysin>;
>   clock-names = "osc_phy";
> -- 
> 1.9.1
> 

Re: [PATCH 11/14] PCI: hisi: DT spelling s/interrupts-*/interrupt-*/

2016-04-25 Thread Rob Herring
On Wed, Apr 20, 2016 at 05:32:16PM +0200, Geert Uytterhoeven wrote:
> Signed-off-by: Geert Uytterhoeven 

Applied, thanks.

Rob

> ---
>  Documentation/devicetree/bindings/pci/hisilicon-pcie.txt | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/pci/hisilicon-pcie.txt 
> b/Documentation/devicetree/bindings/pci/hisilicon-pcie.txt
> index b721beacfe4dae6c..59c2f47aa303ae24 100644
> --- a/Documentation/devicetree/bindings/pci/hisilicon-pcie.txt
> +++ b/Documentation/devicetree/bindings/pci/hisilicon-pcie.txt
> @@ -34,11 +34,11 @@ Hip05 Example (note that Hip06 is the same except 
> compatible):
>   ranges = <0x8200 0 0x 0x220 0x 0 
> 0x1000>;
>   num-lanes = <8>;
>   port-id = <1>;
> - #interrupts-cells = <1>;
> - interrupts-map-mask = <0xf800 0 0 7>;
> - interrupts-map = <0x0 0 0 1 _pcie 1 10
> -   0x0 0 0 2 _pcie 2 11
> -   0x0 0 0 3 _pcie 3 12
> -   0x0 0 0 4 _pcie 4 13>;
> + #interrupt-cells = <1>;
> + interrupt-map-mask = <0xf800 0 0 7>;
> + interrupt-map = <0x0 0 0 1 _pcie 1 10
> +  0x0 0 0 2 _pcie 2 11
> +  0x0 0 0 3 _pcie 3 12
> +  0x0 0 0 4 _pcie 4 13>;
>   status = "ok";
>   };
> -- 
> 1.9.1
> 

Re: [PATCH 10/14] misc: sram: DT spelling s/#adress-cells/#address-cells/

2016-04-25 Thread Rob Herring
On Wed, Apr 20, 2016 at 05:32:15PM +0200, Geert Uytterhoeven wrote:
> Signed-off-by: Geert Uytterhoeven 

Applied, thanks.

Rob

> ---
>  Documentation/devicetree/bindings/sram/sram.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/Documentation/devicetree/bindings/sram/sram.txt 
> b/Documentation/devicetree/bindings/sram/sram.txt
> index 227e3a341af1e2b5..add48f09015e212e 100644
> --- a/Documentation/devicetree/bindings/sram/sram.txt
> +++ b/Documentation/devicetree/bindings/sram/sram.txt
> @@ -51,7 +51,7 @@ sram: sram@5c00 {
>   compatible = "mmio-sram";
>   reg = <0x5c00 0x4>; /* 256 KiB SRAM at address 0x5c00 */
>  
> - #adress-cells = <1>;
> + #address-cells = <1>;
>   #size-cells = <1>;
>   ranges = <0 0x5c00 0x4>;
>  
> -- 
> 1.9.1
> 

Re: [PATCH] powerpc: Fix definition of SIAR register

2016-04-25 Thread Madhavan Srinivasan


On Monday 25 April 2016 01:45 PM, Alexander Graf wrote:
>
>> On 25.04.2016 at 10:08, Madhavan Srinivasan wrote:
>>
>>
>>
>>> On Friday 08 April 2016 09:24 PM, Thomas Huth wrote:
>>> The SIAR register is available twice, one time as SPR 780 (unprivileged,
>>> but read-only), and one time as SPR 796 (privileged, but read and write).
>>> The Linux kernel code currently uses SPR 780 - and while this is OK for
>>> reading, writing to that register of course does not work.
>>> Since the KVM code tries to write to this register, too (see the mtspr
>>> in book3s_hv_rmhandlers.S), the contents of this register sometimes get
>>> lost for the guests, e.g. during migration of a VM.
>>> To fix this issue, simply switch to the other SPR number 796 instead.
>> IIUC, SIAR and SDAR are updated by hardware when we take
>> a pmu exception with sampling mode enabled (based on instr).
>> And these register contents are mainly for OS consumption.
>> So, we don't need to restore these register values at all,
>> kindly correct me if I am missing something here.
> What if you migrate between a pmu event firing and the os reading siar? Or 
> what if the host gets pmu events? Or we migrate the guest to a different 
> pcpu? In all those cases we need to ensure the register contents are 
> consistent.

Ok got it. Let me try perf record with sample_addr type.

Maddy
>
>> Maddy
>>
>>> Signed-off-by: Thomas Huth 
>>> ---
>>> Note: The perf code in core-book3s.c also seems to write to the SIAR
>>>   SPR, so that might be affected by this issue, too - but I did
>>>   not test the perf code, so I'm not sure about that part.
> Please write a small unit test that fires off pmu events constantly and 
> checks whether they arrive correctly. Run perf in parallel on the host to 
> increase the chance for breakage.
>
>>> arch/powerpc/include/asm/reg.h | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
>>> index f5f4c66..6630420 100644
>>> --- a/arch/powerpc/include/asm/reg.h
>>> +++ b/arch/powerpc/include/asm/reg.h
>>> @@ -752,13 +752,13 @@
>>> #define SPRN_PMC6 792
>>> #define SPRN_PMC7 793
>>> #define SPRN_PMC8 794
>>> -#define SPRN_SIAR 780
>>> #define SPRN_SDAR 781
>>> #define SPRN_SIER 784
>>> #define   SIER_SIPR 0x200 /* Sampled MSR_PR */
>>> #define   SIER_SIHV 0x100 /* Sampled MSR_HV */
>>> #define   SIER_SIAR_VALID 0x040 /* SIAR contents valid */
>>> #define   SIER_SDAR_VALID 0x020 /* SDAR contents valid */
>>> +#define SPRN_SIAR 796
> I'm sure there's a reason (iSeries?) we used the r/o version before. Better 
> introduce a new constant that gives us rw access and use that in the kvm 
> entry/exit code.
>
> Alex
>
>>>  #define SPRN_TACR          888
>>>  #define SPRN_TCSCR         889
>>>  #define SPRN_CSIGR         890
>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH V3] powerpc: Implement {cmp}xchg for u8 and u16

2016-04-25 Thread Pan Xinhui

On 2016-04-22 00:13, Peter Zijlstra wrote:
> On Thu, Apr 21, 2016 at 11:35:07PM +0800, Pan Xinhui wrote:
>> yes, you are right. more load/store will be done in C code.
>> However such xchg_u8/u16 is just used by qspinlock now. and I did not see 
>> any performance regression.
>> So just wrote in C, for simple. :)
> 
> Which is fine; but worthy of a note in your Changelog.
> 
will do that.

>> Of course I have done xchg tests.
>> we run code just like xchg((u8*), j++); in several threads.
>> and the result is,
>> [  768.374264] use time[1550072]ns in xchg_u8_asm
>> [  768.377102] use time[2826802]ns in xchg_u8_c
>>
>> I think this is because there is one more load in C.
>> If possible, we can move such code in asm-generic/.
> 
> So I'm not actually _that_ familiar with the PPC LL/SC implementation;
> but there are things a CPU can do to optimize these loops.
> 
> For example, a CPU might choose to not release the exclusive hold of the
> line for a number of cycles, except when it passes SC or an interrupt
> happens. This way there's a smaller chance the SC fails and inhibits
> forward progress.
I am not sure if there is such hardware optimization.

> 
> By doing the modification outside of the LL/SC you lose such
> advantages.
> 
> And yes, doing a !exclusive load prior to the exclusive load leads to an
> even bigger window where the data can get changed out from under you.
> 
you are right.
We have observed such data change during the two different loads.
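
To make the "one more load in C" point concrete, here is a minimal user-space sketch of a byte exchange built on a 32-bit compare-and-swap. It uses C11 atomics instead of the kernel's cmpxchg(), and the name xchg_u8_sketch and the "lane" parameter are made up for this example; it is not the code from the patch.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper: exchange one byte inside an aligned 32-bit word.
 * 'lane' counts bytes from the least significant end (0..3); real code
 * would derive it from the byte address and the CPU's endianness.
 */
uint8_t xchg_u8_sketch(_Atomic uint32_t *word, unsigned lane, uint8_t newv)
{
	unsigned shift = (lane & 3u) * 8u;
	uint32_t mask = (uint32_t)0xff << shift;
	/* Plain (non-exclusive) load: this is the extra load discussed above. */
	uint32_t old = atomic_load_explicit(word, memory_order_relaxed);
	uint32_t repl;

	do {
		/* Splice the new byte into the last observed word value. */
		repl = (old & ~mask) | ((uint32_t)newv << shift);
		/* On failure, 'old' is refreshed with the current word. */
	} while (!atomic_compare_exchange_weak(word, &old, repl));

	/* The byte that was in place when the exchange succeeded. */
	return (uint8_t)((old & mask) >> shift);
}
```

Any store to the word between the plain load and the compare-and-swap makes the swap fail and forces another round, which is the window described above.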


Re: [RFC v6 00/10] vfio-pci: Allow to mmap sub-page MMIO BARs and MSI-X table

2016-04-25 Thread Yongji Xie

Hi Alex,

Any comment?

Thanks,
Yongji

On 2016/4/18 18:53, Yongji Xie wrote:

The current vfio-pci implementation does not allow mmap of
sub-page (size < PAGE_SIZE) MMIO BARs or of the MSI-X table. This is
because a sub-page BAR's mmio page may be shared with other BARs, and
the MSI-X table should not be accessed directly from the guest for
security reasons.

But this easily causes performance problems for mmio accesses in the
guest when vfio passes through sub-page BARs or BARs containing the
MSI-X table on the PPC64 platform. PAGE_SIZE is 64KB by default on
PPC64, so a single page easily covers a whole sub-page MMIO BAR, which
then cannot be mmapped, or covers the MSI-X table, in which case the
entire mmio page holding the table is left unmapped. Both cases lead to
mmio emulation in the host.

To deal with unmapped sub-page MMIO BARs, this patchset modifies the
resource_alignment kernel parameter to enforce that all MMIO BARs are
aligned to at least PAGE_SIZE, so that a sub-page BAR's mmio page
will not be shared with other BARs. We also add shadow resources
to the vfio device and put them into the holes of the mmio pages, in
case a hot-added device's BARs would otherwise be assigned into those
holes. Then we can mmap sub-page MMIO BARs safely.
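
As a rough illustration of the sub-page BAR problem, the sketch below checks whether the page(s) occupied by one BAR are shared with any other BAR. The struct, the function name and the fixed 64KB page size are assumptions made for this example; they are not taken from the patches.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 0x10000ULL	/* 64KB, the PPC64 default */

/* Hypothetical, simplified view of a BAR: only address and length. */
struct bar_sketch {
	uint64_t start;	/* bus address of the BAR */
	uint64_t size;	/* length in bytes, non-zero */
};

/* Return 1 if no other BAR touches the page-aligned span of bars[i]. */
int bar_page_is_exclusive(const struct bar_sketch *bars, size_t n, size_t i)
{
	/* Round the BAR outwards to page boundaries: [lo, hi). */
	uint64_t lo = bars[i].start & ~(SKETCH_PAGE_SIZE - 1);
	uint64_t hi = (bars[i].start + bars[i].size + SKETCH_PAGE_SIZE - 1)
		      & ~(SKETCH_PAGE_SIZE - 1);

	for (size_t j = 0; j < n; j++) {
		if (j == i)
			continue;
		/* Half-open interval overlap test against [lo, hi). */
		if (bars[j].start < hi && bars[j].start + bars[j].size > lo)
			return 0;
	}
	return 1;
}
```

With two 4KB BARs placed back to back, both fall into the same 64KB page and the check fails; once each BAR is aligned to PAGE_SIZE, each one owns its page.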

For the unmapped MSI-X table, we think the MSI-X table is safe to access
directly from userspace if the hardware supports interrupt remapping,
which ensures that a given pci device can only trigger the MSIs
assigned to it. But the implementation of this capability is
arch-dependent. To have a universal way to test this capability on the
PCI side for different archs, we introduce a new bus_flags
PCI_BUS_FLAGS_MSI_REMAP.

With this patchset applied, we can get almost 100% improvement on
performance for small block 4k random read when we passthrough a FC
HBA containing sub-page BARs and MSI-X BARs to guest on PPC64 in
our test.

Patch 8 is based on the proposed patchset[2].

Changelog v6:
- Rebase on vfio/next with patchset[2] applied
- Fix some bugs of v5
- Add three patches to make PCI_BUS_FLAGS_MSI_REMAP as
   a universal flag to test IRQ remapping

Changelog v5:
- Rebase on vfio/next
- Change the order of patch 1,2,3
- Move the warning "resource_alignment will not work with
   PCI_PROBE_ONLY set" from documentation to kernel log
- Remove IORESOURCE_WINDOW
- Add description for parameter "resize"
- Add PCIBIOS_MIN_ALIGNMENT to force all MMIO BARs to
   get minimum alignment
- Add shadow resources to make sure sub-page BAR's mmio
   page will not be shared with hot-add BARs.
- Add a new bit to pci_bus_flags to indicate the capability
   of interrupt remapping on PPC64
- Remove IOMMU_CAP_INTR_REMAP on PPC64
- Add a property msi_remap to vfio_pci_device to cache the
   capability of interrupt remapping

Changelog v4:
- Rebase on v4.5-rc6 with patchset[1] applied.
- Remove resource_page_aligned kernel parameter
- Fix some problems with resource_alignment kernel parameter
- Modify resource_alignment kernel parameter to support multiple
   devices.
- Remove host bridge attribute: msi_filtered
- Use IOMMU_CAP_INTR_REMAP to check if MSI-X table can be mmapped
- Add IOMMU_CAP_INTR_REMAP for IODA host bridge on PPC64 platform

Changelog v3:
- Rebase on new linux kernel mainline with the patchset[1] applied.
- Add a function to check whether PCI BARs'mmio page is shared with
   other BARs.
- Add a host bridge attribute to indicate PCI host bridge support
   filtering of MSIs.
- Use the new host bridge attribute to check if MSI-X table can
   be mmapped instead of CONFIG_EEH.
- Remove Kconfig option VFIO_PCI_MMAP_MSIX

Changelog v2:
- Rebase on v4.4-rc6 with the patchset[1] applied.
- Use kernel parameter to enforce all MMIO BARs to be page aligned
   on PCI core code instead of doing it on PPC64 arch code.
- Remove flags: VFIO_DEVICE_FLAGS_PCI_PAGE_ALIGNED

[1] http://www.spinics.net/lists/kvm/msg127812.html
[2] http://www.spinics.net/lists/kvm/msg130256.html

Yongji Xie (10):
   PCI: Ignore resource_alignment if PCI_PROBE_ONLY was set
   PCI: Do not Use IORESOURCE_STARTALIGN to identify bridge resources
   PCI: Add a new option for resource_alignment to reassign alignment
   PCI: Add support for enforcing all MMIO BARs to be page aligned
   vfio-pci: Allow to mmap sub-page MMIO BARs if the mmio page is exclusive
   PCI: Add a new PCI_BUS_FLAGS_MSI_REMAP flag
   iommu: Set PCI_BUS_FLAGS_MSI_REMAP if IOMMU have capability of IRQ remapping
   PCI: Set PCI_BUS_FLAGS_MSI_REMAP if MSI controller supports IRQ remapping
   pci-ioda: Set PCI_BUS_FLAGS_MSI_REMAP for IODA host bridge
   vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

  Documentation/kernel-parameters.txt   |7 +-
  arch/powerpc/include/asm/pci.h|2 +
  arch/powerpc/platforms/powernv/pci-ioda.c |8 +++
  drivers/iommu/iommu.c |   15 +
  drivers/pci/msi.c |   12 
  drivers/pci/pci.c |  105 +++--
  drivers/pci/probe.c   |3 +
 

Re: [PATCH] powerpc: Fix definition of SIAR register

2016-04-25 Thread Alexander Graf

On 04/25/2016 11:16 AM, Thomas Huth wrote:

On 25.04.2016 10:15, Alexander Graf wrote:



On 25.04.2016 at 10:08, Madhavan Srinivasan wrote:




On Friday 08 April 2016 09:24 PM, Thomas Huth wrote:
The SIAR register is available twice, one time as SPR 780 (unprivileged,
but read-only), and one time as SPR 796 (privileged, but read and write).
The Linux kernel code currently uses SPR 780 - and while this is OK for
reading, writing to that register of course does not work.
Since the KVM code tries to write to this register, too (see the mtspr
in book3s_hv_rmhandlers.S), the contents of this register sometimes get
lost for the guests, e.g. during migration of a VM.
To fix this issue, simply switch to the other SPR number 796 instead.

IIUC, SIAR and SDAR are updated by hardware when we take
a pmu exception with sampling mode enabled (based on instr).
And these register contents are mainly for OS consumption.
So, we don't need to restore these register values at all,
kindly correct me if I am missing something here.

What if you migrate between a pmu event firing and the os reading siar? Or what 
if the host gets pmu events? Or we migrate the guest to a different pcpu? In 
all those cases we need to ensure the register contents are consistent.

Right. Or a guest could use the SIAR as a temporary scratch register
while not using the performance monitoring stuff. In that case the
contents of the register of course have to be preserved, too.


Signed-off-by: Thomas Huth 
---
Note: The perf code in core-book3s.c also seems to write to the SIAR
   SPR, so that might be affected by this issue, too - but I did
   not test the perf code, so I'm not sure about that part.

Please write a small unit test that fires off pmu events constantly and checks 
whether they arrive correctly. Run perf in parallel on the host to increase
the chance for breakage.

I'm not very familiar with that PMU stuff yet, but I can have a try...


arch/powerpc/include/asm/reg.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index f5f4c66..6630420 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -752,13 +752,13 @@
 #define SPRN_PMC6          792
 #define SPRN_PMC7          793
 #define SPRN_PMC8          794
-#define SPRN_SIAR          780
 #define SPRN_SDAR          781
 #define SPRN_SIER          784
 #define   SIER_SIPR        0x200   /* Sampled MSR_PR */
 #define   SIER_SIHV        0x100   /* Sampled MSR_HV */
 #define   SIER_SIAR_VALID  0x040   /* SIAR contents valid */
 #define   SIER_SDAR_VALID  0x020   /* SDAR contents valid */
+#define SPRN_SIAR          796

I'm sure there's a reason (iSeries?) we used the r/o version before. Better 
introduce a new constant that gives us rw access and use that in the kvm 
entry/exit code.

Sure. Any suggestions on the naming? I could either rename the current
SPRN_SIAR to SPRN_USIAR (so that it is named similar to other registers
that behave that way, like SPRN_USPRG3 - and also QEMU uses USIAR for
this already). Or I could leave the old name untouched and use something
like "SPRN_SIAR_WR" for the 796 register. What do you prefer?


I'd defer that decision to Michael :).


By the way, I just noticed that SPRN_SDAR (781) seems to suffer from the
same problem, too!


Great! The more the merrier :)


Alex


Re: [PATCH] powerpc: Fix definition of SIAR register

2016-04-25 Thread Thomas Huth
On 25.04.2016 10:15, Alexander Graf wrote:
> 
> 
>> On 25.04.2016 at 10:08, Madhavan Srinivasan wrote:
>>
>>
>>
>>> On Friday 08 April 2016 09:24 PM, Thomas Huth wrote:
>>> The SIAR register is available twice, one time as SPR 780 (unprivileged,
>>> but read-only), and one time as SPR 796 (privileged, but read and write).
>>> The Linux kernel code currently uses SPR 780 - and while this is OK for
>>> reading, writing to that register of course does not work.
>>> Since the KVM code tries to write to this register, too (see the mtspr
>>> in book3s_hv_rmhandlers.S), the contents of this register sometimes get
>>> lost for the guests, e.g. during migration of a VM.
>>> To fix this issue, simply switch to the other SPR number 796 instead.
>>
>> IIUC, SIAR and SDAR are updated by hardware when we take
>> a pmu exception with sampling mode enabled (based on instr).
>> And these register contents are mainly for OS consumption.
>> So, we don't need to restore these register values at all,
>> kindly correct me if I am missing something here.
> 
> What if you migrate between a pmu event firing and the os reading siar? Or 
> what if the host gets pmu events? Or we migrate the guest to a different 
> pcpu? In all those cases we need to ensure the register contents are 
> consistent.

Right. Or a guest could use the SIAR as a temporary scratch register
while not using the performance monitoring stuff. In that case the
contents of the register of course have to be preserved, too.

>>> Signed-off-by: Thomas Huth 
>>> ---
>>> Note: The perf code in core-book3s.c also seems to write to the SIAR
>>>   SPR, so that might be affected by this issue, too - but I did
>>>   not test the perf code, so I'm not sure about that part.
> 
> Please write a small unit test that fires off pmu events constantly and 
> checks whether they arrive correctly. Run perf in parallel on the host to
> increase the chance for breakage.

I'm not very familiar with that PMU stuff yet, but I can have a try...

>>> arch/powerpc/include/asm/reg.h | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
>>> index f5f4c66..6630420 100644
>>> --- a/arch/powerpc/include/asm/reg.h
>>> +++ b/arch/powerpc/include/asm/reg.h
>>> @@ -752,13 +752,13 @@
>>>  #define SPRN_PMC6          792
>>>  #define SPRN_PMC7          793
>>>  #define SPRN_PMC8          794
>>> -#define SPRN_SIAR          780
>>>  #define SPRN_SDAR          781
>>>  #define SPRN_SIER          784
>>>  #define   SIER_SIPR        0x200   /* Sampled MSR_PR */
>>>  #define   SIER_SIHV        0x100   /* Sampled MSR_HV */
>>>  #define   SIER_SIAR_VALID  0x040   /* SIAR contents valid */
>>>  #define   SIER_SDAR_VALID  0x020   /* SDAR contents valid */
>>> +#define SPRN_SIAR          796
> 
> I'm sure there's a reason (iSeries?) we used the r/o version before. Better 
> introduce a new constant that gives us rw access and use that in the kvm 
> entry/exit code.

Sure. Any suggestions on the naming? I could either rename the current
SPRN_SIAR to SPRN_USIAR (so that it is named similar to other registers
that behave that way, like SPRN_USPRG3 - and also QEMU uses USIAR for
this already). Or I could leave the old name untouched and use something
like "SPRN_SIAR_WR" for the 796 register. What do you prefer?

By the way, I just noticed that SPRN_SDAR (781) seems to suffer from the
same problem, too!
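
For illustration only, the two naming options could be laid out roughly as below. Nothing here has been agreed on: SIAR_NAMING_OPTION_A, SIAR_KVM_SPR and siar_kvm_spr() are made-up helpers for this sketch and would not be part of a real patch.

```c
/* Hypothetical switch for this sketch: set to 0 to see the second option. */
#define SIAR_NAMING_OPTION_A 1

#if SIAR_NAMING_OPTION_A
/* Option A: the read-only alias gets a "U" prefix, like SPRN_USPRG3. */
#define SPRN_USIAR	780	/* unprivileged, read-only */
#define SPRN_SIAR	796	/* privileged, read and write */
#define SIAR_KVM_SPR	SPRN_SIAR
#else
/* Option B: old name untouched, new name for the writable SPR. */
#define SPRN_SIAR	780	/* unprivileged, read-only */
#define SPRN_SIAR_WR	796	/* privileged, read and write */
#define SIAR_KVM_SPR	SPRN_SIAR_WR
#endif

/* SPR number that save/restore code needs in either naming scheme. */
int siar_kvm_spr(void)
{
	return SIAR_KVM_SPR;
}
```

Either way the code that writes the register ends up using 796; the options differ only in which existing users have to be touched.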

 Thomas


Re: [PATCH] powerpc/ptrace: Fix out of bounds array access warning

2016-04-25 Thread Segher Boessenkool
On Sun, Apr 24, 2016 at 11:00:06PM -0700, Khem Raj wrote:
> gcc-6 correctly warns about an out-of-bounds access
> 
> arch/powerpc/kernel/ptrace.c:407:24: warning: index 32 denotes an offset 
> greater than size of 'u64[32][1] {aka long long unsigned int[32][1]}' 
> [-Warray-bounds]
> offsetof(struct thread_fp_state, fpr[32][0]));
> ^
> 
> check the end of array instead of beginning of next element to fix this

This should be fixed by doing

> offsetof(struct thread_fp_state, fpr[32]));

instead; [31][1] is not the correct offset when TS_FPRWIDTH > 1.
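
A small stand-alone sketch of the difference, assuming a simplified stand-in for struct thread_fp_state with a row width of 2 (the names fp_state_sketch and SKETCH_FPRWIDTH are invented for this example):

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FPRWIDTH 2	/* assumption: stands in for TS_FPRWIDTH > 1 */

/* Simplified stand-in for struct thread_fp_state. */
struct fp_state_sketch {
	uint64_t fpr[32][SKETCH_FPRWIDTH];
	uint64_t fpscr;
};

/* Offset just past the whole fpr array: the form suggested above. */
size_t off_fpr_end(void)
{
	return offsetof(struct fp_state_sketch, fpr[32]);
}

/* Offset of the second word of the last row: the form in the patch. */
size_t off_fpr_31_1(void)
{
	return offsetof(struct fp_state_sketch, fpr[31][1]);
}

/* The compile-time check holds for the first form at any row width. */
_Static_assert(offsetof(struct fp_state_sketch, fpscr) ==
	       offsetof(struct fp_state_sketch, fpr[32]),
	       "fpscr must directly follow fpr[]");
```

With a width of 2, fpr[32] and fpscr both sit at offset 512, while fpr[31][1] sits at 504, so a check written against [31][1] would only match when the width is 1.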


Segher

Re: [PATCH] powerpc: Fix definition of SIAR register

2016-04-25 Thread Alexander Graf


> On 25.04.2016 at 10:08, Madhavan Srinivasan wrote:
> 
> 
> 
>> On Friday 08 April 2016 09:24 PM, Thomas Huth wrote:
>> The SIAR register is available twice, one time as SPR 780 (unprivileged,
>> but read-only), and one time as SPR 796 (privileged, but read and write).
>> The Linux kernel code currently uses SPR 780 - and while this is OK for
>> reading, writing to that register of course does not work.
>> Since the KVM code tries to write to this register, too (see the mtspr
>> in book3s_hv_rmhandlers.S), the contents of this register sometimes get
>> lost for the guests, e.g. during migration of a VM.
>> To fix this issue, simply switch to the other SPR number 796 instead.
> 
> IIUC, SIAR and SDAR are updated by hardware when we take
> a pmu exception with sampling mode enabled (based on instr).
> And these register contents are mainly for OS consumption.
> So, we don't need to restore these register values at all,
> kindly correct me if I am missing something here.

What if you migrate between a pmu event firing and the os reading siar? Or what 
if the host gets pmu events? Or we migrate the guest to a different pcpu? In 
all those cases we need to ensure the register contents are consistent.

> 
> Maddy
> 
>> 
>> Signed-off-by: Thomas Huth 
>> ---
>> Note: The perf code in core-book3s.c also seems to write to the SIAR
>>   SPR, so that might be affected by this issue, too - but I did
>>   not test the perf code, so I'm not sure about that part.

Please write a small unit test that fires off pmu events constantly and checks 
whether they arrive correctly. Run perf in parallel on the host to increase
the chance for breakage.

>> 
>> arch/powerpc/include/asm/reg.h | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
>> index f5f4c66..6630420 100644
>> --- a/arch/powerpc/include/asm/reg.h
>> +++ b/arch/powerpc/include/asm/reg.h
>> @@ -752,13 +752,13 @@
>>  #define SPRN_PMC6          792
>>  #define SPRN_PMC7          793
>>  #define SPRN_PMC8          794
>> -#define SPRN_SIAR          780
>>  #define SPRN_SDAR          781
>>  #define SPRN_SIER          784
>>  #define   SIER_SIPR        0x200   /* Sampled MSR_PR */
>>  #define   SIER_SIHV        0x100   /* Sampled MSR_HV */
>>  #define   SIER_SIAR_VALID  0x040   /* SIAR contents valid */
>>  #define   SIER_SDAR_VALID  0x020   /* SDAR contents valid */
>> +#define SPRN_SIAR          796

I'm sure there's a reason (iSeries?) we used the r/o version before. Better 
introduce a new constant that gives us rw access and use that in the kvm 
entry/exit code.

Alex

>>  #define SPRN_TACR          888
>>  #define SPRN_TCSCR         889
>>  #define SPRN_CSIGR         890
> 


Re: [PATCH] powerpc: Fix definition of SIAR register

2016-04-25 Thread Madhavan Srinivasan


On Friday 08 April 2016 09:24 PM, Thomas Huth wrote:
> The SIAR register is available twice, one time as SPR 780 (unprivileged,
> but read-only), and one time as SPR 796 (privileged, but read and write).
> The Linux kernel code currently uses SPR 780 - and while this is OK for
> reading, writing to that register of course does not work.
> Since the KVM code tries to write to this register, too (see the mtspr
> in book3s_hv_rmhandlers.S), the contents of this register sometimes get
> lost for the guests, e.g. during migration of a VM.
> To fix this issue, simply switch to the other SPR number 796 instead.

IIUC, SIAR and SDAR are updated by hardware when we take
a pmu exception with sampling mode enabled (based on instr).
And these register contents are mainly for OS consumption.
So, we don't need to restore these register values at all,
kindly correct me if I am missing something here.

Maddy

>
> Signed-off-by: Thomas Huth 
> ---
>  Note: The perf code in core-book3s.c also seems to write to the SIAR
>SPR, so that might be affected by this issue, too - but I did
>not test the perf code, so I'm not sure about that part.
>
>  arch/powerpc/include/asm/reg.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> index f5f4c66..6630420 100644
> --- a/arch/powerpc/include/asm/reg.h
> +++ b/arch/powerpc/include/asm/reg.h
> @@ -752,13 +752,13 @@
>  #define SPRN_PMC6          792
>  #define SPRN_PMC7          793
>  #define SPRN_PMC8          794
> -#define SPRN_SIAR          780
>  #define SPRN_SDAR          781
>  #define SPRN_SIER          784
>  #define   SIER_SIPR        0x200   /* Sampled MSR_PR */
>  #define   SIER_SIHV        0x100   /* Sampled MSR_HV */
>  #define   SIER_SIAR_VALID  0x040   /* SIAR contents valid */
>  #define   SIER_SDAR_VALID  0x020   /* SDAR contents valid */
> +#define SPRN_SIAR          796
>  #define SPRN_TACR          888
>  #define SPRN_TCSCR         889
>  #define SPRN_CSIGR         890


Re: [PATCH] powerpc: Fix definition of SIAR register

2016-04-25 Thread Thomas Huth
On 08.04.2016 17:54, Thomas Huth wrote:
> The SIAR register is available twice, one time as SPR 780 (unprivileged,
> but read-only), and one time as SPR 796 (privileged, but read and write).
> The Linux kernel code currently uses SPR 780 - and while this is OK for
> reading, writing to that register of course does not work.
> Since the KVM code tries to write to this register, too (see the mtspr
> in book3s_hv_rmhandlers.S), the contents of this register sometimes get
> lost for the guests, e.g. during migration of a VM.
> To fix this issue, simply switch to the other SPR number 796 instead.
> 
> Signed-off-by: Thomas Huth 
> ---
>  Note: The perf code in core-book3s.c also seems to write to the SIAR
>SPR, so that might be affected by this issue, too - but I did
>not test the perf code, so I'm not sure about that part.
> 
>  arch/powerpc/include/asm/reg.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> index f5f4c66..6630420 100644
> --- a/arch/powerpc/include/asm/reg.h
> +++ b/arch/powerpc/include/asm/reg.h
> @@ -752,13 +752,13 @@
>  #define SPRN_PMC6          792
>  #define SPRN_PMC7          793
>  #define SPRN_PMC8          794
> -#define SPRN_SIAR          780
>  #define SPRN_SDAR          781
>  #define SPRN_SIER          784
>  #define   SIER_SIPR        0x200   /* Sampled MSR_PR */
>  #define   SIER_SIHV        0x100   /* Sampled MSR_HV */
>  #define   SIER_SIAR_VALID  0x040   /* SIAR contents valid */
>  #define   SIER_SDAR_VALID  0x020   /* SDAR contents valid */
> +#define SPRN_SIAR          796
>  #define SPRN_TACR          888
>  #define SPRN_TCSCR         889
>  #define SPRN_CSIGR         890

Ping!

Anybody any comments?

 Thomas


[PATCH] powerpc/ptrace: Fix out of bounds array access warning

2016-04-25 Thread Khem Raj
gcc-6 correctly warns about an out-of-bounds access

arch/powerpc/kernel/ptrace.c:407:24: warning: index 32 denotes an offset 
greater than size of 'u64[32][1] {aka long long unsigned int[32][1]}' 
[-Warray-bounds]
offsetof(struct thread_fp_state, fpr[32][0]));
^

check the end of array instead of beginning of next element to fix this

Signed-off-by: Khem Raj 
Cc: Kees Cook 
Cc: Michael Ellerman 
---
 arch/powerpc/kernel/ptrace.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index 30a03c0..269f80f 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -377,7 +377,7 @@ static int fpr_get(struct task_struct *target, const struct user_regset *regset,
 
 #else
BUILD_BUG_ON(offsetof(struct thread_fp_state, fpscr) !=
-offsetof(struct thread_fp_state, fpr[32][0]));
+offsetof(struct thread_fp_state, fpr[31][1]));
 
return user_regset_copyout(, , , ,
   >thread.fp_state, 0, -1);
@@ -405,7 +405,7 @@ static int fpr_set(struct task_struct *target, const struct user_regset *regset,
return 0;
 #else
BUILD_BUG_ON(offsetof(struct thread_fp_state, fpscr) !=
-offsetof(struct thread_fp_state, fpr[32][0]));
+offsetof(struct thread_fp_state, fpr[31][1]));
 
return user_regset_copyin(, , , ,
  >thread.fp_state, 0, -1);
-- 
2.8.0
