[PATCH] powerpc/mm/radix: Add missing tlb flush

2016-05-10 Thread Aneesh Kumar K.V
This should not have any impact on hash, because hash does tlb
invalidate with every pte update and we don't implement
flush_tlb_* functions for hash. With radix we should make an explicit
call to flush tlb outside pte update.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/pgtable-book3s64.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
index eb4451144746..670318766545 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -33,10 +33,7 @@ int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address,
changed = !pmd_same(*(pmdp), entry);
if (changed) {
__ptep_set_access_flags(pmdp_ptep(pmdp), pmd_pte(entry));
-   /*
-* Since we are not supporting SW TLB systems, we don't
-* have any thing similar to flush_tlb_page_nohash()
-*/
+   flush_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
}
return changed;
 }
-- 
2.7.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 1/2] powerpc/timer - large decrementer support

2016-05-10 Thread Balbir Singh


On 10/05/16 14:57, Oliver O'Halloran wrote:
> POWER ISA v3 adds large decrementer (LD) mode of operation which increases
> the size of the decrementer register from 32 bits to an implementation
> defined width of up to 64 bits.
> 
> This patch adds support for the LD on processors with the CPU_FTR_ARCH_300
> cpu feature flag set. For CPUs with this feature LD mode is enabled when
> the ibm,dec-bits devicetree property is supplied for the boot CPU. The
> decrementer value is a signed quantity (with negative values indicating a
> pending exception) and this property is required to find the maximum
> positive decrementer value. If this property is not supplied then the
> traditional decrementer width of 32 bits is assumed and LD mode is disabled.
> 
> This patch was based on initial work by Jack Miller.
> 
> Signed-off-by: Oliver O'Halloran 
> Cc: Michael Neuling 
> Cc: Balbir Singh 
> Cc: Jack Miller 

These bits look good mostly

Reviewed-by: Balbir Singh 
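
For readers following the width discussion above, here is a minimal sketch of
how the maximum positive decrementer value could be derived from a bit width
such as the one carried by ibm,dec-bits. The helper name decrementer_max() and
the fallback to 32 bits for out-of-range widths are assumptions made for
illustration; this is not the code from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the patch): the decrementer is a
 * signed quantity, so a register that is `bits` wide can hold at most
 * 2^(bits-1) - 1 as a positive value.
 */
static uint64_t decrementer_max(unsigned int bits)
{
	/* Assumption: an absent or implausible width means the 32-bit default. */
	if (bits < 32 || bits > 64)
		bits = 32;

	return (UINT64_C(1) << (bits - 1)) - 1;
}
```

With a 32-bit decrementer this gives 0x7fffffff, and a 64-bit one gives the
same value as INT64_MAX.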

Re: [PATCH] powerpc/tm: Clean up duplication of code

2016-05-10 Thread Balbir Singh


On 11/05/16 14:55, Rashmica Gupta wrote:
> The same logic for tm_abort appears twice, so pull it out into a
> function.
> 
> Signed-off-by: Rashmica Gupta 
> ---
>  arch/powerpc/mm/hash_utils_64.c | 47 ++---
>  1 file changed, 21 insertions(+), 26 deletions(-)
> 
> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index 7635b1c6b5da..1cef8f5aee9b 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -1318,6 +1318,25 @@ out_exit:
>   local_irq_restore(flags);
>  }
>  
> +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> + /* Transactions are not aborted by tlbiel, only tlbie.
> +  * Without, syncing a page back to a block device w/ PIO could pick up
> +  * transactional data (bad!) so we force an abort here.  Before the
> +  * sync the page will be made read-only, which will flush_hash_page.
> +  * BIG ISSUE here: if the kernel uses a page from userspace without
> +  * unmapping it first, it may see the speculated version.
> +  */
> +static inline void abort_tm(int local)
> +{
> + if (local && cpu_has_feature(CPU_FTR_TM) &&
> + current->thread.regs &&
> + MSR_TM_ACTIVE(current->thread.regs->msr)) {
> + tm_enable();
> + tm_abort(TM_CAUSE_TLBI);
> + }
> +}

While you're at this, do

#else

static inline void abort_tm(int local)
{
}

> +#endif
> +
>  /* WARNING: This is called from hash_low_64.S, if you change this prototype,
>   *  do not forget to update the assembly call site !
>   */
> @@ -1344,19 +1363,7 @@ void flush_hash_page(unsigned long vpn, real_pte_t pte, int psize, int ssize,
>   } pte_iterate_hashed_end();
>  
>  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
Then remove these extra #ifdef
> - /* Transactions are not aborted by tlbiel, only tlbie.
> -  * Without, syncing a page back to a block device w/ PIO could pick up
> -  * transactional data (bad!) so we force an abort here.  Before the
> -  * sync the page will be made read-only, which will flush_hash_page.
> -  * BIG ISSUE here: if the kernel uses a page from userspace without
> -  * unmapping it first, it may see the speculated version.
> -  */
> - if (local && cpu_has_feature(CPU_FTR_TM) &&
> - current->thread.regs &&
> - MSR_TM_ACTIVE(current->thread.regs->msr)) {
> - tm_enable();
> - tm_abort(TM_CAUSE_TLBI);
> - }
> + abort_tm(local);
>  #endif
>  }
>  
> @@ -1415,19 +1422,7 @@ void flush_hash_hugepage(unsigned long vsid, unsigned long addr,
>   }
>  tm_abort:
>  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM

Then remove these extra #ifdef
> - /* Transactions are not aborted by tlbiel, only tlbie.
> -  * Without, syncing a page back to a block device w/ PIO could pick up
> -  * transactional data (bad!) so we force an abort here.  Before the
> -  * sync the page will be made read-only, which will flush_hash_page.
> -  * BIG ISSUE here: if the kernel uses a page from userspace without
> -  * unmapping it first, it may see the speculated version.
> -  */
> - if (local && cpu_has_feature(CPU_FTR_TM) &&
> - current->thread.regs &&
> - MSR_TM_ACTIVE(current->thread.regs->msr)) {
> - tm_enable();
> - tm_abort(TM_CAUSE_TLBI);
> - }
> + abort_tm(local);
>  #endif
>   return;
>  }
> 
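
The suggestion in this review (an empty #else stub so that callers need no
#ifdef) can be sketched in plain C as follows. FEATURE_TM stands in for
CONFIG_PPC_TRANSACTIONAL_MEM, and the counter and the flush_* functions are
hypothetical placeholders rather than the kernel's code.

```c
#include <stdbool.h>

/* Stand-in for CONFIG_PPC_TRANSACTIONAL_MEM; remove it to build the stub. */
#define FEATURE_TM 1

/* Counts simulated aborts so the effect of the helper is observable. */
static int aborts_issued;

#ifdef FEATURE_TM
/* Real variant: only acts on local flushes, like the quoted helper. */
static inline void abort_tm(bool local)
{
	if (local)
		aborts_issued++;
}
#else
/* Empty stub: compiles away, so call sites stay free of #ifdef. */
static inline void abort_tm(bool local)
{
	(void)local;
}
#endif

/* Both call sites invoke the helper unconditionally. */
static void flush_page(bool local)
{
	abort_tm(local);
}

static void flush_hugepage(bool local)
{
	abort_tm(local);
}
```

The conditional compilation then lives in one place next to the helper, which
is the shape the kernel coding style generally prefers.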

Re: [PATCH] powerpc/kvm: Fix build error on book3s_hv.c

2016-05-10 Thread Balbir Singh


On 11/05/16 11:15, Gavin Shan wrote:
> When CONFIG_KVM_XICS is enabled, CPU_UP_PREPARE and other macros for
> CPU states in linux/cpu.h are needed by arch/powerpc/kvm/book3s_hv.c.
> Otherwise, build error as below is seen:
> 
>gwshan@gwshan:~/sandbox/l$ make arch/powerpc/kvm/book3s_hv.o
> :
>CC  arch/powerpc/kvm/book3s_hv.o
>arch/powerpc/kvm/book3s_hv.c: In function ‘kvmppc_cpu_notify’:
>arch/powerpc/kvm/book3s_hv.c:3072:7: error: ‘CPU_UP_PREPARE’ \
>undeclared (first use in this function)
> 
> This fixes the issue introduced by commit <6f3bb80944> ("KVM: PPC:
> Book3S HV: kvmppc_host_rm_ops - handle offlining CPUs").
> 
> Signed-off-by: Gavin Shan 
> ---
I ran into the same thing, fixed it, but forgot to send it out
Reviewed-by: Balbir Singh 

[PATCH] powerpc/tm: Clean up duplication of code

2016-05-10 Thread Rashmica Gupta
The same logic for tm_abort appears twice, so pull it out into a
function.

Signed-off-by: Rashmica Gupta 
---
 arch/powerpc/mm/hash_utils_64.c | 47 ++---
 1 file changed, 21 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 7635b1c6b5da..1cef8f5aee9b 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1318,6 +1318,25 @@ out_exit:
local_irq_restore(flags);
 }
 
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+   /* Transactions are not aborted by tlbiel, only tlbie.
+* Without, syncing a page back to a block device w/ PIO could pick up
+* transactional data (bad!) so we force an abort here.  Before the
+* sync the page will be made read-only, which will flush_hash_page.
+* BIG ISSUE here: if the kernel uses a page from userspace without
+* unmapping it first, it may see the speculated version.
+*/
+static inline void abort_tm(int local)
+{
+   if (local && cpu_has_feature(CPU_FTR_TM) &&
+   current->thread.regs &&
+   MSR_TM_ACTIVE(current->thread.regs->msr)) {
+   tm_enable();
+   tm_abort(TM_CAUSE_TLBI);
+   }
+}
+#endif
+
 /* WARNING: This is called from hash_low_64.S, if you change this prototype,
  *  do not forget to update the assembly call site !
  */
@@ -1344,19 +1363,7 @@ void flush_hash_page(unsigned long vpn, real_pte_t pte, int psize, int ssize,
} pte_iterate_hashed_end();
 
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
-   /* Transactions are not aborted by tlbiel, only tlbie.
-* Without, syncing a page back to a block device w/ PIO could pick up
-* transactional data (bad!) so we force an abort here.  Before the
-* sync the page will be made read-only, which will flush_hash_page.
-* BIG ISSUE here: if the kernel uses a page from userspace without
-* unmapping it first, it may see the speculated version.
-*/
-   if (local && cpu_has_feature(CPU_FTR_TM) &&
-   current->thread.regs &&
-   MSR_TM_ACTIVE(current->thread.regs->msr)) {
-   tm_enable();
-   tm_abort(TM_CAUSE_TLBI);
-   }
+   abort_tm(local);
 #endif
 }
 
@@ -1415,19 +1422,7 @@ void flush_hash_hugepage(unsigned long vsid, unsigned long addr,
}
 tm_abort:
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
-   /* Transactions are not aborted by tlbiel, only tlbie.
-* Without, syncing a page back to a block device w/ PIO could pick up
-* transactional data (bad!) so we force an abort here.  Before the
-* sync the page will be made read-only, which will flush_hash_page.
-* BIG ISSUE here: if the kernel uses a page from userspace without
-* unmapping it first, it may see the speculated version.
-*/
-   if (local && cpu_has_feature(CPU_FTR_TM) &&
-   current->thread.regs &&
-   MSR_TM_ACTIVE(current->thread.regs->msr)) {
-   tm_enable();
-   tm_abort(TM_CAUSE_TLBI);
-   }
+   abort_tm(local);
 #endif
return;
 }
-- 
2.5.0


Re: [PATCH v9 15/22] powerpc/powernv: Functions to get/set PCI slot state

2016-05-10 Thread Alistair Popple
Gavin,

On Tue, 3 May 2016 23:22:46 Gavin Shan wrote:
> This exports 4 functions, which are based on the corresponding OPAL
> APIs to get/set PCI slot status. Those functions are going to
> be used by PowerNV PCI hotplug driver:
> 
>    pnv_pci_get_device_tree()       opal_get_device_tree()
>    pnv_pci_get_presence_state()    opal_pci_get_presence_state()
>    pnv_pci_get_power_state()       opal_pci_get_power_state()
>    pnv_pci_set_power_state()       opal_pci_set_power_state()
> 
> Besides, the patch also exports pnv_pci_hotplug_notifier_{register,
> unregister}() to allow registration and unregistration of PCI hotplug
> notifier, which will be used to receive PCI hotplug message from
> skiboot firmware in PowerNV PCI hotplug driver.
> 
> Signed-off-by: Gavin Shan 
> Reviewed-by: Alexey Kardashevskiy 
> ---
>  arch/powerpc/include/asm/opal-api.h|  18 -
>  arch/powerpc/include/asm/opal.h|   5 ++
>  arch/powerpc/include/asm/pnv-pci.h |   7 ++
>  arch/powerpc/platforms/powernv/opal-wrappers.S |   5 ++
>  arch/powerpc/platforms/powernv/pci.c   | 102 +
>  5 files changed, 136 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
> index 9bb8ddf..728e04e 100644
> --- a/arch/powerpc/include/asm/opal-api.h
> +++ b/arch/powerpc/include/asm/opal-api.h
> @@ -158,7 +158,12 @@
>  #define OPAL_LEDS_SET_INDICATOR  115
>  #define OPAL_CEC_REBOOT2 116
>  #define OPAL_CONSOLE_FLUSH   117
> -#define OPAL_LAST117
> +#define OPAL_GET_DEVICE_TREE 118
> +#define OPAL_PCI_GET_PRESENCE_STATE  119
> +#define OPAL_PCI_GET_POWER_STATE 120
> +#define OPAL_PCI_SET_POWER_STATE 121
> +#define OPAL_PCI_POLL2   122
> +#define OPAL_LAST122
>  
>  /* Device tree flags */
>  
> @@ -344,6 +349,16 @@ enum OpalPciResetState {
>   OPAL_ASSERT_RESET   = 1
>  };
>  
> +enum OpalPciSlotPresentenceState {
> + OPAL_PCI_SLOT_EMPTY = 0,
> + OPAL_PCI_SLOT_PRESENT   = 1
> +};
> +
> +enum OpalPciSlotPowerState {
> + OPAL_PCI_SLOT_POWER_OFF = 0,
> + OPAL_PCI_SLOT_POWER_ON  = 1
> +};
> +
>  enum OpalSlotLedType {
>   OPAL_SLOT_LED_TYPE_ID = 0,  /* IDENTIFY LED */
>   OPAL_SLOT_LED_TYPE_FAULT = 1,   /* FAULT LED */
> @@ -378,6 +393,7 @@ enum opal_msg_type {
>   OPAL_MSG_DPO= 5,
>   OPAL_MSG_PRD= 6,
>   OPAL_MSG_OCC= 7,
> + OPAL_MSG_PCI_HOTPLUG= 8,
>   OPAL_MSG_TYPE_MAX,
>  };
>  
> diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
> index 348132c..1a83c80 100644
> --- a/arch/powerpc/include/asm/opal.h
> +++ b/arch/powerpc/include/asm/opal.h
> @@ -209,6 +209,11 @@ int64_t opal_flash_write(uint64_t id, uint64_t offset, uint64_t buf,
>   uint64_t size, uint64_t token);
>  int64_t opal_flash_erase(uint64_t id, uint64_t offset, uint64_t size,
>   uint64_t token);
> +int64_t opal_get_device_tree(uint32_t phandle, uint64_t buf, uint64_t len);
> +int64_t opal_pci_get_presence_state(uint64_t id, uint64_t data);
> +int64_t opal_pci_get_power_state(uint64_t id, uint64_t data);
> +int64_t opal_pci_set_power_state(uint64_t id, uint64_t data);
> +int64_t opal_pci_poll2(uint64_t id, uint64_t data);
>  
>  /* Internal functions */
>  extern int early_init_dt_scan_opal(unsigned long node, const char *uname,
> diff --git a/arch/powerpc/include/asm/pnv-pci.h b/arch/powerpc/include/asm/pnv-pci.h
> index c607902..8db7439 100644
> --- a/arch/powerpc/include/asm/pnv-pci.h
> +++ b/arch/powerpc/include/asm/pnv-pci.h
> @@ -17,6 +17,13 @@
>  #define PCI_SLOT_ID(phb_id, bdfn)\
>   (PCI_SLOT_ID_PREFIX | ((uint64_t)(bdfn) << 16) | (phb_id))
>  
> +extern int pnv_pci_get_device_tree(uint32_t phandle, void *buf, uint64_t len);
> +extern int pnv_pci_get_presence_state(uint64_t id, uint8_t *state);
> +extern int pnv_pci_get_power_state(uint64_t id, uint8_t *state);
> +extern int pnv_pci_set_power_state(uint64_t id, uint8_t state);
> +extern int pnv_pci_hotplug_notifier_register(struct notifier_block *nb);
> +extern int pnv_pci_hotplug_notifier_unregister(struct notifier_block *nb);
> +
>  int pnv_phb_to_cxl_mode(struct pci_dev *dev, uint64_t mode);
>  int pnv_cxl_ioda_msi_setup(struct pci_dev *dev, unsigned int hwirq,
>  unsigned int virq);
> diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S
> index e45b88a..60397d2 100644
> --- a/arch/powerpc/platforms/powernv/opal-wrappers.S
> +++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
> @@ -302,3 +302,8 @@ OPAL_CALL(opal_prd_msg,   OPAL_PRD_MSG);
>  OPAL_CALL(opal_leds_get_ind, OPAL_LEDS_GET_INDICATOR);
>  

Re: [v10, 7/7] mmc: sdhci-of-esdhc: fix host version for T4240-R1.0-R2.0

2016-05-10 Thread Scott Wood
On Thu, 2016-05-05 at 13:10 +0200, Arnd Bergmann wrote:
> On Thursday 05 May 2016 09:41:32 Yangbo Lu wrote:
> > > -Original Message-
> > > From: Arnd Bergmann [mailto:a...@arndb.de]
> > > Sent: Thursday, May 05, 2016 4:32 PM
> > > To: linuxppc-dev@lists.ozlabs.org
> > > Cc: Yangbo Lu; linux-...@vger.kernel.org; devicet...@vger.kernel.org;
> > > linux-arm-ker...@lists.infradead.org; linux-ker...@vger.kernel.org;
> > > linux-...@vger.kernel.org; linux-...@vger.kernel.org; iommu@lists.linux-
> > > foundation.org; net...@vger.kernel.org; Mark Rutland;
> > > ulf.hans...@linaro.org; Russell King; Bhupesh Sharma; Joerg Roedel;
> > > Santosh Shilimkar; Yang-Leo Li; Scott Wood; Rob Herring; Claudiu Manoil;
> > > Kumar Gala; Xiaobo Xie; Qiang Zhao
> > > Subject: Re: [v10, 7/7] mmc: sdhci-of-esdhc: fix host version for T4240-
> > > R1.0-R2.0
> > > 
> > > On Thursday 05 May 2016 11:12:30 Yangbo Lu wrote:
> > > > IIRC, it is the same IP block as i.MX and Arnd's point is this won't
> > > > even compile on !PPC. It is things like this that prevent sharing the
> > > > driver.
> > 
> > The whole point of using the MMIO SVR instead of the PPC SPR is so that
> > it will work on ARM...  The guts driver should build on any platform as
> > long as OF is enabled, and if it doesn't find a node to bind to it will
> > return 0 for SVR, and the eSDHC driver will continue (after printing an
> > error that should be removed) without the ability to test for errata
> > based on SVR.
> 
> It feels like a bad design to have to come up with a different
> method for each SoC type here when they all do the same thing
> and want to identify some variant of the chip to do device
> specific quirks.
> 
> As far as I'm concerned, every driver in drivers/soc that needs to
> export a symbol to be used by a device driver is an indication that
> we don't have the right set of abstractions yet. There are cases
> that are not worth abstracting because the functionality is rather
> obscure and only a couple of drivers for one particular chip
> ever need it.
> 
> Finding out the version of the SoC does not look like this case.

I'm open to new ways of abstracting this, but can that please be discussed
after these patches are merged?  This patchset is fixing a problem, the
existing abstraction is unappealing and not widely adopted, a new abstraction
is not ready, and we're only touching code for our hardware.

Oh, and the existing abstraction isn't even "existing".  I don't see any
examples where soc_device is being used like this -- or even any way for a
driver (the one consuming the information, not the soc "driver") to get a
reference to the soc_device that's been registered short of searching for the
device object by name -- and you're asking for new functionality in
drivers/base/soc.c.

> > > I think the first four patches take care of building for ARM,
> > > but the problem remains if you want to enable COMPILE_TEST as
> > > we need for certain automated checking.
> > 
> > What specific problem is there with COMPILE_TEST?
> 
> COMPILE_TEST is solvable here and the way it is implemented in this
> case (selecting FSL_GUTS from the driver) indeed looks like it works
> correctly, but it's still awkward that this means building the
> SoC specific ID stuff into the vmlinux binary for any driver that
> uses something like that for a particular SoC.

Please keep in mind that this is a Freescale-specific driver... it's not as if
we're attaching this dependency to common SDHCI code.

> 
> > > > Dealing with Si revs is a common problem. We should have a
> > > > common solution. There is soc_device for this purpose.
> > > 
> > > Exactly. The last time this came up, I think we agreed to implement a
> > > helper using glob_match() on the soc_device strings. Unfortunately
> > > this hasn't happened then, but I'd still prefer that over yet another
> > > vendor-specific way of dealing with the generic issue.
> > 
> > soc_device would require encoding the SVR as a string and then decoding
> > the string, which is more complicated and error prone than having
> > platform-specific code test a platform-specific number. 
> 
> You already need to encode it as a string to register the soc_device,

No we don't, because we don't already register a soc_device on arm64 or ppc
(and it looks like whatever does get registered on at least some relevant
arm32 chips is not particularly useful).

> and the driver just needs to pass a glob string, so the only part that
> is missing is the generic function that takes the string from the
> driver and passes that to glob_match for the soc_device.

"just"

And what would the glob look like?

I'd rather not write kernel code as if it were a shell/Perl script.

> > And when would it get registered on arm64, which doesn't have
> > platform code?
> 
> Whenever the soc driver is loaded, as is the case now. The match
> function can return -EPROBE_DEFER if no SoC device is registered
> yet.

That's too late for some places where we need 

Re: [PATCH 2/2] Deduplicate the actual base page size code

2016-05-10 Thread Aneesh Kumar K.V
Balbir Singh  writes:

> On 11/05/16 04:09, Aneesh Kumar K.V wrote:
>> Balbir Singh  writes:
>> 
>>> Deduplicate to one function to compute the actual page size.
>>> Some additional warnings added for AP size as well.
>> 
>> 
>> This is getting changed in a cleanup series I am testing before posting.
>> The change from ap to psize needs more updates in the commit message.
>> 
>> commit 701e0d3dc33c93a97b825f403d58f6be99b89203
>> Author: Aneesh Kumar K.V 
>> Date:   Tue May 10 11:33:15 2016 +0530
>> 
>> powerpc/mm/radix/hugetlb: Add helper for finding page size from hstate
>> 
>> Use the helper instead of open coding the same at multiple places
>> 
>> Signed-off-by: Aneesh Kumar K.V 
>
> This version makes more sense. While we are at it, mind replacing psize with
> base_psize and asserting that base_psize is always 0 or 5.
>

With radix config there is no base/actual page size. The new function is
returning MMU_PAGE_* values and not the AP encoding.

-aneesh


[PATCH] powerpc/kvm: Fix build error on book3s_hv.c

2016-05-10 Thread Gavin Shan
When CONFIG_KVM_XICS is enabled, CPU_UP_PREPARE and other macros for
CPU states in linux/cpu.h are needed by arch/powerpc/kvm/book3s_hv.c.
Otherwise, build error as below is seen:

   gwshan@gwshan:~/sandbox/l$ make arch/powerpc/kvm/book3s_hv.o
:
   CC  arch/powerpc/kvm/book3s_hv.o
   arch/powerpc/kvm/book3s_hv.c: In function ‘kvmppc_cpu_notify’:
   arch/powerpc/kvm/book3s_hv.c:3072:7: error: ‘CPU_UP_PREPARE’ \
   undeclared (first use in this function)

This fixes the issue introduced by commit <6f3bb80944> ("KVM: PPC:
Book3S HV: kvmppc_host_rm_ops - handle offlining CPUs").

Signed-off-by: Gavin Shan 
---
 arch/powerpc/kvm/book3s_hv.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 4cd37b4..e20beae 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include <linux/cpu.h>
 #include 
 #include 
 #include 
-- 
2.1.0


Re: [PATCH 2/2] Deduplicate the actual base page size code

2016-05-10 Thread Balbir Singh


On 11/05/16 04:09, Aneesh Kumar K.V wrote:
> Balbir Singh  writes:
> 
>> Deduplicate to one function to compute the actual page size.
>> Some additional warnings added for AP size as well.
> 
> 
> This is getting changed in a cleanup series I am testing before posting.
> The change from ap to psize needs more updates in the commit message.
> 
> commit 701e0d3dc33c93a97b825f403d58f6be99b89203
> Author: Aneesh Kumar K.V 
> Date:   Tue May 10 11:33:15 2016 +0530
> 
> powerpc/mm/radix/hugetlb: Add helper for finding page size from hstate
> 
> Use the helper instead of open coding the same at multiple places
> 
> Signed-off-by: Aneesh Kumar K.V 

This version makes more sense. While we are at it, mind replacing psize with
base_psize and asserting that base_psize is always 0 or 5.

Balbir Singh.

[PATCH] powerpc: Add array bounds checking to crash_shutdown_handlers

2016-05-10 Thread Suraj Jitindar Singh
The array crash_shutdown_handles is an array of size CRASH_HANDLER_MAX+1
containing up to CRASH_HANDLER_MAX shutdown_handlers. It is assumed to
be NULL terminated, which it is under normal circumstances. Array
accesses in the functions crash_shutdown_unregister() and
default_machine_crash_shutdown() rely on this NULL termination property
when traversing this list and don't protect against out of bounds accesses.
If the NULL terminator were somehow overwritten these functions could
potentially access out of the bounds of the array.

Shrink the array to size CRASH_HANDLER_MAX and implement explicit array
bounds checking when accessing the elements of the
crash_shutdown_handles[] array in crash_shutdown_unregister() and
default_machine_crash_shutdown().

Signed-off-by: Suraj Jitindar Singh 
---
 arch/powerpc/kernel/crash.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/crash.c b/arch/powerpc/kernel/crash.c
index 2bb252c..3dc1fad 100644
--- a/arch/powerpc/kernel/crash.c
+++ b/arch/powerpc/kernel/crash.c
@@ -48,8 +48,8 @@ int crashing_cpu = -1;
 static int time_to_dump;
 
 #define CRASH_HANDLER_MAX 3
-/* NULL terminated list of shutdown handles */
-static crash_shutdown_t crash_shutdown_handles[CRASH_HANDLER_MAX+1];
+/* List of shutdown handles */
+static crash_shutdown_t crash_shutdown_handles[CRASH_HANDLER_MAX];
 static DEFINE_SPINLOCK(crash_handlers_lock);
 
 static unsigned long crash_shutdown_buf[JMP_BUF_LEN];
@@ -288,9 +288,14 @@ int crash_shutdown_unregister(crash_shutdown_t handler)
rc = 1;
} else {
/* Shift handles down */
-   for (; crash_shutdown_handles[i]; i++)
+   for (; i < (CRASH_HANDLER_MAX - 1); i++)
crash_shutdown_handles[i] =
crash_shutdown_handles[i+1];
+   /*
+* Reset last entry to NULL now that it has been shifted down,
+* this will allow new handles to be added here.
+*/
+   crash_shutdown_handles[i] = NULL;
rc = 0;
}
 
@@ -346,7 +351,7 @@ void default_machine_crash_shutdown(struct pt_regs *regs)
old_handler = __debugger_fault_handler;
__debugger_fault_handler = handle_fault;
crash_shutdown_cpu = smp_processor_id();
-   for (i = 0; crash_shutdown_handles[i]; i++) {
+   for (i = 0; i < CRASH_HANDLER_MAX && crash_shutdown_handles[i]; i++) {
if (setjmp(crash_shutdown_buf) == 0) {
/*
 * Insert syncs and delay to ensure
-- 
2.5.0
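
As an illustration of the approach described in this patch, the following is a
minimal userspace sketch of a fixed-size handler table with explicit bounds
checks and no NULL terminator slot. All names (HANDLER_MAX, handler_register()
and so on) and the sample handlers are hypothetical; locking and the
setjmp-based fault handling of the real code are left out.

```c
#include <stddef.h>

#define HANDLER_MAX 3

typedef void (*handler_t)(void);

/* Exactly HANDLER_MAX slots; unused slots are NULL. */
static handler_t handlers[HANDLER_MAX];

/* Sample handlers that leave a trace so the walk can be observed. */
static int calls;
static void sample_a(void) { calls += 1; }
static void sample_b(void) { calls += 10; }
static void sample_c(void) { calls += 100; }

/* Returns 0 on success, 1 if the table is full. */
static int handler_register(handler_t h)
{
	for (size_t i = 0; i < HANDLER_MAX; i++) {
		if (handlers[i] == NULL) {
			handlers[i] = h;
			return 0;
		}
	}
	return 1;
}

/* Returns 0 on success, 1 if the handler was not registered. */
static int handler_unregister(handler_t h)
{
	size_t i;

	if (h == NULL)
		return 1;

	for (i = 0; i < HANDLER_MAX; i++)
		if (handlers[i] == h)
			break;
	if (i == HANDLER_MAX)
		return 1;

	/* Shift later entries down, never reading past the last slot. */
	for (; i < HANDLER_MAX - 1; i++)
		handlers[i] = handlers[i + 1];
	/* The last slot has been shifted down, so free it for reuse. */
	handlers[i] = NULL;
	return 0;
}

/* Runs registered handlers; the index is checked before the slot is read. */
static int handlers_run(void)
{
	int n = 0;

	for (size_t i = 0; i < HANDLER_MAX && handlers[i]; i++) {
		handlers[i]();
		n++;
	}
	return n;
}
```

Testing the index before dereferencing the slot is what keeps the walk inside
the array once the terminator slot is gone.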


Re: [PATCH v2 4/8] powerpc: add io{read,write}64 accessors

2016-05-10 Thread Michael Ellerman
On Tue, 2016-05-10 at 18:50 +, Scott Wood wrote:

> On 05/09/2016 03:20 AM, Horia Ioan Geanta Neag wrote:

> > On 5/5/2016 6:37 PM, Horia Geantă wrote:

> > > This will allow device drivers to consistently use io{read,write}XX
> > > also for 64-bit accesses.
> > > 
> > > Signed-off-by: Horia Geantă 
> > 
> > It would be great if PPC maintainers could Ack this patch.
> > 
> > As stated in the cover letter: https://lkml.org/lkml/2016/5/5/340
> > I'd like to go with the whole patch set via cryptodev-2.6 tree.
> 
> It looks good to me.  Michael?

I didn't get the cover letter, or any of the rest of the series, so although I
saw the patch I had no context. And I didn't have time to chase it up.

At a glance it seems fine, so:

Acked-by: Michael Ellerman 

cheers


Re: [v2] powerpc/sstep.c - Fix emulation fall-through

2016-05-10 Thread Michael Ellerman
On Tue, 2016-16-02 at 06:31:53 UTC, Oliver O'Halloran wrote:
> There is a switch fallthrough in instr_analyze() which can cause
> an invalid instruction to be emulated as a different, valid,
> instruction. The rld* (opcode 30) case extracts a sub-opcode from
> bits 3:1 of the instruction word. However, the only valid values
> of this field are 001 and 000. These cases are correctly handled,
> but the others are not which causes execution to fall through
> into case 31.
> 
> Breaking out of the switch causes the instruction to be marked as
> unknown and allows the caller to deal with the invalid instruction
> in a manner consistent with other invalid instructions.
> 
> Signed-off-by: Oliver O'Halloran 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/ab66c8ca52f790d816e421d3b1

cheers
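
The class of bug fixed here can be modelled with a much simplified decoder.
The sketch below is not the sstep.c code: the enum values, the analyse()
function and the way sub-opcodes map to types are assumptions chosen only to
show why the break matters.

```c
/* Hypothetical instruction classes for the sketch. */
enum { OP_UNKNOWN = 0, OP_RLD_IMM, OP_RLD_REG, OP_X_FORM };

/*
 * Simplified model: opcode 30 only accepts sub-opcodes 0 and 1.
 * Any other sub-opcode must leave the type as OP_UNKNOWN.
 */
static int analyse(unsigned int opcode, unsigned int subop)
{
	int type = OP_UNKNOWN;

	switch (opcode) {
	case 30:
		if (subop == 0)
			type = OP_RLD_IMM;
		else if (subop == 1)
			type = OP_RLD_REG;
		/*
		 * Without this break, an invalid sub-opcode would fall
		 * through and be decoded by the case 31 handler.
		 */
		break;
	case 31:
		type = OP_X_FORM;
		break;
	}

	return type;
}
```

A caller that sees OP_UNKNOWN can then treat the word like any other invalid
instruction instead of emulating an unrelated one.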

Re: powerpc: Fix sstep compile on powerpcspe

2016-05-10 Thread Michael Ellerman
On Thu, 2016-05-05 at 20:44:44 UTC, Lennart Sorensen wrote:
> powerpc: Fix sstep compile on powerpcspe
> 
> Commit be96f63375a14ee8e690856ac77e579c75bd0bae introduced ldarx and stdcx
> into the instructions in sstep.c, which are not accepted by the assembler
> on powerpcspe, but do seem to be accepted by the normal powerpc assembler
> even in 32 bit mode.
> 
> Wrap these two instructions in a __powerpc64__ check like it is everywhere
> else in the file.
> 
> Fixes: be96f63375a1 ("powerpc: Split out instruction analysis part of emulate_step()")
> Signed-off-by: Len Sorensen 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/03be7e53dee606c189a66a7389

cheers

Re: [kernel, v4, 02/11] vfio/spapr: Relax the IOMMU compatibility check

2016-05-10 Thread Michael Ellerman
On Fri, 2016-29-04 at 08:55:15 UTC, Alexey Kardashevskiy wrote:
> We are going to have multiple different types of PHB on the same system
> with POWER8 + NVLink and PHBs will have different IOMMU ops. However
> we only really care about one callback - create_table - so we can
> relax the compatibility check here.
> 
> Signed-off-by: Alexey Kardashevskiy 
> Reviewed-by: David Gibson 
> Acked-by: Alex Williamson 

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/45295687a90a31135ab575803e

cheers

Re: cxl: Remove duplicate #defines

2016-05-10 Thread Michael Ellerman
On Wed, 2016-04-05 at 04:48:32 UTC, Ian Munsie wrote:
> From: Ian Munsie 
> 
> These defines are not used, but other equivalent definitions
> (CXL_SPA_SW_CMD_*) are used. Remove the unused defines.
> 
> Signed-off-by: Ian Munsie 
> Reviewed-by: Andrew Donnellan 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/29659b682c47986d7d4e6206d9

cheers

Re: cxl: Handle num_of_processes larger than can fit in the SPA

2016-05-10 Thread Michael Ellerman
On Wed, 2016-04-05 at 04:46:30 UTC, Ian Munsie wrote:
> From: Ian Munsie 
> 
> num_of_process is a 16 bit field, theoretically allowing an AFU to
> support 16K processes, however the scheduled process area currently has
> a maximum size of 1MB, which limits the maximum number of processes to
> 7704.
> 
> Some AFUs may not necessarily care what the limit is and just want to be
> able to use the maximum by setting the field to 16K. To allow these to
> work, detect this situation and use the maximum size for the SPA.
> 
> Downgrade the WARN_ON to a dev_warn.
> 
> Signed-off-by: Ian Munsie 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/eb8724b8dc99462a8fd1fa5734

cheers
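
The behaviour described in this commit message amounts to clamping an
oversized request to what the SPA can hold. A minimal sketch, assuming the
7704 limit quoted above and using a hypothetical function name; the real
driver sizes the SPA in pages and warns via dev_warn, which is only modelled
here by the `clamped` flag.

```c
/* Limit quoted in the commit message for a 1MB scheduled process area. */
#define SPA_MAX_PROCESSES 7704u

/*
 * Hypothetical helper: returns the number of processes to size the SPA
 * for, and sets *clamped when the request exceeded the limit (the point
 * at which a driver would log a warning rather than fail).
 */
static unsigned int spa_processes(unsigned int requested, int *clamped)
{
	*clamped = requested > SPA_MAX_PROCESSES;
	return *clamped ? SPA_MAX_PROCESSES : requested;
}
```

An AFU that simply asks for 16K processes would therefore get the maximum
supported size instead of an error.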

Re: cxl: Ensure PSL interrupt is configured for contexts with no AFU IRQs

2016-05-10 Thread Michael Ellerman
On Wed, 2016-04-05 at 04:52:58 UTC, Ian Munsie wrote:
> From: Ian Munsie 
> 
> In the cxl kernel API, it is possible to create a context and start it
> without allocating any interrupts. Since we assign or allocate the PSL
> interrupt when allocating AFU interrupts this will lead to a situation
> where we start the context with no means to take any faults.
> 
> The user API is not affected as it always goes through the cxl interrupt
> allocation code paths and will have the PSL interrupt allocated or
> assigned, even if no AFU interrupts were requested.
> 
> This checks that at least one interrupt is configured at the time of
> attach, and if not it will assign the multiplexed PSL interrupt for
> powernv, or allocate a single interrupt for PowerVM.
> 
> Signed-off-by: Ian Munsie 
> Reviewed-by: Frederic Barrat 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/b75d94509921cb6d9f475d7a85

cheers

Re: [V2] cxl: Check periodically the coherent platform function's state

2016-05-10 Thread Michael Ellerman
On Fri, 2016-22-04 at 13:39:22 UTC, Christophe Lombard wrote:
> In the PowerVM environment, the PHYP CoherentAccel component manages
> the state of the Coherent Accelerator Processor Interface adapter and
> virtualizes CAPI resources, handles CAPP, PSL, PSL Slice errors - and
> interrupts - and provides a new set of hcalls for the OS APIs to utilize
> Accelerator Function Unit (AFU).
> 
> During the course of operation, a coherent platform function can
> encounter errors. Some possible reasons for errors are:
> • Hardware recoverable and unrecoverable errors
> • Transient and over-threshold correctable errors
> 
> PHYP implements its own state model for the coherent platform function.
> The state of the AFU is available through a hcall.
> 
> The current implementation of the cxl driver, for the PowerVM
> environment, checks this state of the AFU only when an action is
> requested - open a device, ioctl command, memory map, attach/detach a
> process - from an external driver - cxlflash, libcxl. If an error is
> detected, the cxl driver handles the error according to the content of the
> Power Architecture Platform Requirements document.
> 
> But in case of low-level troubles (or error injection), the PHYP
> component may reset the card and change the AFU state. The PHYP
> interface doesn't provide any way to be notified when that happens, which
> implies that the cxl driver:
> • cannot immediately handle the state change of the AFU.
> • cannot notify other drivers (cxlflash, ...)
> 
> The purpose of this patch is to wake up the cpu periodically to check
> the current state of each AFU and to see if we need to enter an error
> recovery path.
> 
> Signed-off-by: Christophe Lombard 
> Acked-by: Ian Munsie 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/6afa221da4fc9bdf6ba2cf7fa8

cheers

Re: [v2] cxl: Add kernel API to allow a context to operate with relocate disabled

2016-05-10 Thread Michael Ellerman
On Fri, 2016-06-05 at 07:46:36 UTC, Ian Munsie wrote:
> From: Ian Munsie 
> 
> cxl devices typically access memory using an MMU in much the same way as
> the CPU, and each context includes a state register much like the MSR in
> the CPU. Like the CPU, the state register includes a bit to enable
> relocation, which we currently always enable.
> 
> In some cases, it may be desirable to allow a device to access memory
> using real addresses instead of effective addresses, so this adds a new
> API, cxl_set_translation_mode, that can be used to disable relocation
> on a given kernel context. This can allow for the creation of a special
> privileged context that the device can use if it needs relocation
> disabled, and can use regular contexts at times when it needs relocation
> enabled.
> 
> This interface is only available to users of the kernel API for obvious
> reasons, and will never be supported in a virtualised environment.
> 
> This will be used by the upcoming cxl support in the mlx5 driver.
> 
> Signed-off-by: Ian Munsie 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/9bc8ba0e5d59a84e582004e201

cheers

Re: [v2, 2/3] powernv: Rename pSeries to powernv from machine_check_pSeries_early.

2016-05-10 Thread Michael Ellerman
On Tue, 2016-01-03 at 05:47:46 UTC, Mahesh Salgaonkar wrote:
> From: Mahesh Salgaonkar 
> 
> The routine machine_check_pSeries_early() is only used on powernv, not
> pseries. Hence rename machine_check_pSeries_early to
> machine_check_powernv_early.
> 
> Reported-by: Paul Mackerras 
> Signed-off-by: Mahesh Salgaonkar 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/d389b7082652871c62da021c11

cheers

Re: [2/2] powerpc/mm: Improve readability of update_mmu_cache()

2016-05-10 Thread Michael Ellerman
On Fri, 2016-26-02 at 00:26:26 UTC, Gavin Shan wrote:
> The function is used to update the MMU with software PTE. It can
> be called by data access exception handler (0x300) or instruction
> access exception handler (0x400). If the function is called by
> 0x400 handler, the local variable @access is set to _PAGE_EXEC
> to indicate the software PTE should have that flag set. When the
> function is called by 0x300 handler, @access is set to zero.
> 
> This improves the readability of the function by replacing if
> statements with switch. No logical changes introduced.
> 
> Signed-off-by: Gavin Shan 
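
For readers following along, here is a minimal sketch of the if-to-switch rewrite the commit message describes. It is not the patch itself: `access_for_trap` and the `PAGE_EXEC_FLAG` value are hypothetical stand-ins for the kernel's local variable handling and `_PAGE_EXEC`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's _PAGE_EXEC; the real value differs. */
#define PAGE_EXEC_FLAG 0x1UL

/*
 * Map the exception vector to the access flags for the software PTE.
 * Returns false for any trap other than 0x300 or 0x400, in which case
 * *access is left untouched.
 */
static bool access_for_trap(unsigned long trap, unsigned long *access)
{
	switch (trap) {
	case 0x300:	/* data access exception */
		*access = 0UL;
		return true;
	case 0x400:	/* instruction access exception */
		*access = PAGE_EXEC_FLAG;
		return true;
	default:
		return false;
	}
}
```

The switch makes the two handled vectors explicit and gives every other trap one obvious exit path.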

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/f3ad731cc08cc4fd3855f25c36

cheers

Re: [1/5] selftests/powerpc: Make reg.h common to all powerpc selftests

2016-05-10 Thread Michael Ellerman
On Wed, 2015-23-12 at 05:49:50 UTC, Rashmica Gupta wrote:
> Currently there is a reg.h in pmu/ebb that has defines that are useful in
> other powerpc selftests so move this up into selftests/powerpc folder. Also
> include in utils.h - as this is often used in self tests. Add in some other
> useful register defines.

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/5263230effb72fd27d7d9340cc

cheers

Re: [v2,1/2] powerpc/mm: define TOP_ZONE as a constant

2016-05-10 Thread Michael Ellerman
On Thu, 2016-05-05 at 07:54:08 UTC, Oliver O'Halloran wrote:
> The zone that contains the top of memory will be either ZONE_NORMAL
> or ZONE_HIGHMEM depending on the kernel config. There are two functions
> that require this information and both of them use an #ifdef to set
> a local variable (top_zone). This is a little silly so let's just make it
> a constant.
> 
> Signed-off-by: Oliver O'Halloran 
> Cc: linux...@kvack.org
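
A small sketch of the idea in the commit message, assuming the constant is selected by the usual `CONFIG_HIGHMEM` config symbol. The `zone_id` enum and `is_top_zone` helper are hypothetical and stand in for the kernel's `enum zone_type` and the two callers.

```c
#include <assert.h>

/* Hypothetical zone identifiers standing in for the kernel's enum zone_type. */
enum zone_id { ZONE_NORMAL_ID, ZONE_HIGHMEM_ID };

/* Chosen once at compile time instead of via an #ifdef in each function. */
#ifdef CONFIG_HIGHMEM
#define TOP_ZONE ZONE_HIGHMEM_ID
#else
#define TOP_ZONE ZONE_NORMAL_ID
#endif

/* Callers can compare against the constant directly. */
static int is_top_zone(enum zone_id zone)
{
	return zone == TOP_ZONE;
}
```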

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/d69777dbefd707974aed91918d

cheers

Re: [v9,01/26] powerpc/pci: Cleanup on struct pci_controller_ops

2016-05-10 Thread Michael Ellerman
On Tue, 2016-03-05 at 05:41:20 UTC, Gavin Shan wrote:
> Each PHB has one instance of "struct pci_controller_ops" that includes
> various callbacks called by PCI subsystem. In the definition of this
> struct, some callbacks have explicit names for their arguments, but the
> rest don't.
> 
> This adds all explicit names of the arguments to the callbacks in
> "struct pci_controller_ops" so that the code looks consistent. Also,
> argument name @dev is replaced by @pdev as the latter is the
> preferred name for PCI device.
> 
> Signed-off-by: Gavin Shan 
> Reviewed-by: Daniel Axtens 
> Reviewed-by: Andrew Donnellan 
> Reviewed-by: Alexey Kardashevskiy 

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/ec73f9a8dfdfce0112ee041705

cheers

Re: softlockup with 4.6.0-rc3-00130-g4d2a14c

2016-05-10 Thread Balbir Singh
On 11 May 2016 01:05, "Aneesh Kumar K.V" 
wrote:
>
>
> I am finding the below softlockups with kvm guest. This is using the
> same version of kernel for host and guest.
>
> [  323.547841] NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s!
[systemd-timesyn:3116]
> [  323.548023] Modules linked in:
> [  323.548029] CPU: 7 PID: 3116 Comm: systemd-timesyn Not tainted
4.6.0-rc3-00130-g4d2a14c #2
> [  323.548031] task: c00038b16d00 ti: c0003baac000 task.ti:
c0003baac000
> [  323.548032] NIP: c005b404 LR: c0934c68 CTR:
c0099650
> [  323.548033] REGS: c0003baaf9d0 TRAP: 0901   Not tainted
(4.6.0-rc3-00130-g4d2a14c)
> [  323.548034] MSR: 80009033   CR:
48002844  XER: 
> [  323.548040] CFAR: c0934c64 SOFTE: 1
>GPR00: c0934c68 c0003baafc50 c0db3f00
c0e7e978
>GPR04: 0001 81d0 000bdb0f5a4e

>GPR08: c0e207b8 0002 8001

>GPR12: c001cfb0 cfe01c00
> [  323.548055] NIP [c005b404] __spin_yield+0x14/0xa0
> [  323.548059] LR [c0934c68] _raw_spin_lock_irqsave+0x118/0x120
> [  323.548060] Call Trace:
> [  323.548062] [c0003baafc50] [c0934c68]
_raw_spin_lock_irqsave+0x118/0x120 (unreliable)
> [  323.548065] [c0003baafc90] [c0139a6c]
do_adjtimex+0x9c/0x1c0
> [  323.548068] [c0003baafd00] [c013238c]
posix_clock_realtime_adj+0x1c/0x30
> [  323.548070] [c0003baafd20] [c0133920]
SyS_clock_adjtime+0xa0/0x150
> [  323.548073] [c0003baafe30] [c0009260]
system_call+0x38/0x108
> [  323.548074] Instruction dump:
> [  323.548075] eba1ffe8 eb81ffe0 eb61ffd8 4e800020 6000 6000
6000 3c4c00d6
> [  323.548078] 38428b10 8143 2faa 4d9e0020 <79490420> 2b8907ff
79290020 7d101026
>
>
> 
>
> [   21.926941] INFO: rcu_sched self-detected stall on CPU
> [   21.931553]  7-...: (2098 ticks this GP) idle=9b3/141/0
softirq=204/267 fqs=2097
> [   21.931601]   (t=2100 jiffies g=-249 c=-250 q=23178)
> [   21.931751] Task dump for CPU 7:
> [   21.931755] systemd R  running task 9872 1  0
0x00040004
> [   21.931763] Call Trace:
> [   21.931773] [c0003e503630] [c00e783c]
sched_show_task+0xec/0x180 (unreliable)
> [   21.931779] [c0003e5036a0] [c0123504]
rcu_dump_cpu_stacks+0xe4/0x150
> [   21.931783] [c0003e5036f0] [c0128214]
rcu_check_callbacks+0x6b4/0x9c0
> [   21.931804] [c0003e503810] [c012ec7c]
update_process_times+0x4c/0xa0
> [   21.931809] [c0003e503840] [c0143828]
tick_sched_handle.isra.5+0x28/0xb0
> [   21.931812] [c0003e503870] [c014390c]
tick_sched_timer+0x5c/0xd0
> [   21.931816] [c0003e5038b0] [c012f528]
__hrtimer_run_queues+0xf8/0x380
> [   21.931819] [c0003e503930] [c01303e0]
hrtimer_interrupt+0xe0/0x2b0
> [   21.931823] [c0003e5039f0] [c001d57c]
__timer_interrupt+0x8c/0x270
> [   21.931826] [c0003e503a40] [c001dc5c]
timer_interrupt+0x9c/0xe0
> [   21.931830] [c0003e503a70] [c0002750]
decrementer_common+0x150/0x180
> [   21.931834] --- interrupt: 901 at ktime_get_ts64+0xf0/0x150
>LR = ktime_get_ts64+0x74/0x150
> [   21.931836] [c0003e503d60] []   (null)
(unreliable)
> [   21.931841] [c0003e503da0] [c029fa38]
poll_select_set_timeout+0x78/0xd0
> [   21.931844] [c0003e503de0] [c02a1020] SyS_poll+0x80/0x150
> [   21.931847] [c0003e503e30] [c0009260]
system_call+0x38/0x108
> [   24.006941] NMI watchdog: BUG: soft lockup - CPU#7 stuck for 21s!
[systemd:1]
> [   24.007117] Modules linked in:
> [   24.007122] CPU: 7 PID: 1 Comm: systemd Not tainted
4.6.0-rc3-00130-g4d2a14c #1
> [   24.007123] task: c0003e4c ti: c0003e50 task.ti:
c0003e50
> [   24.007125] NIP: c0137400 LR: c0137384 CTR:
c001cfb0
> [   24.007126] REGS: c0003e503ae0 TRAP: 0901   Not tainted
(4.6.0-rc3-00130-g4d2a14c)
> [   24.007126] MSR: 80009033   CR:
28424844  XER: 2000
> [   24.007132] CFAR: c0137414 SOFTE: 1
>GPR00: c029fa38 c0003e503d60 c0db3a00
0025ff39
>GPR04: a8ce0e65 ac491cb5c5ec 5731f19b

>GPR08: 3b9ac9ff 2af484699eac9820 93054a12

>GPR12: c001cfb0 cfe01c00
> [   24.007141] NIP [c0137400] ktime_get_ts64+0xf0/0x150
> [   24.007143] LR [c0137384] ktime_get_ts64+0x74/0x150
> [   24.007143] Call Trace:
> [   24.007145] [c0003e503da0] [c029fa38]
poll_select_set_timeout+0x78/0xd0
> [   24.007146] [c0003e503de0] 

on a MPC8360 system, how can i read the *actual* bus frequencies?

2016-05-10 Thread Robert P. J. Day

  bit of a conundrum here ... we have a legacy MPC8360 system here, on
which we installed linux built with wind river linux 8. we obviously
want to push the various bus frequencies to their max for best
performance, and the device tree that was being used for this system
assigned rather slow speeds (266MHz) for the various buses.

  i wasn't sure how to view the bus frequencies that were *actually*
being used. first, i thought that anything you found under
/proc/device-tree simply showed the device tree values as they were
passed to the kernel, so i wasn't going to trust those.

  also, i thought that anything under /sys would show the genuine
frequency values and, after searching, i found various PPC bus
frequencies under /sys/firmware/..., but they *also* showed fairly
slow speeds.

  finally, someone wrote a program that read directly from the
system registers:

tempVal = *M83XX_SPMR(vxCCSBARGet());
lbcm   = M83XX_SPMR_LBCM_VAL(tempVal);
ddrcm = M83XX_SPMR_DDRCM_VAL(tempVal);
spmf   = M83XX_SPMR_SPMF_VAL(tempVal);
clkDiv  = M83XX_SPMR_CLKID_VAL(tempVal);
corePll = M83XX_SPMR_COREPLL_VAL(tempVal);
cepdf  = M83XX_SPMR_CEPDF_VAL(tempVal);
cepmf  = M83XX_SPMR_CEPMF_VAL(tempVal);

which, surprisingly, showed what appears to be the maximum allowable
bus frequencies; for example

  Enter mcd command -> sysGetCoreSpeed
  value = 533332800 = 0x1fca0340

so i'm willing to believe that the system really is running at max
speed, but is there no easier way to see the bus frequencies that are
actually in use, rather than having to dig into the system registers?

  why would the values under /sys not reflect the actual bus
frequencies, and not (as it appears) just the ones passed to the
kernel which were obviously not used?
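
For comparison with the register dump above, here is a rough sketch of reading a frequency property straight from the device tree files. It assumes only that property values are stored as 4-byte big-endian cells. The helper names are made up, and any property path passed to it (a `bus-frequency` file under /proc/device-tree, say) is an assumption about the particular board. Note that this shows the values handed to the kernel, not the live PLL configuration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Convert one 4-byte big-endian device-tree cell to a host integer. */
static uint32_t dt_cell_to_u32(const unsigned char cell[4])
{
	return ((uint32_t)cell[0] << 24) | ((uint32_t)cell[1] << 16) |
	       ((uint32_t)cell[2] << 8) | (uint32_t)cell[3];
}

/*
 * Read the first cell of a property file. Returns 0 on success, -1 if the
 * file cannot be opened or holds fewer than 4 bytes.
 */
static int dt_read_u32(const char *path, uint32_t *out)
{
	unsigned char cell[4];
	FILE *f = fopen(path, "rb");

	if (!f)
		return -1;
	if (fread(cell, 1, sizeof(cell), f) != sizeof(cell)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	*out = dt_cell_to_u32(cell);
	return 0;
}
```

Feeding the bytes 1f ca 03 40 through `dt_cell_to_u32()` gives the same number as the sysGetCoreSpeed output, which makes it easy to compare the two sources.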

rday

-- 


Robert P. J. Day Ottawa, Ontario, CANADA
http://crashcourse.ca

Twitter:   http://twitter.com/rpjday
LinkedIn:   http://ca.linkedin.com/in/rpjday



Re: [PATCH v2 4/8] powerpc: add io{read,write}64 accessors

2016-05-10 Thread Scott Wood
On 05/09/2016 03:20 AM, Horia Ioan Geanta Neag wrote:
> On 5/5/2016 6:37 PM, Horia Geantă wrote:
>> This will allow device drivers to consistently use io{read,write}XX
>> also for 64-bit accesses.
>>
>> Signed-off-by: Horia Geantă 
> 
> It would be great if PPC maintainers could Ack this patch.
> 
> As stated in the cover letter: https://lkml.org/lkml/2016/5/5/340
> I'd like to go with the whole patch set via cryptodev-2.6 tree.

It looks good to me.  Michael?

-Scott
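
As a side note on what the two read variants differ in, here is a userspace sketch of the byte-order distinction. It assumes the plain accessor follows the usual little-endian convention of readq and the _be variant is big-endian. The helpers are hypothetical and work on an ordinary byte buffer, not on an `__iomem` pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Assemble a 64-bit value from 8 bytes stored least-significant byte first. */
static uint64_t load_le64(const unsigned char *p)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Assemble a 64-bit value from 8 bytes stored most-significant byte first. */
static uint64_t load_be64(const unsigned char *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}
```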


> 
> Thanks,
> Horia
> 
>> ---
>>  arch/powerpc/kernel/iomap.c | 24 
>>  1 file changed, 24 insertions(+)
>>
>> diff --git a/arch/powerpc/kernel/iomap.c b/arch/powerpc/kernel/iomap.c
>> index 12e48d56f771..3963f0b68d52 100644
>> --- a/arch/powerpc/kernel/iomap.c
>> +++ b/arch/powerpc/kernel/iomap.c
>> @@ -38,6 +38,18 @@ EXPORT_SYMBOL(ioread16);
>>  EXPORT_SYMBOL(ioread16be);
>>  EXPORT_SYMBOL(ioread32);
>>  EXPORT_SYMBOL(ioread32be);
>> +#ifdef __powerpc64__
>> +u64 ioread64(void __iomem *addr)
>> +{
>> +return readq(addr);
>> +}
>> +u64 ioread64be(void __iomem *addr)
>> +{
>> +return readq_be(addr);
>> +}
>> +EXPORT_SYMBOL(ioread64);
>> +EXPORT_SYMBOL(ioread64be);
>> +#endif /* __powerpc64__ */
>>  
>>  void iowrite8(u8 val, void __iomem *addr)
>>  {
>> @@ -64,6 +76,18 @@ EXPORT_SYMBOL(iowrite16);
>>  EXPORT_SYMBOL(iowrite16be);
>>  EXPORT_SYMBOL(iowrite32);
>>  EXPORT_SYMBOL(iowrite32be);
>> +#ifdef __powerpc64__
>> +void iowrite64(u64 val, void __iomem *addr)
>> +{
>> +writeq(val, addr);
>> +}
>> +void iowrite64be(u64 val, void __iomem *addr)
>> +{
>> +writeq_be(val, addr);
>> +}
>> +EXPORT_SYMBOL(iowrite64);
>> +EXPORT_SYMBOL(iowrite64be);
>> +#endif /* __powerpc64__ */
>>  
>>  /*
>>   * These are the "repeat read/write" functions. Note the
>>
> 
> 


Re: [PATCH 1/2] Fix .long's in mm/tlb-radix.c to more meaningful

2016-05-10 Thread Aneesh Kumar K.V
Balbir Singh  writes:

> The .longs with the shifts are hard to read; use more
> meaningful names for the opcodes. PPC_TLBIE_5 is introduced
> for the 5-operand variant of the instruction because a macro
> already exists for the 2-operand variant.
>
> Signed-off-by: Balbir Singh 

Reviewed-by: Aneesh Kumar K.V 
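
To see that the named-field macros produce the same instruction word as the open-coded shifts, here is a standalone sketch. The opcode values and field positions are taken from the patch below; the function names are invented for illustration, and unlike the kernel macros these run in C rather than in the assembler.

```c
#include <assert.h>
#include <stdint.h>

#define INST_TLBIE  0x7c000264u	/* PPC_INST_TLBIE in the patch */
#define INST_TLBIEL 0x7c000224u	/* PPC_INST_TLBIEL in the patch */

/* Field placement copied from the ___PPC_* helpers in the patch. */
static uint32_t field_rb(uint32_t rb)   { return (rb & 0x1f) << 11; }
static uint32_t field_r(uint32_t r)     { return (r & 0x1) << 16; }
static uint32_t field_prs(uint32_t prs) { return (prs & 0x1) << 17; }
static uint32_t field_ric(uint32_t ric) { return (ric & 0x3) << 18; }
static uint32_t field_rs(uint32_t rs)   { return (rs & 0x1f) << 21; }

/* Macro-style encoding: named fields OR-ed into the base opcode. */
static uint32_t encode_named(uint32_t base, uint32_t rb, uint32_t rs,
			     uint32_t ric, uint32_t prs, uint32_t r)
{
	return base | field_rb(rb) | field_rs(rs) |
	       field_ric(ric) | field_prs(prs) | field_r(r);
}

/* Open-coded form that the patch removes, operands in its %0..%4 order. */
static uint32_t encode_open_coded(uint32_t base, uint32_t rb, uint32_t r,
				  uint32_t prs, uint32_t ric, uint32_t rs)
{
	return base | (rb << 11) | (r << 16) | (prs << 17) |
	       (ric << 18) | (rs << 21);
}
```

The argument order (rb, rs, ric, prs, r) mirrors the `PPC_TLBIEL(%0, %4, %3, %2, %1)` call sites, where %0 is rb, %1 is r, %2 is prs, %3 is ric and %4 is rs.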

> ---
>  arch/powerpc/include/asm/ppc-opcode.h | 14 ++
>  arch/powerpc/mm/tlb-radix.c   | 13 +
>  2 files changed, 19 insertions(+), 8 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/ppc-opcode.h 
> b/arch/powerpc/include/asm/ppc-opcode.h
> index 1d035c1..c0e9ea4 100644
> --- a/arch/powerpc/include/asm/ppc-opcode.h
> +++ b/arch/powerpc/include/asm/ppc-opcode.h
> @@ -184,6 +184,7 @@
>  #define PPC_INST_STSWX   0x7c00052a
>  #define PPC_INST_STXVD2X 0x7c000798
>  #define PPC_INST_TLBIE   0x7c000264
> +#define PPC_INST_TLBIEL  0x7c000224
>  #define PPC_INST_TLBILX  0x7c24
>  #define PPC_INST_WAIT 0x7c7c
>  #define PPC_INST_TLBIVAX 0x7c000624
> @@ -257,6 +258,9 @@
>  #define ___PPC_RB(b) (((b) & 0x1f) << 11)
>  #define ___PPC_RS(s) (((s) & 0x1f) << 21)
>  #define ___PPC_RT(t) ___PPC_RS(t)
> +#define ___PPC_R(r)  (((r) & 0x1) << 16)
> +#define ___PPC_PRS(prs)  (((prs) & 0x1) << 17)
> +#define ___PPC_RIC(ric)  (((ric) & 0x3) << 18)
>  #define __PPC_RA(a)  ___PPC_RA(__REG_##a)
>  #define __PPC_RA0(a) ___PPC_RA(__REGA0_##a)
>  #define __PPC_RB(b)  ___PPC_RB(__REG_##b)
> @@ -321,6 +325,16 @@
>   __PPC_WC(w))
>  #define PPC_TLBIE(lp,a)  stringify_in_c(.long PPC_INST_TLBIE | \
>  ___PPC_RB(a) | ___PPC_RS(lp))
> +#define  PPC_TLBIE_5(rb,rs,ric,prs,r) \
> + stringify_in_c(.long PPC_INST_TLBIE | \
> + ___PPC_RB(rb) | ___PPC_RS(rs) | \
> + ___PPC_RIC(ric) | ___PPC_PRS(prs) | \
> + ___PPC_R(r))
> +#define  PPC_TLBIEL(rb,rs,ric,prs,r) \
> + stringify_in_c(.long PPC_INST_TLBIEL | \
> + ___PPC_RB(rb) | ___PPC_RS(rs) | \
> + ___PPC_RIC(ric) | ___PPC_PRS(prs) | \
> + ___PPC_R(r))
>  #define PPC_TLBSRX_DOT(a,b)  stringify_in_c(.long PPC_INST_TLBSRX_DOT | \
>   __PPC_RA0(a) | __PPC_RB(b))
>  #define PPC_TLBIVAX(a,b) stringify_in_c(.long PPC_INST_TLBIVAX | \
> diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
> index 0fdaf93..e6b7487 100644
> --- a/arch/powerpc/mm/tlb-radix.c
> +++ b/arch/powerpc/mm/tlb-radix.c
> @@ -12,6 +12,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include 
>  #include 
> @@ -30,8 +31,7 @@ static inline void __tlbiel_pid(unsigned long pid, int set)
>   ric = 2;  /* invalidate all the caches */
>
>   asm volatile("ptesync": : :"memory");
> - asm volatile(".long 0x7c000224 | (%0 << 11) | (%1 << 16) |"
> -  "(%2 << 17) | (%3 << 18) | (%4 << 21)"
> + asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
>: : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : 
> "memory");
>   asm volatile("ptesync": : :"memory");
>  }
> @@ -60,8 +60,7 @@ static inline void _tlbie_pid(unsigned long pid)
>   ric = 2;  /* invalidate all the caches */
>
>   asm volatile("ptesync": : :"memory");
> - asm volatile(".long 0x7c000264 | (%0 << 11) | (%1 << 16) |"
> -  "(%2 << 17) | (%3 << 18) | (%4 << 21)"
> + asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
>: : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : 
> "memory");
>   asm volatile("eieio; tlbsync; ptesync": : :"memory");
>  }
> @@ -79,8 +78,7 @@ static inline void _tlbiel_va(unsigned long va, unsigned 
> long pid,
>   ric = 0;  /* no cluster flush yet */
>
>   asm volatile("ptesync": : :"memory");
> - asm volatile(".long 0x7c000224 | (%0 << 11) | (%1 << 16) |"
> -  "(%2 << 17) | (%3 << 18) | (%4 << 21)"
> + asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
>: : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : 
> "memory");
>   asm volatile("ptesync": : :"memory");
>  }
> @@ -98,8 +96,7 @@ static inline void _tlbie_va(unsigned long va, unsigned 
> long pid,
>   ric = 0;  /* no cluster flush yet */
>
>   asm volatile("ptesync": : :"memory");
> - asm volatile(".long 0x7c000264 | (%0 << 11) | (%1 << 16) |"
> -  "(%2 << 17) | (%3 << 18) | (%4 << 21)"
> + asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
>: : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : 
> "memory");
>   asm volatile("eieio; 

Re: [PATCH 2/2] Deduplicate the actual base page size code

2016-05-10 Thread Aneesh Kumar K.V
Balbir Singh  writes:

> Deduplicate to one function to compute the actual page size.
> Some additional warnings added for AP size as well.


This is getting changed in a cleanup series I am testing before posting.
The change from ap to psize needs more explanation in the commit message.
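
Roughly, the helper in the commit below boils down to the following mapping. This is a userspace sketch only: the enum values are hypothetical stand-ins for the MMU_PAGE_* indices, and the shifts are hard-coded (2M is 1 << 21, 1G is 1 << 30) instead of being read from mmu_psize_defs[].

```c
#include <assert.h>

/* Hypothetical page-size indices standing in for the kernel's MMU_PAGE_* values. */
enum psize_idx { PSIZE_BASE, PSIZE_2M, PSIZE_1G };

#define SHIFT_2M 21	/* 2 MiB == 1UL << 21 */
#define SHIFT_1G 30	/* 1 GiB == 1UL << 30 */

/*
 * Map a huge page shift to a page-size index. Unknown shifts fall back to
 * the base page size; the kernel helper additionally emits a WARN there.
 */
static enum psize_idx psize_from_shift(unsigned int shift)
{
	if (shift == SHIFT_2M)
		return PSIZE_2M;
	if (shift == SHIFT_1G)
		return PSIZE_1G;
	return PSIZE_BASE;
}
```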

commit 701e0d3dc33c93a97b825f403d58f6be99b89203
Author: Aneesh Kumar K.V 
Date:   Tue May 10 11:33:15 2016 +0530

powerpc/mm/radix/hugetlb: Add helper for finding page size from hstate

    Use the helper instead of open coding the same logic in multiple places

Signed-off-by: Aneesh Kumar K.V 

diff --git a/arch/powerpc/include/asm/book3s/64/hugetlb-radix.h 
b/arch/powerpc/include/asm/book3s/64/hugetlb-radix.h
index 60f47649306f..c45189aa7476 100644
--- a/arch/powerpc/include/asm/book3s/64/hugetlb-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/hugetlb-radix.h
@@ -11,4 +11,19 @@ extern unsigned long
 radix__hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
unsigned long len, unsigned long pgoff,
unsigned long flags);
+
+static inline int hstate_get_psize(struct hstate *hstate)
+{
+   unsigned long shift;
+
+   shift = huge_page_shift(hstate);
+   if (shift == mmu_psize_defs[MMU_PAGE_2M].shift)
+   return MMU_PAGE_2M;
+   else if (shift == mmu_psize_defs[MMU_PAGE_1G].shift)
+   return MMU_PAGE_1G;
+   else {
+   WARN(1, "Wrong huge page shift\n");
+   return mmu_virtual_psize;
+   }
+}
 #endif
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h 
b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
index 07b2e0031dad..68839e6adcf1 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
@@ -21,13 +21,13 @@ extern void radix__flush_tlb_kernel_range(unsigned long 
start, unsigned long end
 extern void radix__local_flush_tlb_mm(struct mm_struct *mm);
 extern void radix__local_flush_tlb_page(struct vm_area_struct *vma, unsigned 
long vmaddr);
 extern void radix__local_flush_tlb_page_psize(struct mm_struct *mm, unsigned 
long vmaddr,
- unsigned long ap);
+ int psize);
 extern void radix__tlb_flush(struct mmu_gather *tlb);
 #ifdef CONFIG_SMP
 extern void radix__flush_tlb_mm(struct mm_struct *mm);
 extern void radix__flush_tlb_page(struct vm_area_struct *vma, unsigned long 
vmaddr);
 extern void radix__flush_tlb_page_psize(struct mm_struct *mm, unsigned long 
vmaddr,
-   unsigned long ap);
+   int psize);
 #else
 #define radix__flush_tlb_mm(mm) radix__local_flush_tlb_mm(mm)
 #define radix__flush_tlb_page(vma,addr) radix__local_flush_tlb_page(vma,addr)
diff --git a/arch/powerpc/mm/hugetlbpage-radix.c 
b/arch/powerpc/mm/hugetlbpage-radix.c
index 0dfa1816f0c6..1eca0deaf89b 100644
--- a/arch/powerpc/mm/hugetlbpage-radix.c
+++ b/arch/powerpc/mm/hugetlbpage-radix.c
@@ -5,39 +5,24 @@
 #include 
 #include 
 #include 
+#include 
 
 void radix__flush_hugetlb_page(struct vm_area_struct *vma, unsigned long 
vmaddr)
 {
-   unsigned long ap, shift;
+   int psize;
struct hstate *hstate = hstate_file(vma->vm_file);
 
-   shift = huge_page_shift(hstate);
-   if (shift == mmu_psize_defs[MMU_PAGE_2M].shift)
-   ap = mmu_get_ap(MMU_PAGE_2M);
-   else if (shift == mmu_psize_defs[MMU_PAGE_1G].shift)
-   ap = mmu_get_ap(MMU_PAGE_1G);
-   else {
-   WARN(1, "Wrong huge page shift\n");
-   return ;
-   }
-   radix__flush_tlb_page_psize(vma->vm_mm, vmaddr, ap);
+   psize = hstate_get_psize(hstate);
+   radix__flush_tlb_page_psize(vma->vm_mm, vmaddr, psize);
 }
 
 void radix__local_flush_hugetlb_page(struct vm_area_struct *vma, unsigned long 
vmaddr)
 {
-   unsigned long ap, shift;
+   int psize;
struct hstate *hstate = hstate_file(vma->vm_file);
 
-   shift = huge_page_shift(hstate);
-   if (shift == mmu_psize_defs[MMU_PAGE_2M].shift)
-   ap = mmu_get_ap(MMU_PAGE_2M);
-   else if (shift == mmu_psize_defs[MMU_PAGE_1G].shift)
-   ap = mmu_get_ap(MMU_PAGE_1G);
-   else {
-   WARN(1, "Wrong huge page shift\n");
-   return ;
-   }
-   radix__local_flush_tlb_page_psize(vma->vm_mm, vmaddr, ap);
+   psize = hstate_get_psize(hstate);
+   radix__local_flush_tlb_page_psize(vma->vm_mm, vmaddr, psize);
 }
 
 /*
diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index b1dc4675925d..7bc3d1402c63 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -128,9 +128,10 @@ void radix__local_flush_tlb_mm(struct mm_struct *mm)
 

Re: [PATCH v5] powerpc/pci: Assign fixed PHB number based on device-tree properties

2016-05-10 Thread Guilherme G. Piccoli

On 05/02/2016 01:57 PM, Bjorn Helgaas wrote:

On Thu, Apr 14, 2016 at 06:55:24PM -0300, Guilherme G. Piccoli wrote:

The domain/PHB field of PCI addresses has its value obtained from a
global variable, incremented each time a new domain (represented by
struct pci_controller) is added on the system. The domain addition
process happens during boot or due to PCI device hotplug.

As recent kernels are using predictable naming for network interfaces,
the network stack is more tied to PCI naming. This can be a problem in
hotplug scenarios, because PCI addresses will change if devices are
removed and then re-added. This situation seems unusual, but it can
happen if a user wants to replace a NIC without rebooting the machine,
for example.

This patch changes the way PCI domain values are generated: now, we use
device-tree properties to assign fixed PHB numbers to PCI addresses
when available (meaning pSeries and PowerNV cases). We also use a bitmap
to allow dynamic PHB numbering when device-tree properties are not
used. This bitmap keeps track of used PHB numbers and if a PHB is
released (by hotplug operations for example), it allows the reuse of
this PHB number, so the PCI address does not change when a device is removed
and re-added soon after. No functional changes were introduced.
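
The dynamic numbering part can be pictured with a small userspace sketch like the one below. It assumes the same 8192-entry limit as the patch. The helper names are hypothetical, there is no locking here (the kernel code runs under hose_spinlock), and the linear scan stands in for find_first_zero_bit().

```c
#include <assert.h>
#include <limits.h>

#define MAX_PHBS 8192
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS ((MAX_PHBS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bit per PHB number; a set bit means the number is in use. */
static unsigned long phb_bitmap[BITMAP_WORDS];

static int phb_test_bit(unsigned int n)
{
	return (phb_bitmap[n / BITS_PER_WORD] >> (n % BITS_PER_WORD)) & 1UL;
}

/* Hand out the lowest free PHB number, or -1 if all are taken. */
static int phb_alloc(void)
{
	for (unsigned int n = 0; n < MAX_PHBS; n++) {
		if (!phb_test_bit(n)) {
			phb_bitmap[n / BITS_PER_WORD] |= 1UL << (n % BITS_PER_WORD);
			return (int)n;
		}
	}
	return -1;
}

/* Release a number; out-of-range values are ignored to avoid corruption. */
static void phb_release(unsigned int n)
{
	if (n < MAX_PHBS)
		phb_bitmap[n / BITS_PER_WORD] &= ~(1UL << (n % BITS_PER_WORD));
}
```

Releasing a number and allocating again returns the same number, which is what keeps the PCI address stable across a remove and re-add.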

Reviewed-by: Gavin Shan 
Signed-off-by: Guilherme G. Piccoli 


I assume the powerpc guys will take care of this.  Let me know if you
need me to do anything.


Thanks very much Bjorn! I sent this to the PCI list to let you know, since
this modification is closely related to PCI. But it is truly something
specific to powerpc, so you're right, the linuxppc-dev list folks will
take care of it.


Cheers,


Guilherme




---
  arch/powerpc/kernel/pci-common.c | 66 ++--
  1 file changed, 63 insertions(+), 3 deletions(-)

v5:
   * Improved comments.

   * Changed the Fixed PHB Numbering to set the PHB number bit
   on the bitmap anyway, avoiding issues when system has virtual PHBs.

   * Changed the device-tree check order - now, firstly we check for
   "ibm,opal-phbid" and if it's not available, we try the pSeries case.

v4:
   * Minor change (if/else nesting rearranged).

v3:
   * Made the bitmap static.

   * Rearranged if/else statements of Fixed PHB checking.

   * Improved the bitmap checks by replacing the loop with the
   find_first_zero_bit() function.

   * Removed the single-statement function release_phb_number() by
   adding its logic directly into pcibios_free_controller().

   * Added a check for the bitmap size before clearing a bit, avoiding memory
   corruption.

v2:
   * Added the Fixed PHB Numbering mechanism based on device-tree
   properties.

   * Changed list approach to bitmap on the Dynamic PHB Numbering
   mechanism.

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 0f7a60f..ad423c1 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -41,11 +41,17 @@
  #include 
  #include 

+/* hose_spinlock protects accesses to the the phb_bitmap. */
  static DEFINE_SPINLOCK(hose_spinlock);
  LIST_HEAD(hose_list);

-/* XXX kill that some day ... */
-static int global_phb_number;  /* Global phb counter */
+/* For dynamic PHB numbering on get_phb_number(): max number of PHBs. */
+#define MAX_PHBS 8192
+
+/* For dynamic PHB numbering: used/free PHBs tracking bitmap.
+ * Accesses to this bitmap should be protected by hose_spinlock.
+ */
+static DECLARE_BITMAP(phb_bitmap, MAX_PHBS);

  /* ISA Memory physical address */
  resource_size_t isa_mem_base;
@@ -64,6 +70,55 @@ struct dma_map_ops *get_pci_dma_ops(void)
  }
  EXPORT_SYMBOL(get_pci_dma_ops);

+/* get_phb_number() function should run under locking
+ * protection, specifically hose_spinlock.
+ */
+static int get_phb_number(struct device_node *dn)
+{
+   const __be64 *prop64;
+   const __be32 *regs;
+   int phb_id = 0;
+
+   /* Try fixed PHB numbering first, by checking archs and reading
+* the respective device-tree properties. Firstly, try PowerNV by
+* reading "ibm,opal-phbid", only present in OPAL environment.
+*/
+   prop64 = of_get_property(dn, "ibm,opal-phbid", NULL);
+   if (prop64) {
+   phb_id = (int)(be64_to_cpup(prop64) & 0x);
+
+   } else if (machine_is(pseries)) {
+   regs = of_get_property(dn, "reg", NULL);
+   if (regs)
+   phb_id = (int)(be32_to_cpu(regs[1]) & 0x);
+   } else {
+   goto dynamic_phb_numbering;
+   }
+
+   /* If we have a huge PHB number obtained from device-tree, no need
+* to worry with the bitmap. Otherwise, we need to be sure we're
+* not trying to use the same PHB number twice.
+*/
+   if (phb_id < MAX_PHBS) {
+   if (test_bit(phb_id, phb_bitmap))
+   goto 

Re: [PATCH] kvm-pr: manage illegal instructions

2016-05-10 Thread Paolo Bonzini


On 09/05/2016 10:14, Thomas Huth wrote:
>> > Tested-by: Thomas Huth 
> Ping!
> 
> Alex, Paul, could you please pick up this patch? This patch is required
> to get the kvm-unit-tests working properly with kvm-pr, so I'd be glad
> if we could get this included finally...

I have a pull request for 4.6 final to send tomorrow; do you want me to
include it?

Thanks,

Paolo

Re: [PATCH v2] cxl: Add kernel API to allow a context to operate with relocate disabled

2016-05-10 Thread Frederic Barrat

Le 06/05/2016 09:46, Ian Munsie a écrit :

From: Ian Munsie 

cxl devices typically access memory using an MMU in much the same way as
the CPU, and each context includes a state register much like the MSR in
the CPU. Like the CPU, the state register includes a bit to enable
relocation, which we currently always enable.

In some cases, it may be desirable to allow a device to access memory
using real addresses instead of effective addresses, so this adds a new
API, cxl_set_translation_mode, that can be used to disable relocation
on a given kernel context. This can allow for the creation of a special
privileged context that the device can use if it needs relocation
disabled, and can use regular contexts at times when it needs relocation
enabled.

This interface is only available to users of the kernel API for obvious
reasons, and will never be supported in a virtualised environment.

This will be used by the upcoming cxl support in the mlx5 driver.

Signed-off-by: Ian Munsie 



Looks good to me.
Reviewed-by: Frederic Barrat 

  Fred


Re: [PATCH v2 00/23] ata: sata_dwc_460ex: make it working again

2016-05-10 Thread Andy Shevchenko
On Tue, 2016-05-10 at 12:30 -0400, Tejun Heo wrote:
> Hello,
> 
> On Tue, May 10, 2016 at 11:34:40AM +0530, Vinod Koul wrote:
> > 
> > > 
> > > slave-dma [1], branch topic/dw. But I think Vinod can tell us
> > > which
> > > tag/branch will be immutable. Vinod?
> > Please use branch topic/dw. I will not rebase this before sending to
> > Linus.
> Okay, pulled topic/dw into libata/for-4.7-dw and applied 1-22 on top.

Thanks!

> Please let me know how patch 23 should be routed.

Since Rob Acked you may take it as well.

-- 
Andy Shevchenko 
Intel Finland Oy


Re: [PATCH v2 00/23] ata: sata_dwc_460ex: make it working again

2016-05-10 Thread Tejun Heo
Hello,

On Tue, May 10, 2016 at 11:34:40AM +0530, Vinod Koul wrote:
> > slave-dma [1], branch topic/dw. But I think Vinod can tell us which
> > tag/branch will be immutable. Vinod?
> 
> Please use branch topic/dw. I will not rebase this before sending to Linus.

Okay, pulled topic/dw into libata/for-4.7-dw and applied 1-22 on top.
Please let me know how patch 23 should be routed.

Thanks.

-- 
tejun

[PATCH 1/2] Fix .long's in mm/tlb-radix.c to more meaningful

2016-05-10 Thread Balbir Singh
The .longs with the shifts are hard to read; use more
meaningful names for the opcodes. PPC_TLBIE_5 is introduced
for the 5-operand variant of the instruction because a macro
already exists for the 2-operand variant.

Signed-off-by: Balbir Singh 
---
 arch/powerpc/include/asm/ppc-opcode.h | 14 ++
 arch/powerpc/mm/tlb-radix.c   | 13 +
 2 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index 1d035c1..c0e9ea4 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -184,6 +184,7 @@
 #define PPC_INST_STSWX 0x7c00052a
 #define PPC_INST_STXVD2X   0x7c000798
 #define PPC_INST_TLBIE 0x7c000264
+#define PPC_INST_TLBIEL 0x7c000224
 #define PPC_INST_TLBILX 0x7c000024
 #define PPC_INST_WAIT  0x7c00007c
 #define PPC_INST_TLBIVAX   0x7c000624
@@ -257,6 +258,9 @@
 #define ___PPC_RB(b)   (((b) & 0x1f) << 11)
 #define ___PPC_RS(s)   (((s) & 0x1f) << 21)
 #define ___PPC_RT(t)   ___PPC_RS(t)
+#define ___PPC_R(r)    (((r) & 0x1) << 16)
+#define ___PPC_PRS(prs) (((prs) & 0x1) << 17)
+#define ___PPC_RIC(ric) (((ric) & 0x3) << 18)
 #define __PPC_RA(a)    ___PPC_RA(__REG_##a)
 #define __PPC_RA0(a)   ___PPC_RA(__REGA0_##a)
 #define __PPC_RB(b)    ___PPC_RB(__REG_##b)
@@ -321,6 +325,16 @@
__PPC_WC(w))
 #define PPC_TLBIE(lp,a)        stringify_in_c(.long PPC_INST_TLBIE | \
                                        ___PPC_RB(a) | ___PPC_RS(lp))
+#define PPC_TLBIE_5(rb,rs,ric,prs,r) \
+                               stringify_in_c(.long PPC_INST_TLBIE | \
+                                       ___PPC_RB(rb) | ___PPC_RS(rs) | \
+                                       ___PPC_RIC(ric) | ___PPC_PRS(prs) | \
+                                       ___PPC_R(r))
+#define PPC_TLBIEL(rb,rs,ric,prs,r) \
+                               stringify_in_c(.long PPC_INST_TLBIEL | \
+                                       ___PPC_RB(rb) | ___PPC_RS(rs) | \
+                                       ___PPC_RIC(ric) | ___PPC_PRS(prs) | \
+                                       ___PPC_R(r))
 #define PPC_TLBSRX_DOT(a,b)    stringify_in_c(.long PPC_INST_TLBSRX_DOT | \
                                        __PPC_RA0(a) | __PPC_RB(b))
 #define PPC_TLBIVAX(a,b)   stringify_in_c(.long PPC_INST_TLBIVAX | \
diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index 0fdaf93..e6b7487 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -30,8 +31,7 @@ static inline void __tlbiel_pid(unsigned long pid, int set)
ric = 2;  /* invalidate all the caches */
 
asm volatile("ptesync": : :"memory");
-   asm volatile(".long 0x7c000224 | (%0 << 11) | (%1 << 16) |"
-"(%2 << 17) | (%3 << 18) | (%4 << 21)"
+   asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
  : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : "memory");
asm volatile("ptesync": : :"memory");
 }
@@ -60,8 +60,7 @@ static inline void _tlbie_pid(unsigned long pid)
ric = 2;  /* invalidate all the caches */
 
asm volatile("ptesync": : :"memory");
-   asm volatile(".long 0x7c000264 | (%0 << 11) | (%1 << 16) |"
-"(%2 << 17) | (%3 << 18) | (%4 << 21)"
+   asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
  : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : "memory");
asm volatile("eieio; tlbsync; ptesync": : :"memory");
 }
@@ -79,8 +78,7 @@ static inline void _tlbiel_va(unsigned long va, unsigned long pid,
ric = 0;  /* no cluster flush yet */
 
asm volatile("ptesync": : :"memory");
-   asm volatile(".long 0x7c000224 | (%0 << 11) | (%1 << 16) |"
-"(%2 << 17) | (%3 << 18) | (%4 << 21)"
+   asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
  : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : "memory");
asm volatile("ptesync": : :"memory");
 }
@@ -98,8 +96,7 @@ static inline void _tlbie_va(unsigned long va, unsigned long pid,
ric = 0;  /* no cluster flush yet */
 
asm volatile("ptesync": : :"memory");
-   asm volatile(".long 0x7c000264 | (%0 << 11) | (%1 << 16) |"
-"(%2 << 17) | (%3 << 18) | (%4 << 21)"
+   asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
  : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs) : "memory");
asm volatile("eieio; tlbsync; ptesync": : :"memory");
 }
-- 
2.5.5
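
The operand reordering in this patch can be checked outside the kernel.
The sketch below is a user-space model, not kernel code: the F_* helpers
mirror the ___PPC_* field macros added above, encode_old() reproduces the
removed .long expression, and the function names and sample operand values
are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Base opcodes as they appear in the patch. */
#define INST_TLBIE   0x7c000264u
#define INST_TLBIEL  0x7c000224u

/* Field helpers modelled on ___PPC_RB/RS/R/PRS/RIC from the patch. */
#define F_RB(b)   (((uint32_t)(b) & 0x1f) << 11)
#define F_R(r)    (((uint32_t)(r) & 0x1) << 16)
#define F_PRS(p)  (((uint32_t)(p) & 0x1) << 17)
#define F_RIC(c)  (((uint32_t)(c) & 0x3) << 18)
#define F_RS(s)   (((uint32_t)(s) & 0x1f) << 21)

/* Old form: operands in asm order %0..%4 = rb, r, prs, ric, rs. */
static uint32_t encode_old(uint32_t base, uint32_t rb, uint32_t r,
                           uint32_t prs, uint32_t ric, uint32_t rs)
{
    return base | (rb << 11) | (r << 16) | (prs << 17) |
           (ric << 18) | (rs << 21);
}

/* New form: macro argument order (rb, rs, ric, prs, r). */
static uint32_t encode_new(uint32_t base, uint32_t rb, uint32_t rs,
                           uint32_t ric, uint32_t prs, uint32_t r)
{
    return base | F_RB(rb) | F_RS(rs) | F_RIC(ric) | F_PRS(prs) | F_R(r);
}

/* Hypothetical self-check: both forms agree for all in-range operands. */
static void check_encodings(void)
{
    for (uint32_t rb = 0; rb < 32; rb++)
        for (uint32_t rs = 0; rs < 32; rs++)
            for (uint32_t ric = 0; ric < 4; ric++)
                for (uint32_t prs = 0; prs < 2; prs++)
                    for (uint32_t r = 0; r < 2; r++)
                        assert(encode_old(INST_TLBIEL, rb, r, prs, ric, rs) ==
                               encode_new(INST_TLBIEL, rb, rs, ric, prs, r));
}
```

One difference is worth keeping in mind: the macros mask every field,
while the old expression did not, so the two forms only agree when each
operand fits its field width.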


softlockup with 4.6.0-rc3-00130-g4d2a14c

2016-05-10 Thread Aneesh Kumar K.V

I am seeing the soft lockups below in a KVM guest. The host and the guest
run the same kernel version.

[  323.547841] NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s! 
[systemd-timesyn:3116]
[  323.548023] Modules linked in:
[  323.548029] CPU: 7 PID: 3116 Comm: systemd-timesyn Not tainted 
4.6.0-rc3-00130-g4d2a14c #2
[  323.548031] task: c00038b16d00 ti: c0003baac000 task.ti: 
c0003baac000
[  323.548032] NIP: c005b404 LR: c0934c68 CTR: c0099650
[  323.548033] REGS: c0003baaf9d0 TRAP: 0901   Not tainted  
(4.6.0-rc3-00130-g4d2a14c)
[  323.548034] MSR: 80009033   CR: 48002844  XER: 

[  323.548040] CFAR: c0934c64 SOFTE: 1 
   GPR00: c0934c68 c0003baafc50 c0db3f00 
c0e7e978 
   GPR04: 0001 81d0 000bdb0f5a4e 
 
   GPR08: c0e207b8 0002 8001 
 
   GPR12: c001cfb0 cfe01c00 
[  323.548055] NIP [c005b404] __spin_yield+0x14/0xa0
[  323.548059] LR [c0934c68] _raw_spin_lock_irqsave+0x118/0x120
[  323.548060] Call Trace:
[  323.548062] [c0003baafc50] [c0934c68] 
_raw_spin_lock_irqsave+0x118/0x120 (unreliable)
[  323.548065] [c0003baafc90] [c0139a6c] do_adjtimex+0x9c/0x1c0
[  323.548068] [c0003baafd00] [c013238c] 
posix_clock_realtime_adj+0x1c/0x30
[  323.548070] [c0003baafd20] [c0133920] 
SyS_clock_adjtime+0xa0/0x150
[  323.548073] [c0003baafe30] [c0009260] system_call+0x38/0x108
[  323.548074] Instruction dump:
[  323.548075] eba1ffe8 eb81ffe0 eb61ffd8 4e800020 6000 6000 6000 
3c4c00d6 
[  323.548078] 38428b10 8143 2faa 4d9e0020 <79490420> 2b8907ff 79290020 
7d101026




[   21.926941] INFO: rcu_sched self-detected stall on CPU
[   21.931553]  7-...: (2098 ticks this GP) idle=9b3/141/0 
softirq=204/267 fqs=2097 
[   21.931601]   (t=2100 jiffies g=-249 c=-250 q=23178)
[   21.931751] Task dump for CPU 7:
[   21.931755] systemd R  running task 9872 1  0 0x00040004
[   21.931763] Call Trace:
[   21.931773] [c0003e503630] [c00e783c] sched_show_task+0xec/0x180 
(unreliable)
[   21.931779] [c0003e5036a0] [c0123504] 
rcu_dump_cpu_stacks+0xe4/0x150
[   21.931783] [c0003e5036f0] [c0128214] 
rcu_check_callbacks+0x6b4/0x9c0
[   21.931804] [c0003e503810] [c012ec7c] 
update_process_times+0x4c/0xa0
[   21.931809] [c0003e503840] [c0143828] 
tick_sched_handle.isra.5+0x28/0xb0
[   21.931812] [c0003e503870] [c014390c] tick_sched_timer+0x5c/0xd0
[   21.931816] [c0003e5038b0] [c012f528] 
__hrtimer_run_queues+0xf8/0x380
[   21.931819] [c0003e503930] [c01303e0] 
hrtimer_interrupt+0xe0/0x2b0
[   21.931823] [c0003e5039f0] [c001d57c] 
__timer_interrupt+0x8c/0x270
[   21.931826] [c0003e503a40] [c001dc5c] timer_interrupt+0x9c/0xe0
[   21.931830] [c0003e503a70] [c0002750] 
decrementer_common+0x150/0x180
[   21.931834] --- interrupt: 901 at ktime_get_ts64+0xf0/0x150
   LR = ktime_get_ts64+0x74/0x150
[   21.931836] [c0003e503d60] []   (null) 
(unreliable)
[   21.931841] [c0003e503da0] [c029fa38] 
poll_select_set_timeout+0x78/0xd0
[   21.931844] [c0003e503de0] [c02a1020] SyS_poll+0x80/0x150
[   21.931847] [c0003e503e30] [c0009260] system_call+0x38/0x108
[   24.006941] NMI watchdog: BUG: soft lockup - CPU#7 stuck for 21s! [systemd:1]
[   24.007117] Modules linked in:
[   24.007122] CPU: 7 PID: 1 Comm: systemd Not tainted 4.6.0-rc3-00130-g4d2a14c 
#1
[   24.007123] task: c0003e4c ti: c0003e50 task.ti: 
c0003e50
[   24.007125] NIP: c0137400 LR: c0137384 CTR: c001cfb0
[   24.007126] REGS: c0003e503ae0 TRAP: 0901   Not tainted  
(4.6.0-rc3-00130-g4d2a14c)
[   24.007126] MSR: 80009033   CR: 28424844  XER: 
2000
[   24.007132] CFAR: c0137414 SOFTE: 1 
   GPR00: c029fa38 c0003e503d60 c0db3a00 
0025ff39 
   GPR04: a8ce0e65 ac491cb5c5ec 5731f19b 
 
   GPR08: 3b9ac9ff 2af484699eac9820 93054a12 
 
   GPR12: c001cfb0 cfe01c00 
[   24.007141] NIP [c0137400] ktime_get_ts64+0xf0/0x150
[   24.007143] LR [c0137384] ktime_get_ts64+0x74/0x150
[   24.007143] Call Trace:
[   24.007145] [c0003e503da0] [c029fa38] 
poll_select_set_timeout+0x78/0xd0
[   24.007146] [c0003e503de0] [c02a1020] SyS_poll+0x80/0x150
[   24.007148] [c0003e503e30] [c0009260] system_call+0x38/0x108
[   24.007149] Instruction dump:
[   24.007151] 7ce94e34 7ce43214 

[PATCH 2/2] Deduplicate the actual base page size code

2016-05-10 Thread Balbir Singh
Move the duplicated code that computes the actual page size (AP) into
a single function. Also add a warning that checks the AP value.

Signed-off-by: Balbir Singh 
---
 arch/powerpc/mm/hugetlbpage-radix.c | 31 ---
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/mm/hugetlbpage-radix.c b/arch/powerpc/mm/hugetlbpage-radix.c
index 1e11559..9108645 100644
--- a/arch/powerpc/mm/hugetlbpage-radix.c
+++ b/arch/powerpc/mm/hugetlbpage-radix.c
@@ -6,7 +6,8 @@
 #include 
 #include 
 
-void radix__flush_hugetlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
+static inline unsigned long get_base_page_size(struct vm_area_struct *vma,
+  unsigned long vmaddr)
 {
unsigned long ap, shift;
struct hstate *hstate = hstate_file(vma->vm_file);
@@ -18,25 +19,25 @@ void radix__flush_hugetlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
ap = mmu_get_ap(MMU_PAGE_1G);
else {
WARN(1, "Wrong huge page shift\n");
-   return ;
+   return 0;
}
-   radix___flush_tlb_page(vma->vm_mm, vmaddr, ap, 0);
+#ifdef CONFIG_DEBUG_VM
+   /* Double check this assumption */
+   WARN_ON(ap != 0 && ap != 0x5);
+#endif
+   return ap;
 }
 
-void radix__local_flush_hugetlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
+void radix__flush_hugetlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
 {
-   unsigned long ap, shift;
-   struct hstate *hstate = hstate_file(vma->vm_file);
+   unsigned long ap = get_base_page_size(vma, vmaddr);
+   radix___flush_tlb_page(vma->vm_mm, vmaddr, ap, 0);
+}
 
-   shift = huge_page_shift(hstate);
-   if (shift == mmu_psize_defs[MMU_PAGE_2M].shift)
-   ap = mmu_get_ap(MMU_PAGE_2M);
-   else if (shift == mmu_psize_defs[MMU_PAGE_1G].shift)
-   ap = mmu_get_ap(MMU_PAGE_1G);
-   else {
-   WARN(1, "Wrong huge page shift\n");
-   return ;
-   }
+void radix__local_flush_hugetlb_page(struct vm_area_struct *vma,
+unsigned long vmaddr)
+{
+   unsigned long ap = get_base_page_size(vma, vmaddr);
radix___local_flush_tlb_page(vma->vm_mm, vmaddr, ap, 0);
 }
 
-- 
2.5.5


[v1] Basic Radix Tree cleanups

2016-05-10 Thread Balbir Singh
This series cleans up some bits of the radix tree implementation;
no functional changes are intended. Most of the changes are based
on review comments. I've lightly tested the patches and checked
the generated code for the .long instruction changes.
Please review.


Re: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

2016-05-10 Thread Miroslav Benes
On Thu, 28 Apr 2016, Josh Poimboeuf wrote:

> Change livepatch to use a basic per-task consistency model.  This is the
> foundation which will eventually enable us to patch those ~10% of
> security patches which change function or data semantics.  This is the
> biggest remaining piece needed to make livepatch more generally useful.
> 
> This code stems from the design proposal made by Vojtech [1] in November
> 2014.  It's a hybrid of kGraft and kpatch: it uses kGraft's per-task
> consistency and syscall barrier switching combined with kpatch's stack
> trace switching.  There are also a number of fallback options which make
> it quite flexible.
> 
> Patches are applied on a per-task basis, when the task is deemed safe to
> switch over.  When a patch is enabled, livepatch enters into a
> transition state where tasks are converging to the patched state.
> Usually this transition state can complete in a few seconds.  The same
> sequence occurs when a patch is disabled, except the tasks converge from
> the patched state to the unpatched state.
> 
> An interrupt handler inherits the patched state of the task it
> interrupts.  The same is true for forked tasks: the child inherits the
> patched state of the parent.
> 
> Livepatch uses several complementary approaches to determine when it's
> safe to patch tasks:
> 
> 1. The first and most effective approach is stack checking of sleeping
>tasks.  If no affected functions are on the stack of a given task,
>the task is patched.  In most cases this will patch most or all of
>the tasks on the first try.  Otherwise it'll keep trying
>periodically.  This option is only available if the architecture has
>reliable stacks (CONFIG_RELIABLE_STACKTRACE and
>CONFIG_STACK_VALIDATION).
> 
> 2. The second approach, if needed, is kernel exit switching.  A
>task is switched when it returns to user space from a system call, a
>user space IRQ, or a signal.  It's useful in the following cases:
> 
>a) Patching I/O-bound user tasks which are sleeping on an affected
>   function.  In this case you have to send SIGSTOP and SIGCONT to
>   force it to exit the kernel and be patched.
>b) Patching CPU-bound user tasks.  If the task is highly CPU-bound
>   then it will get patched the next time it gets interrupted by an
>   IRQ.
>c) Applying patches for architectures which don't yet have
>   CONFIG_RELIABLE_STACKTRACE.  In this case you'll have to signal
>   most of the tasks on the system.  However this isn't a complete
>   solution, because there's currently no way to patch kthreads
>   without CONFIG_RELIABLE_STACKTRACE.
> 
>Note: since idle "swapper" tasks don't ever exit the kernel, they
>instead have a kpatch_patch_task() call in the idle loop which allows

s/kpatch_patch_task()/klp_patch_task()/

[...]

> --- a/Documentation/livepatch/livepatch.txt
> +++ b/Documentation/livepatch/livepatch.txt
> @@ -72,7 +72,8 @@ example, they add a NULL pointer or a boundary check, fix a 
> race by adding
>  a missing memory barrier, or add some locking around a critical section.
>  Most of these changes are self contained and the function presents itself
>  the same way to the rest of the system. In this case, the functions might
> -be updated independently one by one.
> +be updated independently one by one.  (This can be done by setting the
> +'immediate' flag in the klp_patch struct.)
>  
>  But there are more complex fixes. For example, a patch might change
>  ordering of locking in multiple functions at the same time. Or a patch
> @@ -86,20 +87,103 @@ or no data are stored in the modified structures at the 
> moment.
>  The theory about how to apply functions a safe way is rather complex.
>  The aim is to define a so-called consistency model. It attempts to define
>  conditions when the new implementation could be used so that the system
> -stays consistent. The theory is not yet finished. See the discussion at
> -http://thread.gmane.org/gmane.linux.kernel/1823033/focus=1828189
> -
> -The current consistency model is very simple. It guarantees that either
> -the old or the new function is called. But various functions get redirected
> -one by one without any synchronization.
> -
> -In other words, the current implementation _never_ modifies the behavior
> -in the middle of the call. It is because it does _not_ rewrite the entire
> -function in the memory. Instead, the function gets redirected at the
> -very beginning. But this redirection is used immediately even when
> -some other functions from the same patch have not been redirected yet.
> -
> -See also the section "Limitations" below.
> +stays consistent.
> +
> +Livepatch has a consistency model which is a hybrid of kGraft and
> +kpatch:  it uses kGraft's per-task consistency and syscall barrier
> +switching combined with kpatch's stack trace switching.  There are also
> +a number of fallback options which make it quite flexible.
> +
> +Patches are applied on a 

Re: usb: dwc2: regression on MyBook Live Duo / Canyonlands since 4.3.0-rc4

2016-05-10 Thread Arnd Bergmann
On Tuesday 10 May 2016 08:37:52 Benjamin Herrenschmidt wrote:
> On Mon, 2016-05-09 at 17:08 +0200, Arnd Bergmann wrote:
> > 
> > Unfortunately, I don't see any way this could be done in MIPS specific
> > code: There is typically a byteswap between the internal bus and the PCI
> > bus on big-endian MIPS systems, so the PCI MMIO ends up being little-endian,
> 
> Ugh ... not exactly, re-watch my talk on the matter :-) While there is
> a specific lane wiring to preserve byte addresses, in the end it's the
> end device itself that is either BE or LE. Regardless of any "bus
> endianness".

I found your slides on

http://www.linuxplumbersconf.org/2012/wp-content/uploads/2012/09/2012-lpc-ref-big-little-endian-herrenschmidt.odp

but there are at least two more twists that you completely missed here:

- Some architectures (e.g. ARMv5 "BE32" mode in IXP4xx, surely some others)
  do not implement big-endian mode by wiring up the data lines between the
  bus and the CPU differently between big- and little-endian mode like
  powerpc and armv7 "BE8" do, but instead they swizzle the *address* lines
  on 8-bit and 16-bit addresses. The effect of that is that normal RAM
  accesses work as expected both ways, and devices that are accessed using
  32-bit MMIO ops never need any byteswap (you actually get "native
  endian") while MMIO with 8 and 16 bit width does something completely
  unexpected and touches the wrong register. Having an explicit byteswap
  on the PCI host bridge gets you the expected addresses again for 8-bit
  cycles but it also means that readl()/writel() again need to swap the
  data.

- Some other architectures (e.g. Broadcom MIPS) apparently are even fancier
  and use a strapping pin on the SoC that flips the endianness of the CPU core
  at the same time as all the peripheral MMIO registers, with the intention
  of never requiring any byte swaps. I believe they are implemented carefully
  enough to actually get this right, but it confuses the heck out of
  Linux drivers that don't expect this.
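
The first point can be made concrete with a small model. The sketch below
assumes a simplified word-invariant bus in which 32-bit accesses pass
through unchanged, byte addresses are XORed with 3 and halfword addresses
with 2. The register bank, function names and values are invented for
illustration and do not describe any particular SoC.

```c
#include <assert.h>
#include <stdint.h>

/* Byte at a given address, as the little-endian device numbers it. */
static uint8_t dev_read8(const uint32_t *regs, unsigned int addr)
{
    return (uint8_t)(regs[addr / 4] >> (8 * (addr % 4)));
}

/* Halfword at a 2-byte aligned address, as the device numbers it. */
static uint16_t dev_read16(const uint32_t *regs, unsigned int addr)
{
    return (uint16_t)(regs[addr / 4] >> (8 * (addr % 4)));
}

/* CPU-side 32-bit access in the modelled BE32 mode: value is unchanged. */
static uint32_t be32_cpu_read32(const uint32_t *regs, unsigned int addr)
{
    return regs[addr / 4];
}

/* CPU-side byte access: the address swizzle selects a different byte. */
static uint8_t be32_cpu_read8(const uint32_t *regs, unsigned int addr)
{
    return dev_read8(regs, addr ^ 3);
}

/* CPU-side halfword access: the swizzle selects the other halfword. */
static uint16_t be32_cpu_read16(const uint32_t *regs, unsigned int addr)
{
    return dev_read16(regs, addr ^ 2);
}

/* A byte-swapping host bridge, modelled as a swap of each 32-bit word. */
static uint32_t bridge_swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}
```

With a register holding 0x44332211, the 32-bit read returns the value
unchanged, while the byte read at offset 0 returns 0x44 instead of the
device's byte 0 (0x11). Passing the word through bridge_swap32() first
makes the byte read return 0x11 again, at the cost of 32-bit reads now
needing a software swap.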

> > which matches the expected behavior of readl/writel. However, drivers
> > for non-PCI devices often use the same readl/writel accessors because
> > that is how it's done on ARMv6/ARMv7.
> 
> Even then, you can have on-SoC (non-PCI) devices that also have a
> different endianness from the main CPU. How does it work on ARM for
> example ? The device endianness should be fixed, regardless of the
> endianness of the core, no ?

ARMv6/v7 uses BE8 mode like powerpc: each peripheral is fixed-endian
and you have to know what it is. Only Freescale managed to put identical
IP blocks on various (powerpc-derived) SoCs and have a subset of them
treat the access as little-endian while others remain big-endian, so all
those drivers now require runtime detection.

> > Doing it hardcoded by architecture is just the simplest way to deal
> > with it, working on the assumption that nothing actually needs the
> > runtime detection that Ben suggested. 
> 
> No, it's not an architecture problem. It's a problem specific to that one
> SoC that the device was synthesized to be a certain endian while it was
> synthesized differently on another SoC... that also happens to be a
> different architecture. But doesn't have to.
> 
> For example, we had in the past cases of both LE and BE EHCI
> implementations on the same architecture (PowerPC).

I understand this, but from what I see in the history of this particular
driver, all ARM and PowerPC implementations chose to use LE registers for
DWC2 because the normal approach for these is to not mess with endianness,
while presumably all MIPS users of the same block wired up the endian-select
line of the IP block to match that of the CPU core, again because it's
what you are expected to do on a MIPS based SoC.

So hardcoding it per architecture would make an assumption based on
the mindset of the SoC designers rather than strict technical differences,
and that can fail as soon as someone does things differently on any of
them (see the Freescale example), but I still think it's the easiest
workaround for backporting to stable kernels. A revert of the original
patch would be even easier, but that would break the one big-endian
MIPS machine we know about.

> > Detecting the endianness of the
> > device is probably the best future-proof solution, but it's also
> > considerably more work to do in the driver, and comes with a
> > tiny runtime overhead.
> 
> The runtime overhead is probably non-measurable compared with the cost
> of the actual MMIOs.

Right. The code size increase is probably measurable (but still small),
the runtime overhead is not.
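
For reference, runtime detection of this kind could look roughly like the
following. This is only a sketch under stated assumptions: the device is
assumed to expose an ID register with a known value, registers are modelled
as a plain byte array rather than real MMIO, and struct demo_dev,
demo_readl() and demo_detect_endian() are hypothetical names, not existing
driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_dev {
    const uint8_t *regs;    /* register bytes in bus order */
    bool big_endian;        /* decided once at probe time */
};

/* Assemble a 32-bit value from four bytes, least significant first. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Assemble a 32-bit value from four bytes, most significant first. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Register read that honours the endianness chosen at probe time. */
static uint32_t demo_readl(const struct demo_dev *dev, size_t off)
{
    return dev->big_endian ? load_be32(dev->regs + off)
                           : load_le32(dev->regs + off);
}

/* Probe: compare the ID register against its expected value in both
 * byte orders. Returns false if neither interpretation matches. */
static bool demo_detect_endian(struct demo_dev *dev, size_t id_off,
                               uint32_t expected_id)
{
    if (load_le32(dev->regs + id_off) == expected_id) {
        dev->big_endian = false;
        return true;
    }
    if (load_be32(dev->regs + id_off) == expected_id) {
        dev->big_endian = true;
        return true;
    }
    return false;
}
```

The per-access cost is one predictable branch. The scheme cannot tell the
two layouts apart if the expected ID reads the same in both byte orders.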

Arnd

Re: [PATCH 0/2] Enable ZONE_DEVICE on POWER

2016-05-10 Thread Anshuman Khandual
On 05/08/2016 11:07 PM, Aneesh Kumar K.V wrote:
> oliver  writes:
> 
>> > Hi,
>> >
>> > I've been working on kernel support for a persistent memory (nvdimm)
>> > device and the kernel driver infrastructure requires ZONE_DEVICE for
>> > DAX support. I've had it enabled in my tree for some time (without
>> > altmap support) without any real issues.
> 
> IIUC a DAX usage of pmem doesn't really need ZONE_DEVICE. We need
> ZONE_DEVICE only to provide struct page backing for those pmem areas.
> 
>> >
>> > I wasn't planning on upstreaming any of my changes until 4.8 at the
>> > earliest so I am ok with carrying these patches myself. However, there
>> > has been some interest in using ZONE_DEVICE for other things on ppc
>> > (wasn't that you?) and given that ZONE_DEVICE is gated behind
>> > CONFIG_EXPERT I can't see there being any kind of negative impact on
>> > end users by merging it now. At the very least it lets the rest of the
>> > kernel development community know that changes affecting zones should
>> > also be tested on powerpc.
>> >
>> >
> A partially done patch like that will miss quite a lot of details. For
> example if I look at the x86 changes related to altmap
> (4b94ffdc4163bae1e ("x86, mm: introduce vmem_altmap to augment
> vmemmap_populate")) I see them handling pagetable free and memory
> hotplug. This patch doesn't do any of those. From the commit message it is
> also not clear how we intend to use that zone device memory on ppc64.
> If we say they will not get hotplugged out or they will never be part
> of page table then those changes I mentioned above are really not
> needed. But the patch is missing a lot of those details.

Right. The current patch just enables ZONE_DEVICE with vmem_altmap support
(so that struct pages can be allocated in the device range instead of system
RAM) where any driver can own the rest of the PFNs for its use. Right now
these PFNs will not make it into process page tables. I can update the commit
message with these details if you like.
