from:"Bharat Bhushan"

RE: [PATCH] soc: fsl: dpio: fix cpu range check

2019-01-14 Thread Bharat Bhushan



> -Original Message-
> From: Li Yang 
> Sent: Tuesday, January 15, 2019 12:47 AM
> To: Bharat Bhushan 
> Cc: Roy Pledge ; linux-ker...@vger.kernel.org;
> linuxppc-dev@lists.ozlabs.org; linux-arm-ker...@lists.infradead.org;
> bharatb.ya...@gmail.com
> Subject: Re: [PATCH] soc: fsl: dpio: fix cpu range check
> 
> On Sun, Jan 13, 2019 at 11:13 PM Bharat Bhushan
>  wrote:
> >
> > cpu_possible(cpu) will always return true when cpu parameter is from
> > cpumask_next().
> > Check for nr_cpu_ids rather than !cpu_possible(cpu).
> 
> There is another patch pending merge seems to cover this issue too.
> Please let me know if it doesn't.
> 
> https://lore.kernel.org/patchwork/patch/1020905/

The description and intent of that patch address a different issue, but it
also fixes the cpu range check issue that my patch targets.
My patch can be ignored.

Thanks
-Bharat

> 
> Leo
> >
> > Signed-off-by: Bharat Bhushan 
> > ---
> >  drivers/soc/fsl/dpio/dpio-driver.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/soc/fsl/dpio/dpio-driver.c
> > b/drivers/soc/fsl/dpio/dpio-driver.c
> > index e58fcc9096e8..eb369dd9e0a7 100644
> > --- a/drivers/soc/fsl/dpio/dpio-driver.c
> > +++ b/drivers/soc/fsl/dpio/dpio-driver.c
> > @@ -133,7 +133,7 @@ static int dpaa2_dpio_probe(struct fsl_mc_device
> *dpio_dev)
> > else
> > next_cpu = cpumask_next(next_cpu, cpu_online_mask);
> >
> > -   if (!cpu_possible(next_cpu)) {
> > +   if (next_cpu >= nr_cpu_ids) {
> > dev_err(dev, "probe failed. Number of DPIOs exceeds
> NR_CPUS.\n");
> > err = -ERANGE;
> > goto err_allocate_irqs;
> > --
> > 2.20.1
> >

[PATCH] soc: fsl: dpio: fix cpu range check

2019-01-13 Thread Bharat Bhushan

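cpu_possible(cpu) is not a valid end-of-mask check for a value returned
by cpumask_next(): any cpu found in cpu_online_mask is also possible, and
when the mask is exhausted cpumask_next() returns a value >= nr_cpu_ids.
Check for next_cpu >= nr_cpu_ids rather than !cpu_possible(next_cpu).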

Signed-off-by: Bharat Bhushan 
---
 drivers/soc/fsl/dpio/dpio-driver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/soc/fsl/dpio/dpio-driver.c 
b/drivers/soc/fsl/dpio/dpio-driver.c
index e58fcc9096e8..eb369dd9e0a7 100644
--- a/drivers/soc/fsl/dpio/dpio-driver.c
+++ b/drivers/soc/fsl/dpio/dpio-driver.c
@@ -133,7 +133,7 @@ static int dpaa2_dpio_probe(struct fsl_mc_device *dpio_dev)
else
next_cpu = cpumask_next(next_cpu, cpu_online_mask);
 
-   if (!cpu_possible(next_cpu)) {
+   if (next_cpu >= nr_cpu_ids) {
dev_err(dev, "probe failed. Number of DPIOs exceeds 
NR_CPUS.\n");
err = -ERANGE;
goto err_allocate_irqs;
-- 
2.20.1
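
The check in the patch above can be illustrated with a small userspace
model. This is only a sketch, not kernel code: model_cpumask_next(),
model_mask_exhausted() and MODEL_NR_CPU_IDS are hypothetical stand-ins for
cpumask_next() and nr_cpu_ids, and the mask is a plain 32-bit word. It
assumes the documented behaviour that the search returns a value
>= nr_cpu_ids once no further bit is set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for nr_cpu_ids: number of valid cpu ids. */
#define MODEL_NR_CPU_IDS 8

/*
 * Model of cpumask_next(): return the next set bit strictly after n,
 * or MODEL_NR_CPU_IDS when no further bit is set. Pass n = -1 to
 * start the search at bit 0.
 */
static int model_cpumask_next(int n, uint32_t mask)
{
	assert(n >= -1);

	for (int cpu = n + 1; cpu < MODEL_NR_CPU_IDS; cpu++) {
		if (mask & (UINT32_C(1) << cpu))
			return cpu;
	}

	return MODEL_NR_CPU_IDS;
}

/* The test used by the patch: the sentinel means "mask exhausted". */
static int model_mask_exhausted(int next_cpu)
{
	return next_cpu >= MODEL_NR_CPU_IDS;
}
```

With a mask of 0x5 (cpus 0 and 2 online), the third DPIO probed would get
the sentinel value back, which is the case the range check has to catch.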

RE: [RFC 5/5] powerpc/fsl: Add supported-irq-ranges for P2020

2018-08-09 Thread Bharat Bhushan



> -Original Message-
> From: Scott Wood [mailto:o...@buserror.net]
> Sent: Thursday, August 9, 2018 11:42 AM
> To: Bharat Bhushan ;
> b...@kernel.crashing.org; pau...@samba.org; m...@ellerman.id.au;
> ga...@kernel.crashing.org; mark.rutl...@arm.com;
> kstew...@linuxfoundation.org; gre...@linuxfoundation.org;
> devicet...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> ker...@vger.kernel.org
> Cc: r...@kernel.org; keesc...@chromium.org; tyr...@linux.vnet.ibm.com;
> j...@perches.com
> Subject: Re: [RFC 5/5] powerpc/fsl: Add supported-irq-ranges for P2020
> 
> On Thu, 2018-08-09 at 03:28 +, Bharat Bhushan wrote:
> > > -Original Message-
> > > From: Scott Wood [mailto:o...@buserror.net]
> > > Sent: Wednesday, August 8, 2018 11:27 PM
> > > To: Bharat Bhushan ;
> > > b...@kernel.crashing.org; pau...@samba.org; m...@ellerman.id.au;
> > > ga...@kernel.crashing.org; mark.rutl...@arm.com;
> > > kstew...@linuxfoundation.org; gre...@linuxfoundation.org;
> > > devicet...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> > > ker...@vger.kernel.org
> > > Cc: r...@kernel.org; keesc...@chromium.org;
> > > tyr...@linux.vnet.ibm.com; j...@perches.com
> > > Subject: Re: [RFC 5/5] powerpc/fsl: Add supported-irq-ranges for
> > > P2020
> > >
> > > On Wed, 2018-08-08 at 06:28 +, Bharat Bhushan wrote:
> > > > > -Original Message-
> > > > > From: Scott Wood [mailto:o...@buserror.net]
> > > > > Sent: Wednesday, August 8, 2018 11:26 AM
> > > > > To: Bharat Bhushan ;
> > > > > b...@kernel.crashing.org; pau...@samba.org; m...@ellerman.id.au;
> > > > > ga...@kernel.crashing.org; mark.rutl...@arm.com;
> > > > > kstew...@linuxfoundation.org; gre...@linuxfoundation.org;
> > > > > devicet...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org;
> > > > > linux- ker...@vger.kernel.org
> > > > > Cc: r...@kernel.org; keesc...@chromium.org;
> > > > > tyr...@linux.vnet.ibm.com; j...@perches.com
> > > > > Subject: Re: [RFC 5/5] powerpc/fsl: Add supported-irq-ranges for
> > > > > P2020
> > > > >
> > > > > On Wed, 2018-08-08 at 03:44 +, Bharat Bhushan wrote:
> > > > > > > -Original Message-
> > > > > > > From: Scott Wood [mailto:o...@buserror.net]
> > > > > > > Sent: Wednesday, August 8, 2018 2:44 AM
> > > > > > > To: Bharat Bhushan ;
> > > > > > > b...@kernel.crashing.org; pau...@samba.org;
> > > > > > > m...@ellerman.id.au; ga...@kernel.crashing.org;
> > > > > > > mark.rutl...@arm.com; kstew...@linuxfoundation.org;
> > > > > > > gre...@linuxfoundation.org; devicet...@vger.kernel.org;
> > > > > > > linuxppc-dev@lists.ozlabs.org;
> > > > > > > linux- ker...@vger.kernel.org
> > > > > > > Cc: r...@kernel.org; keesc...@chromium.org;
> > > > > > > tyr...@linux.vnet.ibm.com; j...@perches.com
> > > > > > > Subject: Re: [RFC 5/5] powerpc/fsl: Add supported-irq-ranges
> > > > > > > for
> > > > > > > P2020
> > > > > > >
> > > > > > > On Fri, 2018-07-27 at 15:18 +0530, Bharat Bhushan wrote:
> > > > > > > > MPIC on NXP (Freescale) P2020 supports following irq
> > > > > > > > ranges:
> > > > > > > >   > 0 - 11  (External interrupt)
> > > > > > > >   > 16 - 79 (Internal interrupt)
> > > > > > > >   > 176 - 183   (Messaging interrupt)
> > > > > > > >   > 224 - 231   (Shared message signaled interrupt)
> > > > > > >
> > > > > > > Why don't you convert to the 4-cell interrupt specifiers
> > > > > > > that make dealing with these ranges less error-prone?
> > > > > >
> > > > > > Ok , will do if we agree to have this series as per comment on
> > > > > > other patch.
> > > > >
> > > > > If you're concerned with errors, this would be a good things to
> > > > > do regardless.
> > > > >  Actually, it seems that p2020si-post.dtsi already uses 4-cell
> > > > > interrupts.
> > > > >
> > > > > What is motivating this patchset?  Is there something wrong in
> > > > > the exis

RE: [RFC 5/5] powerpc/fsl: Add supported-irq-ranges for P2020

2018-08-08 Thread Bharat Bhushan



> -Original Message-
> From: Scott Wood [mailto:o...@buserror.net]
> Sent: Wednesday, August 8, 2018 11:27 PM
> To: Bharat Bhushan ;
> b...@kernel.crashing.org; pau...@samba.org; m...@ellerman.id.au;
> ga...@kernel.crashing.org; mark.rutl...@arm.com;
> kstew...@linuxfoundation.org; gre...@linuxfoundation.org;
> devicet...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> ker...@vger.kernel.org
> Cc: r...@kernel.org; keesc...@chromium.org; tyr...@linux.vnet.ibm.com;
> j...@perches.com
> Subject: Re: [RFC 5/5] powerpc/fsl: Add supported-irq-ranges for P2020
> 
> On Wed, 2018-08-08 at 06:28 +, Bharat Bhushan wrote:
> > > -Original Message-
> > > From: Scott Wood [mailto:o...@buserror.net]
> > > Sent: Wednesday, August 8, 2018 11:26 AM
> > > To: Bharat Bhushan ;
> > > b...@kernel.crashing.org; pau...@samba.org; m...@ellerman.id.au;
> > > ga...@kernel.crashing.org; mark.rutl...@arm.com;
> > > kstew...@linuxfoundation.org; gre...@linuxfoundation.org;
> > > devicet...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> > > ker...@vger.kernel.org
> > > Cc: r...@kernel.org; keesc...@chromium.org;
> > > tyr...@linux.vnet.ibm.com; j...@perches.com
> > > Subject: Re: [RFC 5/5] powerpc/fsl: Add supported-irq-ranges for
> > > P2020
> > >
> > > On Wed, 2018-08-08 at 03:44 +, Bharat Bhushan wrote:
> > > > > -Original Message-
> > > > > From: Scott Wood [mailto:o...@buserror.net]
> > > > > Sent: Wednesday, August 8, 2018 2:44 AM
> > > > > To: Bharat Bhushan ;
> > > > > b...@kernel.crashing.org; pau...@samba.org; m...@ellerman.id.au;
> > > > > ga...@kernel.crashing.org; mark.rutl...@arm.com;
> > > > > kstew...@linuxfoundation.org; gre...@linuxfoundation.org;
> > > > > devicet...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org;
> > > > > linux- ker...@vger.kernel.org
> > > > > Cc: r...@kernel.org; keesc...@chromium.org;
> > > > > tyr...@linux.vnet.ibm.com; j...@perches.com
> > > > > Subject: Re: [RFC 5/5] powerpc/fsl: Add supported-irq-ranges for
> > > > > P2020
> > > > >
> > > > > On Fri, 2018-07-27 at 15:18 +0530, Bharat Bhushan wrote:
> > > > > > MPIC on NXP (Freescale) P2020 supports following irq
> > > > > > ranges:
> > > > > >   > 0 - 11  (External interrupt)
> > > > > >   > 16 - 79 (Internal interrupt)
> > > > > >   > 176 - 183   (Messaging interrupt)
> > > > > >   > 224 - 231   (Shared message signaled interrupt)
> > > > >
> > > > > Why don't you convert to the 4-cell interrupt specifiers that
> > > > > make dealing with these ranges less error-prone?
> > > >
> > > > Ok , will do if we agree to have this series as per comment on
> > > > other patch.
> > >
> > > If you're concerned with errors, this would be a good things to do
> > > regardless.
> > >  Actually, it seems that p2020si-post.dtsi already uses 4-cell interrupts.
> > >
> > > What is motivating this patchset?  Is there something wrong in the
> > > existing dts files?
> >
> > There is no error in device tree. Main motivation is to improve code
> > for following reasons:
> >   - While code study it was found that if a reserved irq-number used
> > then there are no check in driver. irq will be configured as correct
> > and interrupt will never fire.
> 
> Again, a wrong interrupt number won't fire, whether an interrupt by that
> number exists or not.  I wouldn't mind a sanity check in the driver if the
> programming model made it properly discoverable, but I don't think it's
> worth messing with device trees just for this (and even less so given that
> there don't seem to be new chips coming out that this would be relevant
> for).

Fair enough. We could use the MPIC version to define the supported interrupt
ranges. Would that be acceptable?

Thanks
-Bharat

> 
> > > > One other confusing observation I have is that "irq_count" from
> > > > platform code is given precedence over "last-interrupt-source" in
> > > > device-
> > >
> > > tree.
> > > > Should not device-tree should have precedence otherwise there is
> > > > no point using " last-interrupt-source" if platform code passes
> > > > "irq_count" in mpic_alloc().
> > >
> > > Maybe, though I don't think it matters much given that
> > > last-interrupt- source was only added to avoid having to pass
> > > irq_count in platform code.
> >
> > Thanks for clarifying;
> >
> > My understanding was that "last-interrupt-source" added to ensure that
> > we can over-ride value passed from platform code. In that case we do
> > not need to change code and can control from device tree.
> 
> The changelog says, "To avoid needing to write custom board-specific code
> to detect that scenario, allow it to be easily overridden in the device-tree,"
> where "it" means the value provided by hardware.  The goal was to pass in
> 256 without board code in the kernel, not to override the 256.
> 
> -Scott

RE: [RFC 5/5] powerpc/fsl: Add supported-irq-ranges for P2020

2018-08-08 Thread Bharat Bhushan



> -Original Message-
> From: Scott Wood [mailto:o...@buserror.net]
> Sent: Wednesday, August 8, 2018 11:26 AM
> To: Bharat Bhushan ;
> b...@kernel.crashing.org; pau...@samba.org; m...@ellerman.id.au;
> ga...@kernel.crashing.org; mark.rutl...@arm.com;
> kstew...@linuxfoundation.org; gre...@linuxfoundation.org;
> devicet...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> ker...@vger.kernel.org
> Cc: r...@kernel.org; keesc...@chromium.org; tyr...@linux.vnet.ibm.com;
> j...@perches.com
> Subject: Re: [RFC 5/5] powerpc/fsl: Add supported-irq-ranges for P2020
> 
> On Wed, 2018-08-08 at 03:44 +, Bharat Bhushan wrote:
> > > -Original Message-
> > > From: Scott Wood [mailto:o...@buserror.net]
> > > Sent: Wednesday, August 8, 2018 2:44 AM
> > > To: Bharat Bhushan ;
> > > b...@kernel.crashing.org; pau...@samba.org; m...@ellerman.id.au;
> > > ga...@kernel.crashing.org; mark.rutl...@arm.com;
> > > kstew...@linuxfoundation.org; gre...@linuxfoundation.org;
> > > devicet...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> > > ker...@vger.kernel.org
> > > Cc: r...@kernel.org; keesc...@chromium.org;
> > > tyr...@linux.vnet.ibm.com; j...@perches.com
> > > Subject: Re: [RFC 5/5] powerpc/fsl: Add supported-irq-ranges for
> > > P2020
> > >
> > > On Fri, 2018-07-27 at 15:18 +0530, Bharat Bhushan wrote:
> > > > MPIC on NXP (Freescale) P2020 supports following irq
> > > > ranges:
> > > >   > 0 - 11  (External interrupt)
> > > >   > 16 - 79 (Internal interrupt)
> > > >   > 176 - 183   (Messaging interrupt)
> > > >   > 224 - 231   (Shared message signaled interrupt)
> > >
> > > Why don't you convert to the 4-cell interrupt specifiers that make
> > > dealing with these ranges less error-prone?
> >
> > Ok , will do if we agree to have this series as per comment on other patch.
> 
> If you're concerned with errors, this would be a good things to do regardless.
>  Actually, it seems that p2020si-post.dtsi already uses 4-cell interrupts.
> 
> What is motivating this patchset?  Is there something wrong in the existing
> dts files?

There is no error in the device tree. The main motivation is to improve the
code, for the following reasons:
  - While studying the code I found that the driver has no check for a
reserved irq number. The irq is configured as if it were valid, and the
interrupt never fires.
  - Warnings were observed on a development platform (simulator) on
reads/writes to reserved MPIC regions during init.
  
> 
> 
> >
> > >
> > > > diff --git a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
> > > > b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
> > > > index 1006950..49ff348 100644
> > > > --- a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
> > > > +++ b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
> > > > @@ -57,6 +57,11 @@ void __init mpc85xx_rdb_pic_init(void)
> > > > MPIC_BIG_ENDIAN |
> > > > MPIC_SINGLE_DEST_CPU,
> > > > 0, 256, " OpenPIC  ");
> > > > +   } else if (of_machine_is_compatible("fsl,P2020RDB-PC")) {
> > > > +   mpic = mpic_alloc(NULL, 0,
> > > > + MPIC_BIG_ENDIAN |
> > > > + MPIC_SINGLE_DEST_CPU,
> > > > + 0, 0, " OpenPIC  ");
> > > > } else {
> > > > mpic = mpic_alloc(NULL, 0,
> > > >   MPIC_BIG_ENDIAN |
> > >
> > > I don't think we want to grow a list of every single revision of
> > > every board in these platform files.
> >
> > One other confusing observation I have is that "irq_count" from
> > platform code is given precedence over "last-interrupt-source" in device-
> tree.
> > Should not device-tree should have precedence otherwise there is no
> > point using " last-interrupt-source" if platform code passes
> > "irq_count" in mpic_alloc().
> 
> Maybe, though I don't think it matters much given that last-interrupt-source
> was only added to avoid having to pass irq_count in platform code.

Thanks for clarifying.

My understanding was that "last-interrupt-source" was added so that we can
override the value passed from platform code. In that case we would not need
to change code and could control this from the device tree.

Thanks
-Bharat


> 
> -Scott

RE: [RFC 3/5] powerpc/mpic: Add support for non-contiguous irq ranges

2018-08-07 Thread Bharat Bhushan



> -Original Message-
> From: Scott Wood [mailto:o...@buserror.net]
> Sent: Wednesday, August 8, 2018 11:21 AM
> To: Bharat Bhushan ; Rob Herring
> 
> Cc: b...@kernel.crashing.org; pau...@samba.org; m...@ellerman.id.au;
> ga...@kernel.crashing.org; mark.rutl...@arm.com;
> kstew...@linuxfoundation.org; gre...@linuxfoundation.org;
> devicet...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> ker...@vger.kernel.org; keesc...@chromium.org;
> tyr...@linux.vnet.ibm.com; j...@perches.com
> Subject: Re: [RFC 3/5] powerpc/mpic: Add support for non-contiguous irq
> ranges
> 
> On Wed, 2018-08-08 at 03:37 +, Bharat Bhushan wrote:
> > > -Original Message-
> > > From: Scott Wood [mailto:o...@buserror.net]
> > > Sent: Wednesday, August 8, 2018 2:34 AM
> > > To: Rob Herring ; Bharat Bhushan
> > > 
> > > Cc: b...@kernel.crashing.org; pau...@samba.org; m...@ellerman.id.au;
> > > ga...@kernel.crashing.org; mark.rutl...@arm.com;
> > > kstew...@linuxfoundation.org; gre...@linuxfoundation.org;
> > > devicet...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> > > ker...@vger.kernel.org; keesc...@chromium.org;
> > > tyr...@linux.vnet.ibm.com; j...@perches.com
> > > Subject: Re: [RFC 3/5] powerpc/mpic: Add support for non-contiguous
> > > irq ranges
> > >
> > > On Tue, 2018-08-07 at 12:09 -0600, Rob Herring wrote:
> > > > On Fri, Jul 27, 2018 at 03:17:59PM +0530, Bharat Bhushan wrote:
> > > > > Freescale MPIC h/w may not support all interrupt sources
> > > > > reported by hardware, "last-interrupt-source" or platform. On
> > > > > these platforms a misconfigured device tree that assigns one of
> > > > > the reserved interrupts leaves a non-functioning system without
> warning.
> > > >
> > > > There are lots of ways to misconfigure DTs. I don't think this is
> > > > special and needs a property.
> > >
> > > Yeah, the system will be just as non-functioning if you specify a
> > > valid-
> > > but-
> > > wrong-for-the-device interrupt number.
> >
> > Some is one additional benefits of this changes, MPIC have reserved
> > regions for un-supported interrupts and read/writes to these reserved
> > regions seams have no effect.
> > MPIC driver reads/writes to the reserved regions during init/uninit
> > and save/restore state.
> >
> > Let me know if it make sense to have these changes for mentioned
> reasons.
> 
> The driver has been doing this forever with no ill effect.

Yes, no issues have been reported.

>  What is the  motivation for this change?

On a simulation model I see warnings when the reserved regions are accessed,
so this patch is just an effort to improve the driver.

Thanks
-Bharat

> 
> -Scott

RE: [RFC 5/5] powerpc/fsl: Add supported-irq-ranges for P2020

2018-08-07 Thread Bharat Bhushan



> -Original Message-
> From: Scott Wood [mailto:o...@buserror.net]
> Sent: Wednesday, August 8, 2018 2:44 AM
> To: Bharat Bhushan ;
> b...@kernel.crashing.org; pau...@samba.org; m...@ellerman.id.au;
> ga...@kernel.crashing.org; mark.rutl...@arm.com;
> kstew...@linuxfoundation.org; gre...@linuxfoundation.org;
> devicet...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> ker...@vger.kernel.org
> Cc: r...@kernel.org; keesc...@chromium.org; tyr...@linux.vnet.ibm.com;
> j...@perches.com
> Subject: Re: [RFC 5/5] powerpc/fsl: Add supported-irq-ranges for P2020
> 
> On Fri, 2018-07-27 at 15:18 +0530, Bharat Bhushan wrote:
> > MPIC on NXP (Freescale) P2020 supports following irq
> > ranges:
> >   > 0 - 11  (External interrupt)
> >   > 16 - 79 (Internal interrupt)
> >   > 176 - 183   (Messaging interrupt)
> >   > 224 - 231   (Shared message signaled interrupt)
> 
> Why don't you convert to the 4-cell interrupt specifiers that make dealing
> with these ranges less error-prone?

OK, will do if we agree to go ahead with this series, as per the comments on
the other patch.

> 
> > diff --git a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
> > b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
> > index 1006950..49ff348 100644
> > --- a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
> > +++ b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
> > @@ -57,6 +57,11 @@ void __init mpc85xx_rdb_pic_init(void)
> > MPIC_BIG_ENDIAN |
> > MPIC_SINGLE_DEST_CPU,
> > 0, 256, " OpenPIC  ");
> > +   } else if (of_machine_is_compatible("fsl,P2020RDB-PC")) {
> > +   mpic = mpic_alloc(NULL, 0,
> > + MPIC_BIG_ENDIAN |
> > + MPIC_SINGLE_DEST_CPU,
> > + 0, 0, " OpenPIC  ");
> > } else {
> > mpic = mpic_alloc(NULL, 0,
> >   MPIC_BIG_ENDIAN |
> 
> I don't think we want to grow a list of every single revision of every board 
> in
> these platform files.

One other observation I find confusing is that "irq_count" from platform code
is given precedence over "last-interrupt-source" in the device tree.
Shouldn't the device tree have precedence? Otherwise there is no point in using
"last-interrupt-source" if platform code passes "irq_count" to mpic_alloc().

Thanks
-Bharat

> 
> -Scott

RE: [RFC 3/5] powerpc/mpic: Add support for non-contiguous irq ranges

2018-08-07 Thread Bharat Bhushan



> -Original Message-
> From: Scott Wood [mailto:o...@buserror.net]
> Sent: Wednesday, August 8, 2018 2:34 AM
> To: Rob Herring ; Bharat Bhushan
> 
> Cc: b...@kernel.crashing.org; pau...@samba.org; m...@ellerman.id.au;
> ga...@kernel.crashing.org; mark.rutl...@arm.com;
> kstew...@linuxfoundation.org; gre...@linuxfoundation.org;
> devicet...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> ker...@vger.kernel.org; keesc...@chromium.org;
> tyr...@linux.vnet.ibm.com; j...@perches.com
> Subject: Re: [RFC 3/5] powerpc/mpic: Add support for non-contiguous irq
> ranges
> 
> On Tue, 2018-08-07 at 12:09 -0600, Rob Herring wrote:
> > On Fri, Jul 27, 2018 at 03:17:59PM +0530, Bharat Bhushan wrote:
> > > Freescale MPIC h/w may not support all interrupt sources reported by
> > > hardware, "last-interrupt-source" or platform. On these platforms a
> > > misconfigured device tree that assigns one of the reserved
> > > interrupts leaves a non-functioning system without warning.
> >
> > There are lots of ways to misconfigure DTs. I don't think this is
> > special and needs a property.
> 
> Yeah, the system will be just as non-functioning if you specify a valid-but-
> wrong-for-the-device interrupt number.

One additional benefit of this change: the MPIC has reserved regions for
unsupported interrupts, and reads/writes to these reserved regions seem to
have no effect.
The MPIC driver reads/writes the reserved regions during init/uninit and
save/restore of state.

Let me know if it makes sense to have these changes for the reasons mentioned.

Thanks
-Bharat

> 
> >  We've had some interrupt mask or valid properties in the past, but
> > generally don't accept those.
> 
> FWIW, some of them like protected-sources and mpic-msgr-receive-mask
> aren't for detecting errors, but are for partitioning (though the former is
> obsolete with pic-no-reset).
> 
> -Scott

[RFC 5/5] powerpc/fsl: Add supported-irq-ranges for P2020

2018-07-27 Thread Bharat Bhushan

MPIC on NXP (Freescale) P2020 supports following irq
ranges:
  > 0 - 11  (External interrupt)
  > 16 - 79 (Internal interrupt)
  > 176 - 183   (Messaging interrupt)
  > 224 - 231   (Shared message signaled interrupt)

We have to stop passing "irq_count" from platform code, because the
platform value is given precedence over the device tree, while I think
the device tree should have precedence.

Signed-off-by: Bharat Bhushan 
---
 arch/powerpc/boot/dts/fsl/p2020si-post.dtsi | 3 +++
 arch/powerpc/platforms/85xx/mpc85xx_rdb.c   | 5 +
 2 files changed, 8 insertions(+)

diff --git a/arch/powerpc/boot/dts/fsl/p2020si-post.dtsi 
b/arch/powerpc/boot/dts/fsl/p2020si-post.dtsi
index 884e01b..08e266b 100644
--- a/arch/powerpc/boot/dts/fsl/p2020si-post.dtsi
+++ b/arch/powerpc/boot/dts/fsl/p2020si-post.dtsi
@@ -192,6 +192,9 @@
 /include/ "pq3-sec3.1-0.dtsi"
 /include/ "pq3-mpic.dtsi"
 /include/ "pq3-mpic-timer-B.dtsi"
+   pic@4 {
+   supported-irq-ranges = <0 11 16 79 176 183 224 231>;
+   };
 
global-utilities@e {
compatible = "fsl,p2020-guts";
diff --git a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c 
b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
index 1006950..49ff348 100644
--- a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
+++ b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
@@ -57,6 +57,11 @@ void __init mpc85xx_rdb_pic_init(void)
MPIC_BIG_ENDIAN |
MPIC_SINGLE_DEST_CPU,
0, 256, " OpenPIC  ");
+   } else if (of_machine_is_compatible("fsl,P2020RDB-PC")) {
+   mpic = mpic_alloc(NULL, 0,
+ MPIC_BIG_ENDIAN |
+ MPIC_SINGLE_DEST_CPU,
+ 0, 0, " OpenPIC  ");
} else {
mpic = mpic_alloc(NULL, 0,
  MPIC_BIG_ENDIAN |
-- 
1.9.3
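
The "supported-irq-ranges" value above is a flat list of cells read as
(start, end) pairs. A minimal userspace sketch of that pairing follows. It
is not the driver's parsing code: parse_irq_ranges() and struct dt_irq_range
are hypothetical names, and the rejection of odd cell counts and of
start > end is an assumption about what a sanity check could look like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pair type; the series uses struct mpic_irq_range. */
struct dt_irq_range {
	uint32_t start_irq;
	uint32_t end_irq;
};

/*
 * Split a flat cell array into (start, end) pairs, both inclusive.
 * Returns the number of ranges written, or -1 if the array has an odd
 * number of cells, holds more pairs than 'out' can take, or contains
 * a pair with start > end.
 */
static int parse_irq_ranges(const uint32_t *cells, size_t ncells,
			    struct dt_irq_range *out, size_t max_ranges)
{
	assert(cells != NULL && out != NULL);

	if (ncells % 2 != 0 || ncells / 2 > max_ranges)
		return -1;

	for (size_t i = 0; i < ncells / 2; i++) {
		uint32_t start = cells[2 * i];
		uint32_t end = cells[2 * i + 1];

		if (start > end)
			return -1;

		out[i].start_irq = start;
		out[i].end_irq = end;
	}

	return (int)(ncells / 2);
}
```

Fed the eight cells from the P2020 property, this yields the four ranges
listed in the commit message.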

[RFC 4/5] powerpc/mpic: Boot print supported interrupt ranges

2018-07-27 Thread Bharat Bhushan

As the MPIC can have non-contiguous interrupt source ranges,
print them during boot.

Signed-off-by: Bharat Bhushan 
---
 arch/powerpc/sysdev/mpic.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/sysdev/mpic.c b/arch/powerpc/sysdev/mpic.c
index cbf3a51..8df248f 100644
--- a/arch/powerpc/sysdev/mpic.c
+++ b/arch/powerpc/sysdev/mpic.c
@@ -155,6 +155,21 @@ struct bus_type mpic_subsys = {
 
 #endif /* CONFIG_MPIC_WEIRD */
 
+static void mpic_show_irq_ranges(struct mpic *mpic)
+{
+   int i;
+
+   pr_info("mpic: Initializing for %d sources\n", mpic->num_sources);
+
+   if (mpic->num_ranges) {
+   pr_info(" Supported source of interrupt ranges\n");
+   for (i = 0; i < mpic->num_ranges; i++)
+   pr_info("  > %d - %d\n", mpic->irq_ranges[i].start_irq,
+   mpic->irq_ranges[i].end_irq);
+
+   }
+}
+
 static int mpic_irq_source_invalid(struct mpic *mpic, unsigned int irq)
 {
int i;
@@ -1646,8 +1661,7 @@ void __init mpic_init(struct mpic *mpic)
int num_timers = 4;
 
BUG_ON(mpic->num_sources == 0);
-
-   printk(KERN_INFO "mpic: Initializing for %d sources\n", 
mpic->num_sources);
+   mpic_show_irq_ranges(mpic);
 
/* Set current processor priority to max */
mpic_cpu_write(MPIC_INFO(CPU_CURRENT_TASK_PRI), 0xf);
-- 
1.9.3

[RFC 3/5] powerpc/mpic: Add support for non-contiguous irq ranges

2018-07-27 Thread Bharat Bhushan

Freescale MPIC h/w may not support all interrupt sources reported
by hardware, "last-interrupt-source" or platform. On these platforms
a misconfigured device tree that assigns one of the reserved
interrupts leaves a non-functioning system without warning.

This patch adds a "supported-irq-ranges" property to the device tree to
provide the ranges of supported interrupt sources. If a reserved
interrupt is used, the driver no longer programs the h/w, as it does
currently, and throws a warning instead.

Signed-off-by: Bharat Bhushan 
---
 .../devicetree/bindings/powerpc/fsl/mpic.txt   |   8 ++
 arch/powerpc/include/asm/mpic.h|   9 ++
 arch/powerpc/sysdev/mpic.c | 113 +++--
 3 files changed, 121 insertions(+), 9 deletions(-)

diff --git a/Documentation/devicetree/bindings/powerpc/fsl/mpic.txt 
b/Documentation/devicetree/bindings/powerpc/fsl/mpic.txt
index dc57446..bd6da54 100644
--- a/Documentation/devicetree/bindings/powerpc/fsl/mpic.txt
+++ b/Documentation/devicetree/bindings/powerpc/fsl/mpic.txt
@@ -77,6 +77,14 @@ PROPERTIES
   in the global feature registers.  If specified, this field will
   override the value read from MPIC_GREG_FEATURE_LAST_SRC.
 
+ - supported-irq-ranges
+  Usage: optional
+  Value type: 
+  Definition: This encodes an arbitrary number of start-irq and end-irq
+  pairs, both inclusive. Interrupt sources supported by an MPIC
+  may not be contiguous; in that case this property is used
+  to pass the supported interrupt source ranges.
+
 INTERRUPT SPECIFIER DEFINITION
 
   Interrupt specifiers consists of 4 cells encoded as
diff --git a/arch/powerpc/include/asm/mpic.h b/arch/powerpc/include/asm/mpic.h
index fad8ddd..4080c98 100644
--- a/arch/powerpc/include/asm/mpic.h
+++ b/arch/powerpc/include/asm/mpic.h
@@ -252,6 +252,11 @@ struct mpic_irq_save {
 #endif
 };
 
+struct mpic_irq_range {
+   u32 start_irq;
+   u32 end_irq;
+};
+
 /* The instance data of a given MPIC */
 struct mpic
 {
@@ -281,6 +286,10 @@ struct mpic
/* Number of sources */
unsigned intnum_sources;
 
+   /* Supported source ranges */
+   unsigned int num_ranges;
+   struct mpic_irq_range   *irq_ranges;
+
/* vector numbers used for internal sources (ipi/timers) */
unsigned intipi_vecs[4];
unsigned inttimer_vecs[8];
diff --git a/arch/powerpc/sysdev/mpic.c b/arch/powerpc/sysdev/mpic.c
index d503887..cbf3a51 100644
--- a/arch/powerpc/sysdev/mpic.c
+++ b/arch/powerpc/sysdev/mpic.c
@@ -155,6 +155,23 @@ struct bus_type mpic_subsys = {
 
 #endif /* CONFIG_MPIC_WEIRD */
 
+static int mpic_irq_source_invalid(struct mpic *mpic, unsigned int irq)
+{
+   int i;
+
+   for (i = 0; i < mpic->num_ranges; i++) {
+   if ((irq >= mpic->irq_ranges[i].start_irq) &&
+   (irq <= mpic->irq_ranges[i].end_irq))
+   return 0;
+   }
+
+   /* if not supported irq-ranges then check for num_sources */
+   if (!mpic->num_ranges && irq < mpic->num_sources)
+   return 0;
+
+   return -EINVAL;
+}
+
 static inline unsigned int mpic_processor_id(struct mpic *mpic)
 {
unsigned int cpu = 0;
@@ -873,8 +890,10 @@ int mpic_set_irq_type(struct irq_data *d, unsigned int 
flow_type)
DBG("mpic: set_irq_type(mpic:@%p,virq:%d,src:0x%x,type:0x%x)\n",
mpic, d->irq, src, flow_type);
 
-   if (src >= mpic->num_sources)
+   if (mpic_irq_source_invalid(mpic, src)) {
+   WARN(1, "mpic: Reserved IRQ source %d\n", src);
return -EINVAL;
+   }
 
vold = mpic_irq_read(src, MPIC_INFO(IRQ_VECTOR_PRI));
 
@@ -933,8 +952,10 @@ void mpic_set_vector(unsigned int virq, unsigned int 
vector)
DBG("mpic: set_vector(mpic:@%p,virq:%d,src:%d,vector:0x%x)\n",
mpic, virq, src, vector);
 
-   if (src >= mpic->num_sources)
+   if (mpic_irq_source_invalid(mpic, src)) {
+   WARN(1, "mpic: Reserved IRQ source %d\n", src);
return;
+   }
 
vecpri = mpic_irq_read(src, MPIC_INFO(IRQ_VECTOR_PRI));
vecpri = vecpri & ~MPIC_INFO(VECPRI_VECTOR_MASK);
@@ -950,8 +971,10 @@ static void mpic_set_destination(unsigned int virq, 
unsigned int cpuid)
DBG("mpic: set_destination(mpic:@%p,virq:%d,src:%d,cpuid:0x%x)\n",
mpic, virq, src, cpuid);
 
-   if (src >= mpic->num_sources)
+   if (mpic_irq_source_invalid(mpic, src)) {
+   WARN(1, "mpic: Reserved IRQ source %d\n", src);
return;
+   }
 
mpic_irq_write(src, MPIC_INFO(IRQ_DESTINATION), 1 << cpuid);
 }
@@ -1038,7 +1061,7 @@ static int mpic_host_map(struct irq_domain *h, unsigned 
int virq,
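
The range check added by this patch can be tried outside the kernel with a
small model. This is a sketch only: struct model_mpic, struct
model_irq_range and model_irq_source_invalid() are hypothetical userspace
stand-ins that mirror the logic of mpic_irq_source_invalid() above, and -1
is returned where the patch returns -EINVAL. The table uses the P2020
ranges from patch 5/5.

```c
#include <assert.h>
#include <stddef.h>

struct model_irq_range {
	unsigned int start_irq;
	unsigned int end_irq;
};

/* Only the fields the check needs; not the kernel's struct mpic. */
struct model_mpic {
	unsigned int num_sources;
	unsigned int num_ranges;
	const struct model_irq_range *irq_ranges;
};

/* Returns 0 if irq is a supported source, -1 otherwise. */
static int model_irq_source_invalid(const struct model_mpic *mpic,
				    unsigned int irq)
{
	assert(mpic != NULL);

	for (unsigned int i = 0; i < mpic->num_ranges; i++) {
		if (irq >= mpic->irq_ranges[i].start_irq &&
		    irq <= mpic->irq_ranges[i].end_irq)
			return 0;
	}

	/* No ranges given: fall back to the num_sources bound. */
	if (!mpic->num_ranges && irq < mpic->num_sources)
		return 0;

	return -1;
}

/* Ranges listed for P2020 in patch 5/5 of this series. */
static const struct model_irq_range p2020_ranges[] = {
	{ 0, 11 }, { 16, 79 }, { 176, 183 }, { 224, 231 },
};
```

Note the fallback: without any ranges the check behaves like the old
"src >= mpic->num_sources" test, so source 12 is accepted there but
rejected once the P2020 ranges are supplied.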

[RFC 2/5] powerpc/mpic: Rework last source irq calculation logic

2018-07-27 Thread Bharat Bhushan

The last-irq calculation logic uses the following priority order:
  1) irq_count from platform
  2) "last-interrupt-source" from device tree
  3) isu_size from platform
  4) MPIC h/w GREG_FEATURE_0 register

This patch reworks the last-irq calculation logic; functionality
and priority order are the same as before.

Signed-off-by: Bharat Bhushan 
---
 arch/powerpc/sysdev/mpic.c | 31 +++
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/sysdev/mpic.c b/arch/powerpc/sysdev/mpic.c
index b6803bc..d503887 100644
--- a/arch/powerpc/sysdev/mpic.c
+++ b/arch/powerpc/sysdev/mpic.c
@@ -1217,25 +1217,32 @@ static int mpic_get_last_irq_source(struct mpic *mpic,
u32 last_irq;
u32 greg_feature;
 
+   /* Current priority order for getting last irq:
+*  1) irq_count from platform
+*  2) "last-interrupt-source" from device tree
+*  3) isu_size from platform
+*  4) MPIC h/w GREG_FEATURE_0 register
+*/
+
+   if (irq_count)
+   return (irq_count - 1);
+
+   if (!of_property_read_u32(mpic->node, "last-interrupt-source",
+ &last_irq)) {
+   return last_irq;
+   }
+
+   if (isu_size)
+   return (isu_size  * MPIC_MAX_ISU - 1);
+
/*
-* Read feature register.  For non-ISU MPICs, num sources as well. On
+* Read feature register. For non-ISU MPICs, num sources as well. On
 * ISU MPICs, sources are counted as ISUs are added
 */
greg_feature = mpic_read(mpic->gregs, MPIC_INFO(GREG_FEATURE_0));
 
-   /*
-* By default, the last source number comes from the MPIC, but the
-* device-tree and board support code can override it on buggy hw.
-* If we get passed an isu_size (multi-isu MPIC) then we use that
-* as a default instead of the value read from the HW.
-*/
last_irq = (greg_feature & MPIC_GREG_FEATURE_LAST_SRC_MASK)
>> MPIC_GREG_FEATURE_LAST_SRC_SHIFT;
-   if (isu_size)
-   last_irq = isu_size  * MPIC_MAX_ISU - 1;
-   of_property_read_u32(mpic->node, "last-interrupt-source", &last_irq);
-   if (irq_count)
-   last_irq = irq_count - 1;
 
return last_irq;
 }
-- 
1.9.3

[RFC 1/5] powerpc/mpic: move last irq logic to function

2018-07-27 Thread Bharat Bhushan

This patch just moves the last-irq calculation
logic into a function, with no change in logic.

Signed-off-by: Bharat Bhushan 
---
 arch/powerpc/sysdev/mpic.c | 52 +-
 1 file changed, 33 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/sysdev/mpic.c b/arch/powerpc/sysdev/mpic.c
index 353b439..b6803bc 100644
--- a/arch/powerpc/sysdev/mpic.c
+++ b/arch/powerpc/sysdev/mpic.c
@@ -1210,6 +1210,36 @@ u32 fsl_mpic_primary_get_version(void)
return 0;
 }
 
+static int mpic_get_last_irq_source(struct mpic *mpic,
+   unsigned int irq_count,
+   unsigned int isu_size)
+{
+   u32 last_irq;
+   u32 greg_feature;
+
+   /*
+* Read feature register.  For non-ISU MPICs, num sources as well. On
+* ISU MPICs, sources are counted as ISUs are added
+*/
+   greg_feature = mpic_read(mpic->gregs, MPIC_INFO(GREG_FEATURE_0));
+
+   /*
+* By default, the last source number comes from the MPIC, but the
+* device-tree and board support code can override it on buggy hw.
+* If we get passed an isu_size (multi-isu MPIC) then we use that
+* as a default instead of the value read from the HW.
+*/
+   last_irq = (greg_feature & MPIC_GREG_FEATURE_LAST_SRC_MASK)
+   >> MPIC_GREG_FEATURE_LAST_SRC_SHIFT;
+   if (isu_size)
+   last_irq = isu_size  * MPIC_MAX_ISU - 1;
+   of_property_read_u32(mpic->node, "last-interrupt-source", &last_irq);
+   if (irq_count)
+   last_irq = irq_count - 1;
+
+   return last_irq;
+}
+
 struct mpic * __init mpic_alloc(struct device_node *node,
phys_addr_t phys_addr,
unsigned int flags,
@@ -1451,25 +1481,7 @@ struct mpic * __init mpic_alloc(struct device_node *node,
 0x1000);
}
 
-   /*
-* Read feature register.  For non-ISU MPICs, num sources as well. On
-* ISU MPICs, sources are counted as ISUs are added
-*/
-   greg_feature = mpic_read(mpic->gregs, MPIC_INFO(GREG_FEATURE_0));
-
-   /*
-* By default, the last source number comes from the MPIC, but the
-* device-tree and board support code can override it on buggy hw.
-* If we get passed an isu_size (multi-isu MPIC) then we use that
-* as a default instead of the value read from the HW.
-*/
-   last_irq = (greg_feature & MPIC_GREG_FEATURE_LAST_SRC_MASK)
-   >> MPIC_GREG_FEATURE_LAST_SRC_SHIFT;
-   if (isu_size)
-   last_irq = isu_size  * MPIC_MAX_ISU - 1;
-   of_property_read_u32(mpic->node, "last-interrupt-source", &last_irq);
-   if (irq_count)
-   last_irq = irq_count - 1;
+   last_irq = mpic_get_last_irq_source(mpic, irq_count, isu_size);
 
/* Initialize main ISU if none provided */
if (!isu_size) {
@@ -1495,6 +1507,8 @@ struct mpic * __init mpic_alloc(struct device_node *node,
if (mpic->irqhost == NULL)
return NULL;
 
+   greg_feature = mpic_read(mpic->gregs, MPIC_INFO(GREG_FEATURE_0));
+
/* Display version */
switch (greg_feature & MPIC_GREG_FEATURE_VERSION_MASK) {
case 1:
-- 
1.9.3

[RFC 0/5] powerpc/mpic: Add non-contiguous interrupt sources

2018-07-27 Thread Bharat Bhushan

Freescale MPIC h/w may not support all interrupt sources reported
by hardware, by "last-interrupt-source" or by the platform. On these
platforms a misconfigured device tree that assigns one of the reserved
interrupts leaves a non-functioning system without warning.

The first patch just moves the last-irq calculation logic to a function.
The second patch reworks the same logic. I feel that the device tree
should get precedence over the platform-provided last-irq, but in this
series I have not changed this logic.
The third and fourth patches add support for non-contiguous interrupt
sources.
The fifth patch enables this for P2020RDB-PC for now.
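
To illustrate what "non-contiguous interrupt sources" means, below is a
rough sketch of a range check. It assumes the supported sources can be
described as (start, count) pairs. That layout, the struct and the
function name are assumptions made for illustration only; they are not
taken from the binding added later in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical representation: one supported block of interrupt sources. */
struct irq_range {
	uint32_t start; /* first supported source number */
	uint32_t count; /* number of consecutive sources */
};

/* Return true if irq falls inside any of the n supported ranges. */
static bool irq_is_supported(const struct irq_range *ranges, size_t n,
			     uint32_t irq)
{
	size_t i;

	assert(ranges != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (irq >= ranges[i].start &&
		    irq - ranges[i].start < ranges[i].count)
			return true;
	}
	return false;
}
```

With such a check, a device tree that assigns a reserved source could be
rejected with a warning instead of failing silently.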

Bharat Bhushan (5):
  powerpc/mpic: move last irq logic to function
  powerpc/mpic: Rework last source irq calculation logic
  powerpc/mpic: Add support for non-contiguous irq ranges
  powerpc/mpic: Boot print supported interrupt ranges
  powerpc/fsl: Add supported-irq-ranges for P2020

 .../devicetree/bindings/powerpc/fsl/mpic.txt   |   8 +
 arch/powerpc/boot/dts/fsl/p2020si-post.dtsi|   3 +
 arch/powerpc/include/asm/mpic.h|   9 +
 arch/powerpc/platforms/85xx/mpc85xx_rdb.c  |   5 +
 arch/powerpc/sysdev/mpic.c | 184 ++---
 5 files changed, 182 insertions(+), 27 deletions(-)

-- 
1.9.3

[PATCH] powerpc/e200: Skip tlb1 entries used for kernel mapping

2018-07-24 Thread Bharat Bhushan

E200 has only TLB1; it does not have TLB0.
So TLB1 entries are used for mapping both kernel and user space.
The TLB miss handler for E200 does not skip the TLB entries
used for kernel mapping. This patch ensures that we skip the
tlb1 entries used for kernel mapping (those below tlbcam_index).
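
The intended round-robin behaviour can be modelled in C roughly as
below. This is a simplified sketch, not a translation of the assembly:
the names are hypothetical, and it wraps the next-victim index straight
back to the first free entry, whereas the handler in the patch redirects
a too-small NV when the miss is taken.

```c
#include <assert.h>

/* Result of one replacement decision; names are hypothetical. */
struct tlb1_pick {
	unsigned int victim;  /* entry to overwrite on this miss */
	unsigned int next_nv; /* next-victim hint for the following miss */
};

/*
 * nv:         current next-victim hint (MAS0[NV] in the real handler)
 * first_free: first entry not pinned for kernel mapping (tlbcam_index)
 * nentry:     number of TLB1 entries (TLB1CFG[NENTRY])
 */
static struct tlb1_pick pick_tlb1_victim(unsigned int nv,
					 unsigned int first_free,
					 unsigned int nentry)
{
	struct tlb1_pick p;

	assert(first_free < nentry);
	assert(nv < nentry);

	/* Never evict an entry that maps the kernel. */
	if (nv < first_free)
		nv = first_free;

	p.victim = nv;
	p.next_nv = nv + 1;
	if (p.next_nv == nentry)
		p.next_nv = first_free; /* wrap past the pinned entries */
	return p;
}
```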

Signed-off-by: Bharat Bhushan 
---
 arch/powerpc/kernel/head_fsl_booke.S | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/head_fsl_booke.S 
b/arch/powerpc/kernel/head_fsl_booke.S
index bf4c602..951fb96 100644
--- a/arch/powerpc/kernel/head_fsl_booke.S
+++ b/arch/powerpc/kernel/head_fsl_booke.S
@@ -801,12 +801,28 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_BIG_PHYS)
/* Round robin TLB1 entries assignment */
mfspr   r12, SPRN_MAS0
 
+   /* Get first free tlbcam entry */
+   lis r11, tlbcam_index@ha
+   lwz r11, tlbcam_index@l(r11)
+
+   /* Extract MAS0(NV) */
+   andi.   r13, r12, 0xfff
+   cmpw0, r13, r11
+   blt 0, 5f
+   b   6f
+5:
+   /* When NV is less than first free tlbcam entry, use first free
+* tlbcam entry for ESEL and set NV */
+   rlwimi  r12, r11, 16, 4, 15
+   addir11, r11, 1
+   rlwimi  r12, r11, 0, 20, 31
+   b   7f
+6:
/* Extract TLB1CFG(NENTRY) */
mfspr   r11, SPRN_TLB1CFG
andi.   r11, r11, 0xfff
 
-   /* Extract MAS0(NV) */
-   andi.   r13, r12, 0xfff
+   /* Set MAS0(NV) for next TLB miss exception */
addir13, r13, 1
cmpw0, r13, r11
addir12, r12, 1
-- 
1.9.3

RE: [PATCH] powerpc/mpic: Cleanup irq vector accounting

2018-07-05 Thread Bharat Bhushan




> -Original Message-
> From: Michael Ellerman [mailto:m...@ellerman.id.au]
> Sent: Wednesday, July 4, 2018 6:57 PM
> To: Bharat Bhushan ;
> b...@kernel.crashing.org; pau...@samba.org; r...@kernel.org;
> ge...@infradead.org; tyr...@linux.vnet.ibm.com; linuxppc-
> d...@lists.ozlabs.org; linux-ker...@vger.kernel.org
> Cc: Bharat Bhushan 
> Subject: Re: [PATCH] powerpc/mpic: Cleanup irq vector accounting
> 
> Bharat Bhushan  writes:
> 
> > Available vector space accounts ipis and timer interrupts while
> > spurious vector was not accounted.
> 
> OK. What is the symptom of that? Nothing? Total system crash?
> 
> Looks like this can be tagged:
> 
> Fixes: 0a4081641d72 ("powerpc/mpic: FSL MPIC error interrupt support.")
> 
> Which added the code that uses "12".
> 
> > Also later
> > mpic_setup_error_int() escape one more vector, seemingly it assumes
> > one spurious vector.
> 
> Ah right, I get it now.
> 
> So there is no bug. It's just a disagreement about whether the "intvec"
> argument to mpic_setup_error_int() indicates the first number that's free to
> use or the last number that has been allocated.
> 
> Right?

Yes, this is not a bug fix. It is a minor cleanup: mpic_setup_error_int() is
now passed the "intvec to be used" rather than the "last intvec used".

Thanks
-Bharat

> 
> cheers
> 
> > Signed-off-by: Bharat Bhushan 
> > ---
> >  arch/powerpc/sysdev/fsl_mpic_err.c | 2 +-
> >  arch/powerpc/sysdev/mpic.c | 6 +++---
> >  2 files changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/powerpc/sysdev/fsl_mpic_err.c
> > b/arch/powerpc/sysdev/fsl_mpic_err.c
> > index 488ec45..2a98837 100644
> > --- a/arch/powerpc/sysdev/fsl_mpic_err.c
> > +++ b/arch/powerpc/sysdev/fsl_mpic_err.c
> > @@ -76,7 +76,7 @@ int mpic_setup_error_int(struct mpic *mpic, int
> intvec)
> > mpic->flags |= MPIC_FSL_HAS_EIMR;
> > /* allocate interrupt vectors for error interrupts */
> > for (i = MPIC_MAX_ERR - 1; i >= 0; i--)
> > -   mpic->err_int_vecs[i] = --intvec;
> > +   mpic->err_int_vecs[i] = intvec--;
> >
> > return 0;
> >  }
> > diff --git a/arch/powerpc/sysdev/mpic.c b/arch/powerpc/sysdev/mpic.c
> > index 1d4e0ef6..e098d1e 100644
> > --- a/arch/powerpc/sysdev/mpic.c
> > +++ b/arch/powerpc/sysdev/mpic.c
> > @@ -1380,12 +1380,12 @@ struct mpic * __init mpic_alloc(struct
> device_node *node,
> >  * global vector number space, as in case of ipis
> >  * and timer interrupts.
> >  *
> > -* Available vector space = intvec_top - 12, where 12
> > +* Available vector space = intvec_top - 13, where 13
> >  * is the number of vectors which have been consumed by
> > -* ipis and timer interrupts.
> > +* ipis, timer interrupts and spurious.
> >  */
> > if (fsl_version >= 0x401) {
> > -   ret = mpic_setup_error_int(mpic, intvec_top - 12);
> > +   ret = mpic_setup_error_int(mpic, intvec_top - 13);
> > if (ret)
> > return NULL;
> > }
> > --
> > 1.9.3

[PATCH] powerpc/mpic: Cleanup irq vector accounting

2018-06-29 Thread Bharat Bhushan

The available vector space accounts for ipis and timer interrupts,
but the spurious vector was not accounted for. Also, later on,
mpic_setup_error_int() skips one more vector; seemingly it
assumes one spurious vector.
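
That the two conventions hand out the same vectors can be checked with
a small standalone sketch. SKETCH_MAX_ERR is a stand-in for
MPIC_MAX_ERR, kept small for illustration, and the function names are
hypothetical.

```c
#include <assert.h>

#define SKETCH_MAX_ERR 4 /* stand-in for MPIC_MAX_ERR; small on purpose */

/* Old convention: intvec is the last vector already in use. */
static void assign_pre_decrement(int vecs[SKETCH_MAX_ERR], int intvec)
{
	int i;

	assert(intvec >= SKETCH_MAX_ERR);
	for (i = SKETCH_MAX_ERR - 1; i >= 0; i--)
		vecs[i] = --intvec;
}

/* New convention: intvec is the first vector that is free to use. */
static void assign_post_decrement(int vecs[SKETCH_MAX_ERR], int intvec)
{
	int i;

	assert(intvec >= SKETCH_MAX_ERR - 1);
	for (i = SKETCH_MAX_ERR - 1; i >= 0; i--)
		vecs[i] = intvec--;
}
```

Calling the first with intvec_top - 12 and the second with
intvec_top - 13 fills both arrays with identical values, so only the
meaning of the argument changes.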

Signed-off-by: Bharat Bhushan 
---
 arch/powerpc/sysdev/fsl_mpic_err.c | 2 +-
 arch/powerpc/sysdev/mpic.c | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/sysdev/fsl_mpic_err.c 
b/arch/powerpc/sysdev/fsl_mpic_err.c
index 488ec45..2a98837 100644
--- a/arch/powerpc/sysdev/fsl_mpic_err.c
+++ b/arch/powerpc/sysdev/fsl_mpic_err.c
@@ -76,7 +76,7 @@ int mpic_setup_error_int(struct mpic *mpic, int intvec)
mpic->flags |= MPIC_FSL_HAS_EIMR;
/* allocate interrupt vectors for error interrupts */
for (i = MPIC_MAX_ERR - 1; i >= 0; i--)
-   mpic->err_int_vecs[i] = --intvec;
+   mpic->err_int_vecs[i] = intvec--;
 
return 0;
 }
diff --git a/arch/powerpc/sysdev/mpic.c b/arch/powerpc/sysdev/mpic.c
index 1d4e0ef6..e098d1e 100644
--- a/arch/powerpc/sysdev/mpic.c
+++ b/arch/powerpc/sysdev/mpic.c
@@ -1380,12 +1380,12 @@ struct mpic * __init mpic_alloc(struct device_node 
*node,
 * global vector number space, as in case of ipis
 * and timer interrupts.
 *
-* Available vector space = intvec_top - 12, where 12
+* Available vector space = intvec_top - 13, where 13
 * is the number of vectors which have been consumed by
-* ipis and timer interrupts.
+* ipis, timer interrupts and spurious.
 */
if (fsl_version >= 0x401) {
-   ret = mpic_setup_error_int(mpic, intvec_top - 12);
+   ret = mpic_setup_error_int(mpic, intvec_top - 13);
if (ret)
return NULL;
}
-- 
1.9.3

RE: [PATCH v2 3/3] powerpc/fsl: Implement cpu_show_spectre_v1/v2 for NXP PowerPC Book3E

2018-06-11 Thread Bharat Bhushan

Hi Diana,

> -Original Message-
> From: Diana Craciun [mailto:diana.crac...@nxp.com]
> Sent: Monday, June 11, 2018 6:23 PM
> To: linuxppc-dev@lists.ozlabs.org
> Cc: m...@ellerman.id.au; o...@buserror.net; Leo Li ;
> Bharat Bhushan ; Diana Madalina Craciun
> 
> Subject: [PATCH v2 3/3] powerpc/fsl: Implement cpu_show_spectre_v1/v2 for
> NXP PowerPC Book3E

Please add some description

> 
> Signed-off-by: Diana Craciun 
> ---
>  arch/powerpc/Kconfig   |  2 +-
>  arch/powerpc/kernel/security.c | 15 +++
>  2 files changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index
> 940c955..a781d60 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -170,7 +170,7 @@ config PPC
>   select GENERIC_CLOCKEVENTS_BROADCASTif SMP
>   select GENERIC_CMOS_UPDATE
>   select GENERIC_CPU_AUTOPROBE
> - select GENERIC_CPU_VULNERABILITIES  if PPC_BOOK3S_64
> + select GENERIC_CPU_VULNERABILITIES  if PPC_BOOK3S_64 ||
> PPC_FSL_BOOK3E
>   select GENERIC_IRQ_SHOW
>   select GENERIC_IRQ_SHOW_LEVEL
>   select GENERIC_SMP_IDLE_THREAD
> diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c
> index 797c975..aceaadc 100644
> --- a/arch/powerpc/kernel/security.c
> +++ b/arch/powerpc/kernel/security.c
> @@ -183,3 +183,18 @@ ssize_t cpu_show_spectre_v2(struct device *dev,
> struct device_attribute *attr, c  }  #endif /* CONFIG_PPC_BOOK3S_64 */
> 
> +#ifdef CONFIG_PPC_FSL_BOOK3E
> +ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute
> +*attr, char *buf) {
> + if (barrier_nospec_enabled)
> + return sprintf(buf, "Mitigation: __user pointer 
> sanitization\n");
> +
> + return sprintf(buf, "Vulnerable\n");
> +}
> +
> +ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute
> +*attr, char *buf) {
> + return sprintf(buf, "Vulnerable\n");
> +}
> +#endif /* CONFIG_PPC_FSL_BOOK3E */
> +
> --
> 2.5.5

RE: [PATCH 2/6 v2] iommu: of: make of_pci_map_rid() available for other devices too

2018-04-18 Thread Bharat Bhushan



> -Original Message-
> From: Robin Murphy [mailto:robin.mur...@arm.com]
> Sent: Tuesday, April 17, 2018 10:23 PM
> To: Nipun Gupta <nipun.gu...@nxp.com>; robh...@kernel.org;
> frowand.l...@gmail.com
> Cc: will.dea...@arm.com; mark.rutl...@arm.com; catalin.mari...@arm.com;
> h...@lst.de; gre...@linuxfoundation.org; j...@8bytes.org;
> m.szyprow...@samsung.com; shawn...@kernel.org; bhelg...@google.com;
> io...@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> devicet...@vger.kernel.org; linux-arm-ker...@lists.infradead.org; linuxppc-
> d...@lists.ozlabs.org; linux-...@vger.kernel.org; Bharat Bhushan
> <bharat.bhus...@nxp.com>; stuyo...@gmail.com; Laurentiu Tudor
> <laurentiu.tu...@nxp.com>; Leo Li <leoyang...@nxp.com>
> Subject: Re: [PATCH 2/6 v2] iommu: of: make of_pci_map_rid() available for
> other devices too
> 
> On 17/04/18 11:21, Nipun Gupta wrote:
> > iommu-map property is also used by devices with fsl-mc. This patch
> > moves the of_pci_map_rid to generic location, so that it can be used
> > by other busses too.

Nipun, please clarify that only the function name is changed and the rest of
the body is the same.

> >
> > Signed-off-by: Nipun Gupta <nipun.gu...@nxp.com>
> > ---
> >   drivers/iommu/of_iommu.c | 106
> > +--
> 
> Doesn't this break "msi-parent" parsing for !CONFIG_OF_IOMMU?

Yes, this will be a problem with MSI 

> I guess you
> don't want fsl-mc to have to depend on PCI, but this looks like a step in the
> wrong direction.
> 
> I'm not entirely sure where of_map_rid() fits best, but from a quick look 
> around
> the least-worst option might be drivers/of/of_address.c, unless Rob and Frank
> have a better idea of where generic DT-based ID translation routines could 
> live?

drivers/of/address.c may be the proper place to move it to, unless someone has a better idea.

Thanks
-Bharat

> 
> >   drivers/of/irq.c |   6 +--
> >   drivers/pci/of.c | 101 
> > 
> >   include/linux/of_iommu.h |  11 +
> >   include/linux/of_pci.h   |  10 -
> >   5 files changed, 117 insertions(+), 117 deletions(-)
> >
> > diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c index
> > 5c36a8b..4e7712f 100644
> > --- a/drivers/iommu/of_iommu.c
> > +++ b/drivers/iommu/of_iommu.c
> > @@ -138,6 +138,106 @@ static int of_iommu_xlate(struct device *dev,
> > return ops->of_xlate(dev, iommu_spec);
> >   }
> >
> > +/**
> > + * of_map_rid - Translate a requester ID through a downstream mapping.
> > + * @np: root complex device node.
> > + * @rid: device requester ID to map.
> > + * @map_name: property name of the map to use.
> > + * @map_mask_name: optional property name of the mask to use.
> > + * @target: optional pointer to a target device node.
> > + * @id_out: optional pointer to receive the translated ID.
> > + *
> > + * Given a device requester ID, look up the appropriate
> > +implementation-defined
> > + * platform ID and/or the target device which receives transactions
> > +on that
> > + * ID, as per the "iommu-map" and "msi-map" bindings. Either of
> > +@target or
> > + * @id_out may be NULL if only the other is required. If @target
> > +points to
> > + * a non-NULL device node pointer, only entries targeting that node
> > +will be
> > + * matched; if it points to a NULL value, it will receive the device
> > +node of
> > + * the first matching target phandle, with a reference held.
> > + *
> > + * Return: 0 on success or a standard error code on failure.
> > + */
> > +int of_map_rid(struct device_node *np, u32 rid,
> > +  const char *map_name, const char *map_mask_name,
> > +  struct device_node **target, u32 *id_out) {
> > +   u32 map_mask, masked_rid;
> > +   int map_len;
> > +   const __be32 *map = NULL;
> > +
> > +   if (!np || !map_name || (!target && !id_out))
> > +   return -EINVAL;
> > +
> > +   map = of_get_property(np, map_name, &map_len);
> > +   if (!map) {
> > +   if (target)
> > +   return -ENODEV;
> > +   /* Otherwise, no map implies no translation */
> > +   *id_out = rid;
> > +   return 0;
> > +   }
> > +
> > +   if (!map_len || map_len % (4 * sizeof(*map))) {
> > +   pr_err("%pOF: Error: Bad %s length: %d\n", np,
> > +   map_name, map_len);
> > +   return -EINVAL;
> >

[PATCH 3/4 RFC] fsl/msi: Add MSI bank allocation for kernel owned devices

2015-03-02 Thread Bharat Bhushan

With this patch a context can allocate a MSI bank and use the
allocated MSI-bank for the devices in that context.

kernel/host context is NULL, So all devices owned by kernel
will share a MSI bank allocated with context = NULL.

This patch is in direction to have separate MSI bank for kernel
context and userspace/VM context. We do not want two software
context (kernel and VMs) to share a MSI bank for safe/reliable
interrupts with full isolation. Follow up patch will add interface
to allocate a MSI bank for userspace/VM context.

NOTE: This RFC patch allows only one MSI bank to be allocated for
kernel context. Which seems to be sufficient to me. But if we see this
is limiting some real usecase scanerio then this limitation can be
removed

One issue which still need to addressed is when to free kernel
context allocated MSI bank? Say all MSI capable devices are assigned
to VM/userspace then there is no need to have any MSI bank reserved
for kernel context.
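
The allocation policy described above can be modelled with an array of
banks instead of the driver's list. This is a sketch under that
assumption; the type and function names are hypothetical and no locking
is shown.

```c
#include <assert.h>
#include <stddef.h>

enum bank_state { BANK_FREE = 0, BANK_RESERVED };

struct msi_bank {
	enum bank_state state;
	const void *context; /* owner; NULL means the kernel context */
};

/* Reserve a free bank for context; the kernel (NULL) may hold only one. */
static struct msi_bank *bank_alloc(struct msi_bank *banks, size_t n,
				   const void *context)
{
	size_t i;

	assert(banks != NULL);

	if (!context) {
		for (i = 0; i < n; i++)
			if (banks[i].state == BANK_RESERVED &&
			    banks[i].context == NULL)
				return NULL;
	}

	for (i = 0; i < n; i++) {
		if (banks[i].state == BANK_FREE) {
			banks[i].state = BANK_RESERVED;
			banks[i].context = context;
			return &banks[i];
		}
	}
	return NULL;
}

/* Release the bank held by context; returns 0 on success, -1 if none. */
static int bank_free(struct msi_bank *banks, size_t n, const void *context)
{
	size_t i;

	assert(banks != NULL);

	for (i = 0; i < n; i++) {
		if (banks[i].state == BANK_RESERVED &&
		    banks[i].context == context) {
			banks[i].state = BANK_FREE;
			banks[i].context = NULL;
			return 0;
		}
	}
	return -1;
}
```

With two banks, the kernel context takes the first one, a second
request from the kernel fails, and a VM context takes the remaining
bank.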

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/sysdev/fsl_msi.c | 88 ++-
 arch/powerpc/sysdev/fsl_msi.h |  4 ++
 2 files changed, 83 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/sysdev/fsl_msi.c b/arch/powerpc/sysdev/fsl_msi.c
index 32ba1e3..027aeeb 100644
--- a/arch/powerpc/sysdev/fsl_msi.c
+++ b/arch/powerpc/sysdev/fsl_msi.c
@@ -142,6 +142,79 @@ static void fsl_teardown_msi_irqs(struct pci_dev *pdev)
return;
 }
 
+/*
+ * Allocate a MSI Bank for the requested context.
+ * NULL context means that this request is to allocate
+ * MSI bank for kernel owned devices. And currently we
+ * assume that one MSI bank is sufficient for kernel.
+ */
+static struct fsl_msi *fsl_msi_allocate_msi_bank(void *context)
+{
+   struct fsl_msi *msi_data;
+
+   /* Kernel context (NULL) can reserve only one msi bank */
+   if (!context) {
+   list_for_each_entry(msi_data, &msi_head, list) {
+   if ((msi_data->reserved == MSI_RESERVED) &&
+   (msi_data->context == NULL))
+   return NULL;
+   }
+   }
+
+   list_for_each_entry(msi_data, &msi_head, list) {
+   if (msi_data->reserved == MSI_FREE) {
+   msi_data->reserved = MSI_RESERVED;
+   msi_data->context = context;
+   return msi_data;
+   }
+   }
+
+   return NULL;
+}
+
+/* FIXME: Assumption that host kernel will allocate only one MSI bank */
+ __attribute__ ((unused)) static int fsl_msi_free_msi_bank(void *context)
+{
+   struct fsl_msi *msi_data;
+
+   list_for_each_entry(msi_data, &msi_head, list) {
+   if ((msi_data->reserved == MSI_RESERVED) &&
+(msi_data->context == context)) {
+   msi_data->reserved = MSI_FREE;
+   msi_data->context = NULL;
+   return 0;
+   }
+   }
+   return -ENODEV;
+}
+
+/*  This API returns the allocated MSI bank of context
+ *  to which pdev device belongs.
+ *  All kernel owned devices have NULL context. All devices
+ *  in same context will share the allocated MSI bank.
+ *
+ *  Note: If no MSI bank allocated to kernel context then
+ *  we allocate a MSI bank here.
+ */
+static struct fsl_msi *fsl_msi_get_reserved_msi_bank(struct pci_dev *pdev)
+{
+   struct fsl_msi *msi_data = NULL;
+   void *context = NULL;
+
+   list_for_each_entry(msi_data, &msi_head, list) {
+   if ((msi_data->reserved == MSI_RESERVED) &&
+   (msi_data->context == context))
+   return msi_data;
+   }
+
+   /* If no MSI bank allocated for kernel owned device, allocate one */
+   msi_data = fsl_msi_allocate_msi_bank(NULL);
+   if (msi_data)
+   return msi_data;
+
+   return NULL;
+}
+
 static void fsl_compose_msi_msg(struct pci_dev *pdev, int hwirq,
struct msi_msg *msg,
struct fsl_msi *fsl_msi_data)
@@ -174,7 +247,7 @@ static int fsl_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
struct pci_controller *hose = pci_bus_to_host(pdev->bus);
struct device_node *np;
phandle phandle = 0;
-   int rc, hwirq = -ENOMEM;
+   int rc = -ENODEV, hwirq = -ENOMEM;
unsigned int virq;
struct msi_desc *entry;
struct msi_msg msg;
@@ -231,15 +304,12 @@ static int fsl_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
if (specific_msi_bank) {
hwirq = msi_bitmap_alloc_hwirqs(&msi_data->bitmap, 1);
} else {
-   /*
-* Loop over all the MSI devices until we find one that has an
-* available interrupt.
-*/
-   list_for_each_entry(msi_data, &msi_head, list

[PATCH 4/4 RFC] fsl/msi: Add interface to reserve/free msi bank

2015-03-02 Thread Bharat Bhushan

This patch allows a context (different from the kernel context)
to reserve an MSI bank for itself. The devices in that
context will then share the MSI bank.

The VFIO meta driver is one typical user of these APIs. It will
reserve an MSI bank for MSI interrupt support of PCI devices directly
assigned to a guest. Patches for the same will follow this patch.
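
How a device finds its bank once contexts exist can be sketched as
below, again with an array instead of the driver's list. The names are
hypothetical. The lazy reservation for kernel-owned devices follows the
description in this series; a device that carries a context only ever
gets a bank that was reserved for that context beforehand.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_bank {
	int reserved;        /* 0 = free, 1 = reserved */
	const void *context; /* owner; NULL means the kernel context */
};

struct sketch_dev {
	const void *context; /* NULL for kernel-owned devices */
};

/* Return the bank shared by all devices in dev's context, if any. */
static struct sketch_bank *bank_for_device(struct sketch_bank *banks,
					   size_t n,
					   const struct sketch_dev *dev)
{
	size_t i;

	assert(banks != NULL && dev != NULL);

	for (i = 0; i < n; i++)
		if (banks[i].reserved && banks[i].context == dev->context)
			return &banks[i];

	/* Assigned devices must have had a bank reserved up front. */
	if (dev->context)
		return NULL;

	/* Kernel-owned device: reserve the first free bank on demand. */
	for (i = 0; i < n; i++) {
		if (!banks[i].reserved) {
			banks[i].reserved = 1;
			banks[i].context = NULL;
			return &banks[i];
		}
	}
	return NULL;
}
```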

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/include/asm/device.h  |   2 +
 arch/powerpc/include/asm/fsl_msi.h |  26 ++
 arch/powerpc/sysdev/fsl_msi.c  | 169 +++--
 arch/powerpc/sysdev/fsl_msi.h  |   1 +
 4 files changed, 173 insertions(+), 25 deletions(-)
 create mode 100644 arch/powerpc/include/asm/fsl_msi.h

diff --git a/arch/powerpc/include/asm/device.h 
b/arch/powerpc/include/asm/device.h
index 38faede..1c2bfd7 100644
--- a/arch/powerpc/include/asm/device.h
+++ b/arch/powerpc/include/asm/device.h
@@ -40,6 +40,8 @@ struct dev_archdata {
 #ifdef CONFIG_FAIL_IOMMU
int fail_iommu;
 #endif
+
+   void *context;
 };
 
 struct pdev_archdata {
diff --git a/arch/powerpc/include/asm/fsl_msi.h 
b/arch/powerpc/include/asm/fsl_msi.h
new file mode 100644
index 000..e9041c2
--- /dev/null
+++ b/arch/powerpc/include/asm/fsl_msi.h
@@ -0,0 +1,26 @@
+/*
+ * Copyright (C) 2014 Freescale Semiconductor, Inc. All rights reserved.
+ *
+ * Author: Bharat Bhushan bharat.bhus...@freescale.com
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2 of the
+ * License.
+ *
+ */
+
+#ifndef _POWERPC_FSL_MSI_H
+#define _POWERPC_FSL_MSI_H
+
+extern int fsl_msi_set_msi_bank_region(struct iommu_domain *domain,
+  void *context, int win,
+  dma_addr_t iova, int prot);
+extern int fsl_msi_clear_msi_bank_region(struct iommu_domain *domain,
+struct iommu_group *iommu_group,
+int win, dma_addr_t iova);
+extern struct fsl_msi *fsl_msi_reserve_msi_bank(void *context);
+extern int fsl_msi_unreserve_msi_bank(void *context);
+extern int fsl_msi_set_msi_bank_in_dev(struct device *dev, void *data);
+
+#endif /* _POWERPC_FSL_MSI_H */
diff --git a/arch/powerpc/sysdev/fsl_msi.c b/arch/powerpc/sysdev/fsl_msi.c
index 027aeeb..75cd196 100644
--- a/arch/powerpc/sysdev/fsl_msi.c
+++ b/arch/powerpc/sysdev/fsl_msi.c
@@ -25,6 +25,7 @@
 #include <asm/ppc-pci.h>
 #include <asm/mpic.h>
 #include <asm/fsl_hcalls.h>
+#include <linux/iommu.h>
 
 #include "fsl_msi.h"
 #include "fsl_pci.h"
@@ -172,22 +173,6 @@ static struct fsl_msi *fsl_msi_allocate_msi_bank(void 
*context)
return NULL;
 }
 
-/* FIXME: Assumption that host kernel will allocate only one MSI bank */
- __attribute__ ((unused)) static int fsl_msi_free_msi_bank(void *context)
-{
-   struct fsl_msi *msi_data;
-
-   list_for_each_entry(msi_data, &msi_head, list) {
-   if ((msi_data->reserved == MSI_RESERVED) &&
-(msi_data->context == context)) {
-   msi_data->reserved = MSI_FREE;
-   msi_data->context = NULL;
-   return 0;
-   }
-   }
-   return -ENODEV;
-}
-
 /*  This API returns the allocated MSI bank of context
  *  to which pdev device belongs.
  *  All kernel owned devices have NULL context. All devices
@@ -200,6 +185,12 @@ static struct fsl_msi 
*fsl_msi_get_reserved_msi_bank(struct pci_dev *pdev)
 {
struct fsl_msi *msi_data = NULL;
void *context = NULL;
+   struct device *dev = &pdev->dev;
+
+   /* Device assigned to userspace if there is valid context */
+   if (dev->archdata.context) {
+   context = dev->archdata.context;
+   }
 
list_for_each_entry(msi_data, &msi_head, list) {
if ((msi_data->reserved == MSI_RESERVED) &&
@@ -208,13 +199,133 @@ static struct fsl_msi 
*fsl_msi_get_reserved_msi_bank(struct pci_dev *pdev)
}
 
/* If no MSI bank allocated for kernel owned device, allocate one */
-   msi_data = fsl_msi_allocate_msi_bank(NULL);
-   if (msi_data)
-   return msi_data;
+   if (!context) {
+   msi_data = fsl_msi_allocate_msi_bank(NULL);
+   if (msi_data)
+   return msi_data;
+   }
 
return NULL;
 }
 
+/* API to set context to which the device belongs */
+int fsl_msi_set_msi_bank_in_dev(struct device *dev, void *data)
+{
+   dev->archdata.context = data;
+   return 0;
+}
+
+/*  This API Allows a MSI bank to be reserved for a context.
+ *  All devices in same context will share the allocated
+ *  MSI bank.
+ *  Typically this function will be called from meta
+ *  driver like VFIO with a valid context.
+ */
+struct fsl_msi *fsl_msi_reserve_msi_bank(void *context)
+{
+   struct fsl_msi *msi_data

[PATCH 2/4 RFC] fsl/msi: Move fsl, msi mode specific MSI device search out of main loop

2015-03-02 Thread Bharat Bhushan

Move the specific MSI device search out of the main loop. The
specific MSI device search is now placed with the other fsl,msi specific
code in the same function.
This is in preparation for MSI bank partitioning.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/sysdev/fsl_msi.c | 39 +--
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/sysdev/fsl_msi.c b/arch/powerpc/sysdev/fsl_msi.c
index ec3161b..32ba1e3 100644
--- a/arch/powerpc/sysdev/fsl_msi.c
+++ b/arch/powerpc/sysdev/fsl_msi.c
@@ -178,7 +178,8 @@ static int fsl_setup_msi_irqs(struct pci_dev *pdev, int 
nvec, int type)
unsigned int virq;
struct msi_desc *entry;
struct msi_msg msg;
-   struct fsl_msi *msi_data;
+   struct fsl_msi *msi_data = NULL;
+   bool specific_msi_bank = false;
 
if (type == PCI_CAP_ID_MSIX)
pr_debug("fslmsi: MSI-X untested, trying anyway.\n");
@@ -199,12 +200,9 @@ static int fsl_setup_msi_irqs(struct pci_dev *pdev, int 
nvec, int type)
hose->dn->full_name, np->phandle);
return -EINVAL;
}
-   }
-
-   list_for_each_entry(entry, &pdev->msi_list, list) {
/*
-* Loop over all the MSI devices until we find one that has an
-* available interrupt.
+* Loop over all the MSI devices till we find
+* specific MSI device.
 */
list_for_each_entry(msi_data, &msi_head, list) {
/*
@@ -215,12 +213,33 @@ static int fsl_setup_msi_irqs(struct pci_dev *pdev, int 
nvec, int type)
 * has the additional benefit of skipping over MSI
 * nodes that are not mapped in the PAMU.
 */
-   if (phandle && (phandle != msi_data->phandle))
-   continue;
+   if (phandle == msi_data->phandle) {
+   specific_msi_bank = true;
+   break;
+   }
+   }
 
+   if (!specific_msi_bank) {
+   dev_err(&pdev->dev,
+   "No specific MSI device found for node %s\n",
+   hose->dn->full_name);
+   return -EINVAL;
+   }
+   }
+
+   list_for_each_entry(entry, &pdev->msi_list, list) {
+   if (specific_msi_bank) {
hwirq = msi_bitmap_alloc_hwirqs(&msi_data->bitmap, 1);
-   if (hwirq >= 0)
-   break;
+   } else {
+   /*
+* Loop over all the MSI devices until we find one that has an
+* available interrupt.
+*/
+   list_for_each_entry(msi_data, &msi_head, list) {
+   hwirq = msi_bitmap_alloc_hwirqs(&msi_data->bitmap, 1);
+   if (hwirq >= 0)
+   break;
+   }
}
 
if (hwirq < 0) {
-- 
1.9.3

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH 1/4 RFC] fsl/msi: have msiir register address absolute rather than offset

2015-03-02 Thread Bharat Bhushan

Having an absolute address simplifies the code and also removes the
confusion around feature->msiir_offset and msi_data->msiir_offset.
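
As a rough illustration of the simplification, the sketch below
resolves the MSIIR address once at probe time and later only splits it
into the two 32-bit message halves. The function names, the struct and
the example addresses used to exercise it are hypothetical; only the
"second reg entry wins, else base plus feature offset" rule comes from
the patch.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
};

/* Pick the absolute MSIIR address once, at probe time. */
static uint64_t resolve_msiir(uint64_t res_start, uint32_t feature_offset,
			      int has_msiir_reg, uint64_t msiir_start)
{
	if (has_msiir_reg)
		return msiir_start;          /* address given by the dts */
	return res_start + feature_offset; /* fall back to the fixed offset */
}

/* Compose the MSI message address from the stored absolute address. */
static struct sketch_msi_msg compose_addr(uint64_t msiir)
{
	struct sketch_msi_msg msg;

	assert(msiir != 0);
	msg.address_lo = (uint32_t)(msiir & 0xffffffffu);
	msg.address_hi = (uint32_t)(msiir >> 32);
	return msg;
}
```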

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/sysdev/fsl_msi.c | 9 +++--
 arch/powerpc/sysdev/fsl_msi.h | 2 +-
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/sysdev/fsl_msi.c b/arch/powerpc/sysdev/fsl_msi.c
index 4bbb4b8..ec3161b 100644
--- a/arch/powerpc/sysdev/fsl_msi.c
+++ b/arch/powerpc/sysdev/fsl_msi.c
@@ -157,7 +157,7 @@ static void fsl_compose_msi_msg(struct pci_dev *pdev, int hwirq,
if (reg && (len == sizeof(u64)))
address = be64_to_cpup(reg);
else
-   address = fsl_pci_immrbar_base(hose) + msi_data->msiir_offset;
+   address = msi_data->msiir;
 
msg->address_lo = lower_32_bits(address);
msg->address_hi = upper_32_bits(address);
@@ -430,18 +430,15 @@ static int fsl_of_msi_probe(struct platform_device *dev)
dev->dev.of_node->full_name);
goto error_out;
}
-   msi->msiir_offset =
-   features->msiir_offset + (res.start & 0xf);
 
/*
 * First read the MSIIR/MSIIR1 offset from dts
 * On failure use the hardcode MSIIR offset
 */
if (of_address_to_resource(dev->dev.of_node, 1, &msiir))
-   msi->msiir_offset = features->msiir_offset +
-   (res.start & MSIIR_OFFSET_MASK);
+   msi->msiir = res.start + features->msiir_offset;
else
-   msi->msiir_offset = msiir.start & MSIIR_OFFSET_MASK;
+   msi->msiir = msiir.start;
}
 
msi->feature = features->fsl_pic_ip;
diff --git a/arch/powerpc/sysdev/fsl_msi.h b/arch/powerpc/sysdev/fsl_msi.h
index 420cfcb..9b0ab84 100644
--- a/arch/powerpc/sysdev/fsl_msi.h
+++ b/arch/powerpc/sysdev/fsl_msi.h
@@ -34,7 +34,7 @@ struct fsl_msi {
 
unsigned long cascade_irq;
 
-   u32 msiir_offset; /* Offset of MSIIR, relative to start of CCSR */
+   phys_addr_t msiir; /* MSIIR Address in CCSR */
u32 ibs_shift; /* Shift of interrupt bit select */
u32 srs_shift; /* Shift of the shared interrupt register select */
void __iomem *msi_regs;
-- 
1.9.3

[PATCH 0/4 RFC] fsl/msi: Add support for MSI bank partitioning

2015-03-02 Thread Bharat Bhushan

This patchset adds MSI bank partitioning support. MSI bank
partitioning is required for supporting direct device assignment
of MSI-capable PCI devices. One MSI bank will be allocated for the
kernel context. VFIO can allocate one MSI bank per context,
and all devices in the context will share that MSI bank.

We have a limited number of MSI banks (2-4), so to support a large
number of contexts we need to allow sharing of MSI banks. This
patchset does not support sharing of MSI banks, but that will be done
soon once this patchset takes shape.

These changes are tested with both kernel-owned PCI devices and
devices directly assigned to a guest using VFIO.

Bharat Bhushan (4):
  fsl/msi: have msiir register address absolute rather than offset
  fsl/msi: Move fsl,msi mode specific MSI device search out of main loop
  fsl/msi: Add MSI bank allocation for kernel owned devices
  fsl/msi: Add interface to reserve/free msi bank

 arch/powerpc/include/asm/device.h  |   2 +
 arch/powerpc/include/asm/fsl_msi.h |  26 
 arch/powerpc/sysdev/fsl_msi.c  | 249 +
 arch/powerpc/sysdev/fsl_msi.h  |   7 +-
 4 files changed, 261 insertions(+), 23 deletions(-)
 create mode 100644 arch/powerpc/include/asm/fsl_msi.h

-- 
1.9.3

[PATCH] booke/powerpc: define wimge shift mask to fix compilation error

2014-05-13 Thread Bharat Bhushan

This fixes the compilation error below on SoCs where CONFIG_PHYS_64BIT
is not defined:

 arch/powerpc/kvm/e500_mmu_host.c: In function 'kvmppc_e500_shadow_map':
| arch/powerpc/kvm/e500_mmu_host.c:631:20: error: 'PTE_WIMGE_SHIFT' undeclared 
(first use in this function)
|wimg = (*ptep  PTE_WIMGE_SHIFT)  MAS2_WIMGE_MASK;
| ^
| arch/powerpc/kvm/e500_mmu_host.c:631:20: note: each undeclared identifier is 
reported only once for each function it appears in
| make[1]: *** [arch/powerpc/kvm/e500_mmu_host.o] Error 1
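
For context, the failing line extracts the WIMGE attribute bits from a PTE by shifting and masking. A minimal standalone sketch follows. The shift of 6 is the value this patch defines; the mask value 0x1f (five bits: W, I, M, G, E) and the helper names `pte_to_wimge` and `wimge_selfcheck` are assumptions made for illustration, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_WIMGE_SHIFT 6      /* value defined by this patch */
#define MAS2_WIMGE_MASK 0x1fu  /* assumption: five attribute bits W, I, M, G, E */

/* Hypothetical helper: pull the WIMGE field out of a raw PTE value. */
static uint32_t pte_to_wimge(uint64_t pte)
{
    return (uint32_t)(pte >> PTE_WIMGE_SHIFT) & MAS2_WIMGE_MASK;
}

/* Small self-check showing what the shift-and-mask yields. */
static void wimge_selfcheck(void)
{
    assert(pte_to_wimge(0x7c0) == 0x1f); /* all five bits set */
    assert(pte_to_wimge(0x03f) == 0x00); /* bits below the shift are ignored */
    assert(pte_to_wimge(0x080) == 0x02); /* bit 7 of the PTE is bit 1 of the field */
}
```

Without a definition of PTE_WIMGE_SHIFT visible in the 32-bit physical address configuration, the expression cannot compile, which is the error shown above.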

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/include/asm/pte-fsl-booke.h |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/pte-fsl-booke.h 
b/arch/powerpc/include/asm/pte-fsl-booke.h
index 2c12be5..e84dd7e 100644
--- a/arch/powerpc/include/asm/pte-fsl-booke.h
+++ b/arch/powerpc/include/asm/pte-fsl-booke.h
@@ -37,5 +37,7 @@
 #define _PMD_PRESENT_MASK (PAGE_MASK)
 #define _PMD_BAD   (~PAGE_MASK)
 
+#define PTE_WIMGE_SHIFT (6)
+
 #endif /* __KERNEL__ */
 #endif /*  _ASM_POWERPC_PTE_FSL_BOOKE_H */
-- 
1.7.0.4


[PATCH] rtc: ds3232 make it possible to share an irq

2014-01-24 Thread Bharat Bhushan

It's possible for the RTC IRQ to be shared with another device (e.g.
the t4240qds board shares the ds3232 IRQ with the PHY one).  Handle this
in the driver by requesting the IRQ with IRQF_SHARED.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 drivers/rtc/rtc-ds3232.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/rtc/rtc-ds3232.c b/drivers/rtc/rtc-ds3232.c
index b83bb5a..598837b 100644
--- a/drivers/rtc/rtc-ds3232.c
+++ b/drivers/rtc/rtc-ds3232.c
@@ -419,8 +419,8 @@ static int ds3232_probe(struct i2c_client *client,
}
 
    if (client->irq >= 0) {
-       ret = devm_request_irq(&client->dev, client->irq, ds3232_irq, 0,
-                "ds3232", client);
+       ret = devm_request_irq(&client->dev, client->irq, ds3232_irq,
+                              IRQF_SHARED, "ds3232", client);
        if (ret) {
            dev_err(&client->dev, "unable to request IRQ\n");
            return ret;
-- 
1.7.0.4



RE: Error in frreing hugepages with preemption enabled

2013-12-05 Thread Bharat Bhushan

 -Original Message-
 From: Andrea Arcangeli [mailto:aarca...@redhat.com]
 Sent: Wednesday, December 04, 2013 3:51 AM
 To: Alexander Graf
 Cc: Bhushan Bharat-R65777; linuxppc-dev@lists.ozlabs.org; kvm-
 p...@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; Ben 
 Herrenschmidt
 Subject: Re: Error in frreing hugepages with preemption enabled

 Hi everyone,

 On Fri, Nov 29, 2013 at 12:13:03PM +0100, Alexander Graf wrote:

  On 29.11.2013, at 05:38, Bharat Bhushan bharat.bhus...@freescale.com 
  wrote:

   Hi Alex,

   I am running a KVM guest with the host kernel having CONFIG_PREEMPT enabled.
   With allocated pages things seem to work fine, but when I use hugepages for
   the guest I see the prints below when I quit qemu.

   (qemu) QEMU waiting for connection on: telnet:0.0.0.0:,server
   qemu-system-ppc64: pci_add_option_rom: failed to find romfile efi-virtio.rom
   q
   debug_smp_processor_id: 15 callbacks suppressed
   BUG: using smp_processor_id() in preemptible [] code: qemu-system-ppc/2504
   caller is .free_hugepd_range+0xb0/0x21c
   CPU: 1 PID: 2504 Comm: qemu-system-ppc Not tainted 3.12.0-rc3-07733-gabf4907 #175
   Call Trace:
   [c000fb433400] [c0007d38] .show_stack+0x7c/0x1cc (unreliable)
   [c000fb4334d0] [c05e8ce0] .dump_stack+0x9c/0xf4
   [c000fb433560] [c02de5ec] .debug_smp_processor_id+0x108/0x11c
   [c000fb4335f0] [c0025e10] .free_hugepd_range+0xb0/0x21c
   [c000fb433680] [c00265bc] .hugetlb_free_pgd_range+0x2c8/0x3b0
   [c000fb4337a0] [c00e428c] .free_pgtables+0x14c/0x158
   [c000fb433840] [c00ef320] .exit_mmap+0xec/0x194
   [c000fb433960] [c004d780] .mmput+0x64/0x124
   [c000fb4339e0] [c0051f40] .do_exit+0x29c/0x9c8
   [c000fb433ae0] [c00527c8] .do_group_exit+0x50/0xc4
   [c000fb433b70] [c00606a0] .get_signal_to_deliver+0x21c/0x5d8
   [c000fb433c70] [c0009b08] .do_signal+0x54/0x278
   [c000fb433db0] [c0009e50] .do_notify_resume+0x64/0x78
   [c000fb433e30] [cb44] .ret_from_except_lite+0x70/0x74

   This mean that free_hugepd_range() must be called with preemption enabled.

  with preemption disabled.

   I tried below change and this seems to work fine (I am not having
   expertise in this area so not sure this is correct way)

  Not sure - the scope looks odd to me. Let's ask Andrea - I'm sure he knows
 what to do :).

 :) So I had a look at the top of this function (0xb0) in the upstream kernel
 and no smp_processor_id() call is apparent, is this stock git or a ppc tree?
 The first few calls seem not to call it but I may have overlooked something.
 It's just quicker if somebody with vmlinux finds the location of it.

 static void free_hugepd_range(struct mmu_gather *tlb, hugepd_t *hpdp, int pdshift,
                               unsigned long start, unsigned long end,
                               unsigned long floor, unsigned long ceiling)
 {
     pte_t *hugepte = hugepd_page(*hpdp);
     int i;

     unsigned long pdmask = ~((1UL << pdshift) - 1);
     unsigned int num_hugepd = 1;

 #ifdef CONFIG_PPC_FSL_BOOK3E
     /* Note: On fsl the hpdp may be the first of several */
     num_hugepd = (1 << (hugepd_shift(*hpdp) - pdshift));
 #else
     unsigned int shift = hugepd_shift(*hpdp);
 #endif

     start &= pdmask;
     if (start < floor)
         return;
     if (ceiling) {
         ceiling &= pdmask;
         if (! ceiling)
             return;
     }
     if (end - 1 > ceiling - 1)
         return;

     for (i = 0; i < num_hugepd; i++, hpdp++)
         hpdp->pd = 0;

     tlb->need_flush = 1;

 #ifdef CONFIG_PPC_FSL_BOOK3E
     hugepd_free(tlb, hugepte);
 #else
     pgtable_free_tlb(tlb, hugepte, pdshift - shift);
 #endif
 }

 Generally smp_processor_id should never be used, exactly to avoid problems
 like the one above with preemption enabled in .config.

 Instead it should be replaced with a get_cpu()/put_cpu() pair that is exactly
 meant to fix bugs like this and define proper critical sections around the
 per-cpu variables.

 #define get_cpu() ({ preempt_disable(); smp_processor_id(); })
 #define put_cpu() preempt_enable()

 After you find where that smp_processor_id() is located, you should simply
 replace it with a get_cpu() and then you should insert a put_cpu immediately
 after the cpu info is not used anymore. That will define a proper and strict
 critical section around the use of the per-cpu variables.
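
The critical section described above can be sketched in userspace C. This is only an illustration under stated assumptions: `preempt_count`, `current_cpu`, the `hits` array and `count_hit()` are hypothetical mocks, not kernel code. Only the shape of get_cpu()/put_cpu() follows the macros quoted above.

```c
#include <assert.h>

#define NR_CPUS 4

/* Hypothetical stand-ins for kernel state; not real kernel code. */
static int preempt_count;            /* > 0 means preemption is disabled */
static int current_cpu = 1;          /* CPU the mock task runs on */
static unsigned long hits[NR_CPUS];  /* mock per-CPU counters */

static void preempt_disable(void) { preempt_count++; }

static void preempt_enable(void)
{
    assert(preempt_count > 0);       /* enable must pair with a disable */
    preempt_count--;
}

/* Mock of smp_processor_id(): only meaningful while preemption is off. */
static int smp_processor_id(void)
{
    assert(preempt_count > 0);       /* mirrors the debug check in the BUG above */
    return current_cpu;
}

static int get_cpu(void)
{
    preempt_disable();
    return smp_processor_id();
}

static void put_cpu(void) { preempt_enable(); }

/* Touch per-CPU data strictly inside a get_cpu()/put_cpu() pair. */
static unsigned long count_hit(void)
{
    int cpu = get_cpu();
    unsigned long n = ++hits[cpu];
    put_cpu();                       /* CPU id must not be used past this point */
    return n;
}
```

Calling smp_processor_id() directly in this mock, outside the pair, trips the assertion, which is the userspace analogue of the warning in the trace.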

 With a ppc vmlinux it should be immediate to find the location of
 smp_processor_id but I don't have the ppc vmlinux here.

Thanks, Andrea, for the detailed description. It is really helpful.

I will look into this.

Thanks
-Bharat

 Thanks!
 Andrea

  Alex

   diff --git a/arch/powerpc/mm/hugetlbpage.c
   b/arch/powerpc/mm/hugetlbpage.c index d67db4b..6bf8459

RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)

2013-12-05 Thread Bharat Bhushan

 -Original Message-
 From: Wood Scott-B07421
 Sent: Friday, December 06, 2013 5:52 AM
 To: Bhushan Bharat-R65777
 Cc: Alex Williamson; linux-...@vger.kernel.org; ag...@suse.de; Yoder Stuart-
 B08248; io...@lists.linux-foundation.org; bhelg...@google.com; linuxppc-
 d...@lists.ozlabs.org; linux-ker...@vger.kernel.org
 Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)

 On Thu, 2013-11-28 at 03:19 -0600, Bharat Bhushan wrote:

   -Original Message-
   From: Bhushan Bharat-R65777
   Sent: Wednesday, November 27, 2013 9:39 PM
   To: 'Alex Williamson'
   Cc: Wood Scott-B07421; linux-...@vger.kernel.org; ag...@suse.de;
   Yoder Stuart- B08248; io...@lists.linux-foundation.org;
   bhelg...@google.com; linuxppc- d...@lists.ozlabs.org;
   linux-ker...@vger.kernel.org
   Subject: RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale
   IOMMU (PAMU)

-Original Message-
From: Alex Williamson [mailto:alex.william...@redhat.com]
Sent: Monday, November 25, 2013 10:08 PM
To: Bhushan Bharat-R65777
Cc: Wood Scott-B07421; linux-...@vger.kernel.org; ag...@suse.de;
Yoder
Stuart- B08248; io...@lists.linux-foundation.org;
bhelg...@google.com;
linuxppc- d...@lists.ozlabs.org; linux-ker...@vger.kernel.org
Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale
IOMMU
(PAMU)

On Mon, 2013-11-25 at 05:33 +, Bharat Bhushan wrote:

  -Original Message-
  From: Alex Williamson [mailto:alex.william...@redhat.com]
  Sent: Friday, November 22, 2013 2:31 AM
  To: Wood Scott-B07421
  Cc: Bhushan Bharat-R65777; linux-...@vger.kernel.org;
  ag...@suse.de; Yoder Stuart-B08248;
  io...@lists.linux-foundation.org; bhelg...@google.com;
  linuxppc- d...@lists.ozlabs.org; linux-ker...@vger.kernel.org
  Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for
  Freescale IOMMU (PAMU)

  On Thu, 2013-11-21 at 14:47 -0600, Scott Wood wrote:
   On Thu, 2013-11-21 at 13:43 -0700, Alex Williamson wrote:
On Thu, 2013-11-21 at 11:20 +, Bharat Bhushan wrote:

  -Original Message-
  From: Alex Williamson
  [mailto:alex.william...@redhat.com]
  Sent: Thursday, November 21, 2013 12:17 AM
  To: Bhushan Bharat-R65777
  Cc: j...@8bytes.org; bhelg...@google.com;
  ag...@suse.de; Wood Scott-B07421; Yoder Stuart-B08248;
  io...@lists.linux-foundation.org; linux-
  p...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org;
  linux- ker...@vger.kernel.org; Bhushan Bharat-R65777
  Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for
  Freescale IOMMU (PAMU)

  Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie.
  each vfio user has $COUNT regions at their disposal
 exclusively)?

 Number of msi-bank count is system wide and not per
 aperture, But will be
  setting windows for banks in the device aperture.
 So say if we are direct assigning 2 pci device (both
 have different iommu
  group, so 2 aperture in iommu) to VM.
 Now qemu can make only one call to know how many
 msi-banks are there but
  it must set sub-windows for all banks for both pci device in
  its respective aperture.

I'm still confused.  What I want to make sure of is that
the banks are independent per aperture.  For instance, if
we have two separate userspace processes operating
independently and they both chose to use msi bank zero for
their device, that's bank zero within each aperture and
doesn't interfere.  Or another way to ask is can a
malicious user interfere with other users by
using the wrong bank.
Thanks,

   They can interfere.

 Want to be sure of how they can interfere?

What happens if more than one user selects the same MSI bank?
Minimally, wouldn't that result in the IOMMU blocking transactions
from the previous user once the new user activates their mapping?

   Yes and no; With current implementation yes but with a minor change
   no. Later in this response I will explain how.

   With this hardware, the only way to prevent that
   is to make sure that a bank is not shared by multiple
   protection
   contexts.
   For some of our users, though, I believe preventing this is
   less important than the performance benefit.

 So should we let this patch series in without protection?

No.

  I think we need some sort of ownership model around the msi banks
 then.
  Otherwise there's nothing preventing another userspace from
  attempting an MSI based attack on other users, or perhaps even
  on the host.  VFIO can't allow that.  Thanks,

 We have very few (3 MSI bank on most of chips), so we can

RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)

2013-12-05 Thread Bharat Bhushan

 -Original Message-
 From: Wood Scott-B07421
 Sent: Friday, December 06, 2013 5:31 AM
 To: Bhushan Bharat-R65777
 Cc: Alex Williamson; linux-...@vger.kernel.org; ag...@suse.de; Yoder Stuart-
 B08248; io...@lists.linux-foundation.org; bhelg...@google.com; linuxppc-
 d...@lists.ozlabs.org; linux-ker...@vger.kernel.org
 Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)

 On Sun, 2013-11-24 at 23:33 -0600, Bharat Bhushan wrote:

   -Original Message-
   From: Alex Williamson [mailto:alex.william...@redhat.com]
   Sent: Friday, November 22, 2013 2:31 AM
   To: Wood Scott-B07421
   Cc: Bhushan Bharat-R65777; linux-...@vger.kernel.org; ag...@suse.de;
   Yoder Stuart-B08248; io...@lists.linux-foundation.org;
   bhelg...@google.com; linuxppc- d...@lists.ozlabs.org;
   linux-ker...@vger.kernel.org
   Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale
   IOMMU (PAMU)

   On Thu, 2013-11-21 at 14:47 -0600, Scott Wood wrote:
On Thu, 2013-11-21 at 13:43 -0700, Alex Williamson wrote:
 On Thu, 2013-11-21 at 11:20 +, Bharat Bhushan wrote:

   -Original Message-
   From: Alex Williamson [mailto:alex.william...@redhat.com]
   Sent: Thursday, November 21, 2013 12:17 AM
   To: Bhushan Bharat-R65777
   Cc: j...@8bytes.org; bhelg...@google.com; ag...@suse.de;
   Wood Scott-B07421; Yoder Stuart-B08248;
   io...@lists.linux-foundation.org; linux-
   p...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
   ker...@vger.kernel.org; Bhushan Bharat-R65777
   Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for
   Freescale IOMMU (PAMU)

   Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie. each
   vfio user has $COUNT regions at their disposal exclusively)?

  Number of msi-bank count is system wide and not per aperture,
  But will be
   setting windows for banks in the device aperture.
  So say if we are direct assigning 2 pci device (both have
  different iommu
   group, so 2 aperture in iommu) to VM.
  Now qemu can make only one call to know how many msi-banks are
  there but
   it must set sub-windows for all banks for both pci device in its
   respective aperture.

 I'm still confused.  What I want to make sure of is that the
 banks are independent per aperture.  For instance, if we have
 two separate userspace processes operating independently and
 they both chose to use msi bank zero for their device, that's
 bank zero within each aperture and doesn't interfere.  Or
 another way to ask is can a malicious user interfere with other users 
 by
 using the wrong bank.
 Thanks,

They can interfere.

  Want to be sure of how they can interfere?

 If more than one VFIO user shares the same MSI group, one of the users can
 send MSIs to another user, by using the wrong interrupt within the bank.
 Unexpected MSIs could cause misbehavior or denial of service.

With this hardware, the only way to prevent that
is to make sure that a bank is not shared by multiple protection 
contexts.
For some of our users, though, I believe preventing this is less
important than the performance benefit.

  So should we let this patch series in without protection?

 No, there should be some sort of opt-in mechanism similar to IOMMU-less VFIO --
 but not the same exact one, since one is a much more serious loss of isolation
 than the other.

Can you please elaborate on the opt-in mechanism?

   I think we need some sort of ownership model around the msi banks then.
   Otherwise there's nothing preventing another userspace from
   attempting an MSI based attack on other users, or perhaps even on
   the host.  VFIO can't allow that.  Thanks,

  We have very few (3 MSI bank on most of chips), so we can not assign
  one to each userspace.

 That depends on how many users there are.

What I think we can do is:
 - Reserve one MSI region for the host. The host will not share its MSI region with guests.
 - For up to 2 guests (max MSI banks minus the one for the host), give them separate MSI sub-regions.
 - Additional guests will share an MSI region with another guest.
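
The policy in the list above can be sketched as follows. This is only an illustration, assuming 3 banks (the count mentioned in this thread) and that bank 0 is the one kept for the host. `msi_bank_assign_guest`, `msi_bank_is_shared` and the `users` array are hypothetical names, not part of the patch series.

```c
#include <assert.h>

#define NUM_MSI_BANKS 3  /* "3 MSI bank on most of chips", per the thread */
#define HOST_BANK     0  /* assumption: bank 0 is reserved for the host */

/* Hypothetical bookkeeping: number of guests mapped to each bank. */
static unsigned int users[NUM_MSI_BANKS];

/*
 * Pick a bank for a new guest. The host bank is never handed out.
 * Unused guest banks are taken first; once all are in use, the
 * least-loaded guest bank is shared.
 */
static int msi_bank_assign_guest(void)
{
    int best = HOST_BANK + 1;

    for (int b = HOST_BANK + 1; b < NUM_MSI_BANKS; b++) {
        if (users[b] < users[best])
            best = b;
    }
    users[best]++;
    return best;
}

/* Nonzero when more than one guest uses the bank, i.e. isolation is lost. */
static int msi_bank_is_shared(int bank)
{
    assert(bank >= 0 && bank < NUM_MSI_BANKS);
    return users[bank] > 1;
}
```

With 3 banks, the first two guests get banks 1 and 2 exclusively and a third guest shares bank 1. Whether a guest may land on a shared bank at all is the opt-in question raised earlier in the thread; the sketch does not model that decision.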

Any better suggestions are most welcome.

Thanks
-Bharat

 -Scott


RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)

2013-11-28 Thread Bharat Bhushan

 -Original Message-
 From: Bhushan Bharat-R65777
 Sent: Wednesday, November 27, 2013 9:39 PM
 To: 'Alex Williamson'
 Cc: Wood Scott-B07421; linux-...@vger.kernel.org; ag...@suse.de; Yoder Stuart-
 B08248; io...@lists.linux-foundation.org; bhelg...@google.com; linuxppc-
 d...@lists.ozlabs.org; linux-ker...@vger.kernel.org
 Subject: RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)

  -Original Message-
  From: Alex Williamson [mailto:alex.william...@redhat.com]
  Sent: Monday, November 25, 2013 10:08 PM
  To: Bhushan Bharat-R65777
  Cc: Wood Scott-B07421; linux-...@vger.kernel.org; ag...@suse.de; Yoder
  Stuart- B08248; io...@lists.linux-foundation.org; bhelg...@google.com;
  linuxppc- d...@lists.ozlabs.org; linux-ker...@vger.kernel.org
  Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU
  (PAMU)

  On Mon, 2013-11-25 at 05:33 +, Bharat Bhushan wrote:

-Original Message-
From: Alex Williamson [mailto:alex.william...@redhat.com]
Sent: Friday, November 22, 2013 2:31 AM
To: Wood Scott-B07421
Cc: Bhushan Bharat-R65777; linux-...@vger.kernel.org;
ag...@suse.de; Yoder Stuart-B08248;
io...@lists.linux-foundation.org; bhelg...@google.com; linuxppc-
d...@lists.ozlabs.org; linux-ker...@vger.kernel.org
Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale
IOMMU (PAMU)

On Thu, 2013-11-21 at 14:47 -0600, Scott Wood wrote:
 On Thu, 2013-11-21 at 13:43 -0700, Alex Williamson wrote:
  On Thu, 2013-11-21 at 11:20 +, Bharat Bhushan wrote:

-Original Message-
From: Alex Williamson [mailto:alex.william...@redhat.com]
Sent: Thursday, November 21, 2013 12:17 AM
To: Bhushan Bharat-R65777
Cc: j...@8bytes.org; bhelg...@google.com; ag...@suse.de;
Wood Scott-B07421; Yoder Stuart-B08248;
io...@lists.linux-foundation.org; linux-
p...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
ker...@vger.kernel.org; Bhushan Bharat-R65777
Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for
Freescale IOMMU (PAMU)

Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie.
each vfio user has $COUNT regions at their disposal 
exclusively)?

   Number of msi-bank count is system wide and not per
   aperture, But will be
setting windows for banks in the device aperture.
   So say if we are direct assigning 2 pci device (both have
   different iommu
group, so 2 aperture in iommu) to VM.
   Now qemu can make only one call to know how many msi-banks
   are there but
it must set sub-windows for all banks for both pci device in its
respective aperture.

  I'm still confused.  What I want to make sure of is that the
  banks are independent per aperture.  For instance, if we have
  two separate userspace processes operating independently and
  they both chose to use msi bank zero for their device, that's
  bank zero within each aperture and doesn't interfere.  Or
  another way to ask is can a malicious user interfere with
  other users by
  using the wrong bank.
  Thanks,

 They can interfere.

   Want to be sure of how they can interfere?

  What happens if more than one user selects the same MSI bank?
  Minimally, wouldn't that result in the IOMMU blocking transactions
  from the previous user once the new user activates their mapping?

 Yes and no; With current implementation yes but with a minor change no. Later 
 in
 this response I will explain how.

 With this hardware, the only way to prevent that
 is to make sure that a bank is not shared by multiple protection
 contexts.
 For some of our users, though, I believe preventing this is less
 important than the performance benefit.

   So should we let this patch series in without protection?

  No.

I think we need some sort of ownership model around the msi banks then.
Otherwise there's nothing preventing another userspace from
attempting an MSI based attack on other users, or perhaps even on
the host.  VFIO can't allow that.  Thanks,

   We have very few (3 MSI bank on most of chips), so we can not assign
   one to each userspace. What we can do is host and userspace does not
   share a MSI bank while userspace will share a MSI bank.

  Then you probably need VFIO to own the MSI bank and program devices
  into it rather than exposing the MSI banks to userspace to let them have
 direct access.

 Overall idea of exposing the details of msi regions to userspace are
  1) User space can define the aperture size to fit MSI mapping in IOMMU.
  2) setup iova for a MSI banks; which is just after guest memory.

 But currently we expose the size and address of MSI banks, passing address
 is of no use and can be problematic.

I am sorry, above information is not correct. Currently neither we

RE: [PATCH 1/9 v2] pci:msi: add weak function for returning msi region info

2013-11-28 Thread Bharat Bhushan



 -Original Message-
 From: linux-pci-ow...@vger.kernel.org [mailto:linux-pci-ow...@vger.kernel.org]
 On Behalf Of Bjorn Helgaas
 Sent: Tuesday, November 26, 2013 5:06 AM
 To: Bhushan Bharat-R65777
 Cc: alex.william...@redhat.com; j...@8bytes.org; ag...@suse.de; Wood Scott-
 B07421; Yoder Stuart-B08248; io...@lists.linux-foundation.org; linux-
 p...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
 ker...@vger.kernel.org; Bhushan Bharat-R65777
 Subject: Re: [PATCH 1/9 v2] pci:msi: add weak function for returning msi 
 region
 info
 
 On Tue, Nov 19, 2013 at 10:47:05AM +0530, Bharat Bhushan wrote:
  In aperture-type IOMMUs (like FSL PAMU), the VFIO-iommu system needs to
  know the MSI region to map its window in h/w. This patch only defines
  the required weak functions; they will be used by follow-up patches.
 
  Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
  ---
  v1-v2
   - Added description on struct msi_region
 
   drivers/pci/msi.c   |   22 ++
   include/linux/msi.h |   14 ++
   2 files changed, 36 insertions(+), 0 deletions(-)
 
  diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
  index d5f90d6..2643a29 100644
  --- a/drivers/pci/msi.c
  +++ b/drivers/pci/msi.c
  @@ -67,6 +67,28 @@ int __weak arch_msi_check_device(struct pci_dev *dev, int nvec, int type)
       return chip->check_device(chip, dev, nvec, type);
   }
 
  +int __weak arch_msi_get_region_count(void)
  +{
  +    return 0;
  +}
  +
  +int __weak arch_msi_get_region(int region_num, struct msi_region *region)
  +{
  +    return 0;
  +}
  +
  +int msi_get_region_count(void)
  +{
  +    return arch_msi_get_region_count();
  +}
  +EXPORT_SYMBOL(msi_get_region_count);
  +
  +int msi_get_region(int region_num, struct msi_region *region)
  +{
  +    return arch_msi_get_region(region_num, region);
  +}
  +EXPORT_SYMBOL(msi_get_region);
  +
   int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
   {
       struct msi_desc *entry;
  diff --git a/include/linux/msi.h b/include/linux/msi.h
  index b17ead8..ade1480 100644
  --- a/include/linux/msi.h
  +++ b/include/linux/msi.h
  @@ -51,6 +51,18 @@ struct msi_desc {
   };
 
   /*
  + * This structure is used to get
  + * - physical address
  + * - size
  + * of a msi region
  + */
  +struct msi_region {
  +    int region_num; /* MSI region number */
  +    dma_addr_t addr; /* Address of MSI region */
  +    size_t size; /* Size of MSI region */
  +};
  +
  +/*
    * The arch hooks to setup up msi irqs. Those functions are
    * implemented as weak symbols so that they /can/ be overriden by
    * architecture specific code if needed.
  @@ -64,6 +76,8 @@ void arch_restore_msi_irqs(struct pci_dev *dev, int irq);
 
   void default_teardown_msi_irqs(struct pci_dev *dev);
   void default_restore_msi_irqs(struct pci_dev *dev, int irq);
  +int arch_msi_get_region_count(void);
  +int arch_msi_get_region(int region_num, struct msi_region *region);
 
 It doesn't look like any of this (struct msi_region, msi_get_region(),
 msi_get_region_count()) is actually used by drivers/pci/msi.c, so I don't
 think it needs to be declared in generic code.  It looks like it's only used in
 drivers/vfio/vfio_iommu_fsl_pamu.c, where you already know you have an FSL
 IOMMU, and you can just call FSL-specific interfaces directly.
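
For readers unfamiliar with the weak-default pattern under discussion, it can be illustrated outside the kernel. This is a minimal sketch, assuming GCC or Clang (`__attribute__((weak))` is a compiler extension standing in for the kernel's `__weak`). The struct layout follows the patch, with `unsigned long long` as a stand-in for `dma_addr_t`.

```c
#include <assert.h>
#include <stddef.h>

struct msi_region {
    int region_num;           /* MSI region number */
    unsigned long long addr;  /* stand-in for dma_addr_t */
    size_t size;              /* size of MSI region */
};

/*
 * Weak defaults: if another object file provides strong definitions
 * with the same names, the linker uses those instead.
 */
__attribute__((weak)) int arch_msi_get_region_count(void)
{
    return 0;
}

__attribute__((weak)) int arch_msi_get_region(int region_num,
                                              struct msi_region *region)
{
    (void)region_num;
    (void)region;
    return 0;
}

/* Generic wrappers, as in the patch, minus EXPORT_SYMBOL. */
int msi_get_region_count(void)
{
    return arch_msi_get_region_count();
}

int msi_get_region(int region_num, struct msi_region *region)
{
    return arch_msi_get_region(region_num, region);
}
```

With only the weak defaults linked in, both wrappers return 0. The review point above is that when the only caller already knows it is on an FSL IOMMU, this generic indirection adds nothing over calling FSL-specific functions directly.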

Thanks Bjorn,

I want to be sure I understand what you are suggesting.

My understanding is that we should define these (struct msi_region,
msi_get_region(), msi_get_region_count()) in arch/powerpc/include/fsl_msi.h
(a new file) and include that header directly in
drivers/vfio/vfio_iommu_fsl_pamu.c.

Does the same also apply to msi_set_iova() in patch 5?

-Bharat

 
 Bjorn
 
 
   struct msi_chip {
  struct module *owner;
  --
  1.7.0.4
 
 



Error in frreing hugepages with preemption enabled

2013-11-28 Thread Bharat Bhushan

Hi Alex,

I am running a KVM guest with the host kernel having CONFIG_PREEMPT enabled.
With allocated pages things seem to work fine, but when I use hugepages for
the guest I see the prints below when I quit qemu.

(qemu) QEMU waiting for connection on: telnet:0.0.0.0:,server
qemu-system-ppc64: pci_add_option_rom: failed to find romfile efi-virtio.rom
q
debug_smp_processor_id: 15 callbacks suppressed
BUG: using smp_processor_id() in preemptible [] code: qemu-system-ppc/2504
caller is .free_hugepd_range+0xb0/0x21c
CPU: 1 PID: 2504 Comm: qemu-system-ppc Not tainted 3.12.0-rc3-07733-gabf4907 #175
Call Trace:
[c000fb433400] [c0007d38] .show_stack+0x7c/0x1cc (unreliable)
[c000fb4334d0] [c05e8ce0] .dump_stack+0x9c/0xf4
[c000fb433560] [c02de5ec] .debug_smp_processor_id+0x108/0x11c
[c000fb4335f0] [c0025e10] .free_hugepd_range+0xb0/0x21c
[c000fb433680] [c00265bc] .hugetlb_free_pgd_range+0x2c8/0x3b0
[c000fb4337a0] [c00e428c] .free_pgtables+0x14c/0x158
[c000fb433840] [c00ef320] .exit_mmap+0xec/0x194
[c000fb433960] [c004d780] .mmput+0x64/0x124
[c000fb4339e0] [c0051f40] .do_exit+0x29c/0x9c8
[c000fb433ae0] [c00527c8] .do_group_exit+0x50/0xc4
[c000fb433b70] [c00606a0] .get_signal_to_deliver+0x21c/0x5d8
[c000fb433c70] [c0009b08] .do_signal+0x54/0x278
[c000fb433db0] [c0009e50] .do_notify_resume+0x64/0x78
[c000fb433e30] [cb44] .ret_from_except_lite+0x70/0x74


This means that free_hugepd_range() must be called with preemption disabled.
I tried the change below and it seems to work fine (I do not have expertise in
this area, so I am not sure this is the correct way).

diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index d67db4b..6bf8459 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -563,8 +563,10 @@ static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
          */
         next = addr + (1 << hugepd_shift(*(hugepd_t *)pmd));
 #endif
+        preempt_disable();
         free_hugepd_range(tlb, (hugepd_t *)pmd, PMD_SHIFT,
                           addr, next, floor, ceiling);
+        preempt_enable();
     } while (addr = next, addr != end);
 
     start &= PUD_MASK;


Thanks
-Bharat


RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)

2013-11-27 Thread Bharat Bhushan

 -Original Message-
 From: Alex Williamson [mailto:alex.william...@redhat.com]
 Sent: Monday, November 25, 2013 10:08 PM
 To: Bhushan Bharat-R65777
 Cc: Wood Scott-B07421; linux-...@vger.kernel.org; ag...@suse.de; Yoder Stuart-
 B08248; io...@lists.linux-foundation.org; bhelg...@google.com; linuxppc-
 d...@lists.ozlabs.org; linux-ker...@vger.kernel.org
 Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)

 On Mon, 2013-11-25 at 05:33 +, Bharat Bhushan wrote:

   -Original Message-
   From: Alex Williamson [mailto:alex.william...@redhat.com]
   Sent: Friday, November 22, 2013 2:31 AM
   To: Wood Scott-B07421
   Cc: Bhushan Bharat-R65777; linux-...@vger.kernel.org; ag...@suse.de;
   Yoder Stuart-B08248; io...@lists.linux-foundation.org;
   bhelg...@google.com; linuxppc- d...@lists.ozlabs.org;
   linux-ker...@vger.kernel.org
   Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale
   IOMMU (PAMU)

   On Thu, 2013-11-21 at 14:47 -0600, Scott Wood wrote:
On Thu, 2013-11-21 at 13:43 -0700, Alex Williamson wrote:
 On Thu, 2013-11-21 at 11:20 +, Bharat Bhushan wrote:

   -Original Message-
   From: Alex Williamson [mailto:alex.william...@redhat.com]
   Sent: Thursday, November 21, 2013 12:17 AM
   To: Bhushan Bharat-R65777
   Cc: j...@8bytes.org; bhelg...@google.com; ag...@suse.de;
   Wood Scott-B07421; Yoder Stuart-B08248;
   io...@lists.linux-foundation.org; linux-
   p...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
   ker...@vger.kernel.org; Bhushan Bharat-R65777
   Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for
   Freescale IOMMU (PAMU)

   Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie. each
   vfio user has $COUNT regions at their disposal exclusively)?

  Number of msi-bank count is system wide and not per aperture,
  But will be
   setting windows for banks in the device aperture.
  So say if we are direct assigning 2 pci device (both have
  different iommu
   group, so 2 aperture in iommu) to VM.
  Now qemu can make only one call to know how many msi-banks are
  there but
   it must set sub-windows for all banks for both pci device in its
   respective aperture.

 I'm still confused.  What I want to make sure of is that the
 banks are independent per aperture.  For instance, if we have
 two separate userspace processes operating independently and
 they both chose to use msi bank zero for their device, that's
 bank zero within each aperture and doesn't interfere.  Or
 another way to ask is can a malicious user interfere with other users 
 by
 using the wrong bank.
 Thanks,

They can interfere.

  Want to be sure of how they can interfere?

 What happens if more than one user selects the same MSI bank?
 Minimally, wouldn't that result in the IOMMU blocking transactions from the
 previous user once the new user activates their mapping?

Yes and no: with the current implementation yes, but with a minor change no.
I will explain how later in this response.

With this hardware, the only way to prevent that
is to make sure that a bank is not shared by multiple protection 
contexts.
For some of our users, though, I believe preventing this is less
important than the performance benefit.

  So should we let this patch series in without protection?

 No.

   I think we need some sort of ownership model around the msi banks then.
   Otherwise there's nothing preventing another userspace from
   attempting an MSI based attack on other users, or perhaps even on
   the host.  VFIO can't allow that.  Thanks,

  We have very few (3 MSI bank on most of chips), so we can not assign
  one to each userspace. What we can do is host and userspace does not
  share a MSI bank while userspace will share a MSI bank.

 Then you probably need VFIO to own the MSI bank and program devices into it
 rather than exposing the MSI banks to userspace to let them have direct 
 access.

The overall idea of exposing the details of MSI regions to userspace is that:
 1) Userspace can define the aperture size to fit the MSI mapping in the IOMMU.
 2) Userspace can set up the iova for the MSI banks, which is just after guest memory.

But currently we expose both the size and the address of the MSI banks; passing
the address is of no use and can be problematic.
If we just provide the size of an MSI bank to userspace, then userspace cannot do
anything wrong.

It is still the responsibility of the host (MSI+VFIO) to compose the MSI address
and MSI data, so I think this should be fine.

 Thanks,

 Alex


RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)

2013-11-24 Thread Bharat Bhushan

 -Original Message-
 From: Alex Williamson [mailto:alex.william...@redhat.com]
 Sent: Friday, November 22, 2013 2:31 AM
 To: Wood Scott-B07421
 Cc: Bhushan Bharat-R65777; linux-...@vger.kernel.org; ag...@suse.de; Yoder
 Stuart-B08248; io...@lists.linux-foundation.org; bhelg...@google.com; 
 linuxppc-
 d...@lists.ozlabs.org; linux-ker...@vger.kernel.org
 Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)

 On Thu, 2013-11-21 at 14:47 -0600, Scott Wood wrote:
  On Thu, 2013-11-21 at 13:43 -0700, Alex Williamson wrote:
   On Thu, 2013-11-21 at 11:20 +, Bharat Bhushan wrote:

 -Original Message-
 From: Alex Williamson [mailto:alex.william...@redhat.com]
 Sent: Thursday, November 21, 2013 12:17 AM
 To: Bhushan Bharat-R65777
 Cc: j...@8bytes.org; bhelg...@google.com; ag...@suse.de; Wood
 Scott-B07421; Yoder Stuart-B08248;
 io...@lists.linux-foundation.org; linux- p...@vger.kernel.org;
 linuxppc-dev@lists.ozlabs.org; linux- ker...@vger.kernel.org;
 Bhushan Bharat-R65777
 Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale
 IOMMU (PAMU)

 Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie. each
 vfio user has $COUNT regions at their disposal exclusively)?

The MSI bank count is system-wide, not per aperture, but the windows for the
banks are set up in each device's aperture.
Say we directly assign 2 PCI devices (in different IOMMU groups, so 2 apertures
in the IOMMU) to a VM.
QEMU then needs only one call to learn how many MSI banks there are, but it must
set sub-windows for all banks for both PCI devices, each in its respective
aperture.

   I'm still confused.  What I want to make sure of is that the banks
   are independent per aperture.  For instance, if we have two separate
   userspace processes operating independently and they both chose to
   use msi bank zero for their device, that's bank zero within each
   aperture and doesn't interfere.  Or another way to ask is can a
   malicious user interfere with other users by using the wrong bank.
   Thanks,

  They can interfere.

I want to be sure I understand: how can they interfere?

  With this hardware, the only way to prevent that
  is to make sure that a bank is not shared by multiple protection contexts.
  For some of our users, though, I believe preventing this is less
  important than the performance benefit.

So should we let this patch series in without protection?

 I think we need some sort of ownership model around the msi banks then.
 Otherwise there's nothing preventing another userspace from attempting an MSI
 based attack on other users, or perhaps even on the host.  VFIO can't allow
 that.  Thanks,

We have very few MSI banks (3 on most chips), so we cannot assign one to each
userspace. What we can do is ensure that the host and userspace do not share an
MSI bank, while userspace instances share MSI banks among themselves.

Thanks
-Bharat

 Alex


RE: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)

2013-11-21 Thread Bharat Bhushan

 -Original Message-
 From: Alex Williamson [mailto:alex.william...@redhat.com]
 Sent: Thursday, November 21, 2013 12:17 AM
 To: Bhushan Bharat-R65777
 Cc: j...@8bytes.org; bhelg...@google.com; ag...@suse.de; Wood Scott-B07421;
 Yoder Stuart-B08248; io...@lists.linux-foundation.org; linux-
 p...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
 ker...@vger.kernel.org; Bhushan Bharat-R65777
 Subject: Re: [PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)

 On Tue, 2013-11-19 at 10:47 +0530, Bharat Bhushan wrote:
  From: Bharat Bhushan bharat.bhus...@freescale.com

  PAMU (FSL IOMMU) has a concept of primary window and subwindows.
  Primary window corresponds to the complete guest iova address space
  (including MSI space), with respect to IOMMU_API this is termed as
  geometry. IOVA Base of subwindow is determined from the number of
  subwindows (configurable using iommu API).
  MSI I/O page must be within the geometry and maximum supported
  subwindows, so MSI IO-page is setup just after guest memory iova space.

  So patch 1/9-4/9(inclusive) are for defining the interface to get:
- Number of MSI regions (which is number of MSI banks for powerpc)
- MSI-region address range: Physical page which have the
  address/addresses used for generating MSI interrupt
  and size of the page.

  Patches 5/9-7/9 (inclusive) define the interface for setting up the MSI
  iova-base for an MSI region (bank) for a device, so that when the
  MSI message is composed the configured iova is used.
  Earlier we were using the iommu interface to get the configured iova,
  which was not correct, and Alex Williamson suggested this type of interface.

  patch 8/9 moves some common functions in a separate file so that these
  can be used by FSL_PAMU implementation (next patch uses this).
  These will be used later for iommu-none implementation. I believe we
  can do more of this but will take step by step.

  Finally last patch actually adds the support for FSL-PAMU :)

 Patches 1-3: msi_get_region needs to return an error (probably
 -EINVAL) if called on a path where there's no backend implementation.
 Otherwise the caller doesn't know that the data in the region pointer isn't
 valid.

will correct.
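
For illustration, the requested behaviour could look roughly like the sketch below. This is a userspace stand-in, not the corrected patch: `default_msi_get_region` and `region_size_or_error` are hypothetical names, `uint64_t` stands in for `dma_addr_t`, and -EINVAL is only the value suggested in the review.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's struct msi_region from patch 1/9. */
struct msi_region {
	int region_num;   /* MSI region number */
	uint64_t addr;    /* stand-in for dma_addr_t */
	size_t size;      /* size of the MSI region */
};

/*
 * Hypothetical fallback used when no architecture backend exists.
 * Returning -EINVAL tells the caller that *region was not filled in.
 */
static int default_msi_get_region(int region_num, struct msi_region *region)
{
	(void)region_num;
	(void)region;
	return -EINVAL;
}

/* Hypothetical caller: only trusts *region when the call succeeds. */
static int region_size_or_error(int region_num, size_t *size_out)
{
	struct msi_region region;
	int ret = default_msi_get_region(region_num, &region);

	if (ret)
		return ret;
	*size_out = region.size;
	return 0;
}
```

With a non-zero return on the no-backend path, the caller never reads the uninitialised region, which is the concern raised above.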

 Patches 5 & 6: same as above for msi_set_iova, return an error if no backend
 implementation.

Ok

 Patch 7: Why does fsl_msi_del_iova_device bother to return anything if it's
 always zero?  Return -ENODEV when not found?

Will make -ENODEV.

 Patch 9:

 vfio_handle_get_attr() passes random kernel data back to userspace in the
 event of iommu_domain_get_attr() error.

Will correct.

 vfio_handle_set_attr(): I don't see any data validation happening, is
 iommu_domain_set_attr() really that safe?

We do not need any data validation here; the iommu driver does whatever is needed.
So yes, iommu_domain_set_attr() is safe.

 For both of those, drop the pr_err on unknown attribute, it's sufficient to
 return error.

ok

 Is VFIO_IOMMU_PAMU_GET_MSI_BANK_COUNT per aperture (ie. each vfio user has
 $COUNT regions at their disposal exclusively)?

The MSI bank count is system-wide, not per aperture, but the windows for the
banks are set up in each device's aperture.
Say we directly assign 2 PCI devices (in different IOMMU groups, so 2 apertures
in the IOMMU) to a VM.
QEMU then needs only one call to learn how many MSI banks there are, but it must
set sub-windows for all banks for both PCI devices, each in its respective
aperture.
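
As a rough sketch of that flow (a toy model only, not QEMU or VFIO code): the bank count is queried once, and then one sub-window per bank is set in every device aperture. `struct aperture`, `get_msi_bank_count` and `setup_msi_windows` are hypothetical names, and the count of 3 is just the figure mentioned in this thread.

```c
#include <stddef.h>

#define MAX_WINDOWS 16

/* Hypothetical model of one device aperture and the bank windows set in it. */
struct aperture {
	int bank_window_set[MAX_WINDOWS]; /* 1 once a window exists for bank i */
};

/* Hypothetical stand-in for the single, system-wide bank-count query. */
static int get_msi_bank_count(void)
{
	return 3; /* assumption: 3 banks, as mentioned in the thread */
}

/* Set one sub-window per bank in every aperture; returns windows created. */
static int setup_msi_windows(struct aperture *apertures, size_t n_apertures)
{
	int count = get_msi_bank_count(); /* queried once, not per aperture */
	int created = 0;

	if (count > MAX_WINDOWS)
		return -1;

	for (size_t a = 0; a < n_apertures; a++) {
		for (int bank = 0; bank < count; bank++) {
			apertures[a].bank_window_set[bank] = 1;
			created++;
		}
	}
	return created;
}
```

For two assigned devices this yields six windows in total: three banks in each of the two apertures.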

Thanks
-Bharat

  Thanks,

 Alex

  v1-v2
   - Added interface for setting msi iova for a msi region for a device.
 Earlier I added iommu interface for same but as per comment that is
 removed and now created a direct interface between vfio and msi.
   - Incorporated review comments (details is in individual patch)

  Bharat Bhushan (9):
pci:msi: add weak function for returning msi region info
pci: msi: expose msi region information functions
powerpc: pci: Add arch specific msi region interface
powerpc: msi: Extend the msi region interface to get info from
  fsl_msi
pci/msi: interface to set an iova for a msi region
powerpc: pci: Extend msi iova page setup to arch specific
pci: msi: Extend msi iova setting interface to powerpc arch
vfio: moving some functions in common file
vfio pci: Add vfio iommu implementation for FSL_PAMU

   arch/powerpc/include/asm/machdep.h |   10 +
   arch/powerpc/kernel/msi.c          |   28 +
   arch/powerpc/sysdev/fsl_msi.c      |  132 +-
   arch/powerpc/sysdev/fsl_msi.h      |   25 +-
   drivers/pci/msi.c                  |   35 ++
   drivers/vfio/Kconfig               |    6 +
   drivers/vfio/Makefile              |    5 +-
   drivers/vfio/vfio_iommu_common.c   |  227
   drivers/vfio/vfio_iommu_common.h   |   27 +
   drivers/vfio/vfio_iommu_fsl_pamu.c | 1003
   drivers/vfio/vfio_iommu_type1.c    |  206

[PATCH 0/9 v2] vfio-pci: add support for Freescale IOMMU (PAMU)

2013-11-18 Thread Bharat Bhushan

From: Bharat Bhushan bharat.bhus...@freescale.com

PAMU (FSL IOMMU) has a concept of primary window and subwindows.
Primary window corresponds to the complete guest iova address space
(including MSI space), with respect to IOMMU_API this is termed as
geometry. IOVA Base of subwindow is determined from the number of
subwindows (configurable using iommu API).
MSI I/O page must be within the geometry and maximum supported
subwindows, so MSI IO-page is setup just after guest memory iova space.
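
To make the geometry arithmetic concrete, here is a minimal sketch. It assumes equal-sized subwindows (aperture size divided by the subwindow count), guest memory starting at the aperture base, and subwindow n starting at base + n * size. `struct pamu_geometry` and the helper names are hypothetical and are not part of the iommu API.

```c
#include <stdint.h>

/* Hypothetical description of a PAMU-style geometry. */
struct pamu_geometry {
	uint64_t aperture_start;  /* iova where the primary window begins */
	uint64_t aperture_size;   /* power of two */
	uint32_t subwindow_count; /* power of two */
};

/* Each subwindow covers an equal slice of the aperture. */
static uint64_t subwindow_size(const struct pamu_geometry *g)
{
	return g->aperture_size / g->subwindow_count;
}

/* iova base of subwindow n (assumed layout: contiguous, in order). */
static uint64_t subwindow_base(const struct pamu_geometry *g, uint32_t n)
{
	return g->aperture_start + (uint64_t)n * subwindow_size(g);
}

/*
 * Index of the first subwindow lying entirely after guest memory,
 * i.e. a candidate slot for the MSI page.
 */
static uint32_t first_msi_subwindow(const struct pamu_geometry *g,
				    uint64_t guest_mem_size)
{
	uint64_t sw = subwindow_size(g);

	return (uint32_t)((guest_mem_size + sw - 1) / sw);
}
```

For a 1 GiB aperture with 256 subwindows, each subwindow is 4 MiB, so 512 MiB of guest memory occupies subwindows 0-127 and the MSI page would land in subwindow 128.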

So patch 1/9-4/9(inclusive) are for defining the interface to get:
  - Number of MSI regions (which is number of MSI banks for powerpc)
  - MSI-region address range: the physical page containing the
    address/addresses used for generating MSI interrupts,
    and the size of that page.
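
A caller of this interface might enumerate the regions roughly as follows. This is a userspace sketch: the two functions reuse the names from the series but are stand-in implementations, the register addresses are invented for the example, and `total_msi_space` is a hypothetical helper.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for struct msi_region (uint64_t replaces dma_addr_t). */
struct msi_region {
	int region_num;
	uint64_t addr;
	size_t size;
};

#define DEMO_BANKS     3
#define DEMO_PAGE_SIZE 4096u

/* Invented register addresses, for illustration only. */
static const uint64_t demo_msiir[DEMO_BANKS] = { 0x1140, 0x2140, 0x3140 };

/* Stand-in backend: number of MSI regions (banks). */
static int msi_get_region_count(void)
{
	return DEMO_BANKS;
}

/* Stand-in backend: report the page that contains the bank's register. */
static int msi_get_region(int region_num, struct msi_region *region)
{
	if (region_num < 0 || region_num >= DEMO_BANKS)
		return -ENODEV;

	region->region_num = region_num;
	region->size = DEMO_PAGE_SIZE;
	region->addr = demo_msiir[region_num] & ~((uint64_t)region->size - 1);
	return 0;
}

/* Sum the sizes of all regions: the iova space needed for the MSI pages. */
static long total_msi_space(void)
{
	long total = 0;
	int i, count = msi_get_region_count();

	for (i = 0; i < count; i++) {
		struct msi_region r;
		int ret = msi_get_region(i, &r);

		if (ret)
			return ret;
		total += (long)r.size;
	}
	return total;
}
```

The total gives an idea of how much iova space has to be left after guest memory for the MSI pages.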

Patches 5/9-7/9 (inclusive) define the interface for setting up the
MSI iova-base for an MSI region (bank) for a device, so that when the
MSI message is composed the configured iova is used.
Earlier we were using the iommu interface to get the configured iova,
which was not correct, and Alex Williamson suggested this type of interface.
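
The effect on the composed MSI address can be sketched as below. This is a simplified userspace model, not the driver code: `struct msi_iova_entry` and `compose_msi_address` are hypothetical, a flat array replaces the per-bank device list, a 4 KiB page is assumed, and the fallback simply returns the register's physical address.

```c
#include <stddef.h>
#include <stdint.h>

#define MSI_PAGE_MASK 0xfffULL /* offset bits within an assumed 4 KiB page */

/* Hypothetical record of a device with a configured MSI iova page. */
struct msi_iova_entry {
	const void *dev; /* identifies the device (a struct pci_dev * in the kernel) */
	uint64_t iova;   /* page-aligned iova configured for this device */
};

/*
 * If an iova page was configured for dev, the MSI address is that page
 * plus the register's offset within its page; otherwise the address is
 * used unchanged.
 */
static uint64_t compose_msi_address(const struct msi_iova_entry *entries,
				    size_t n, const void *dev,
				    uint64_t msiir_phys)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (entries[i].dev == dev)
			return entries[i].iova | (msiir_phys & MSI_PAGE_MASK);
	}
	return msiir_phys; /* no mapping configured */
}
```

Only the page part of the address changes; the offset of the register inside the page is preserved, so the device still hits the right register through the IOMMU mapping.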

patch 8/9 moves some common functions in a separate file so that these
can be used by FSL_PAMU implementation (next patch uses this).
These will be used later for iommu-none implementation. I believe we
can do more of this but will take step by step.

Finally last patch actually adds the support for FSL-PAMU :)

v1->v2
 - Added interface for setting msi iova for a msi region for a device.
   Earlier I added iommu interface for same but as per comment that is
   removed and now created a direct interface between vfio and msi.
 - Incorporated review comments (details is in individual patch)

Bharat Bhushan (9):
  pci:msi: add weak function for returning msi region info
  pci: msi: expose msi region information functions
  powerpc: pci: Add arch specific msi region interface
  powerpc: msi: Extend the msi region interface to get info from
fsl_msi
  pci/msi: interface to set an iova for a msi region
  powerpc: pci: Extend msi iova page setup to arch specific
  pci: msi: Extend msi iova setting interface to powerpc arch
  vfio: moving some functions in common file
  vfio pci: Add vfio iommu implementation for FSL_PAMU

 arch/powerpc/include/asm/machdep.h |   10 +
 arch/powerpc/kernel/msi.c          |   28 +
 arch/powerpc/sysdev/fsl_msi.c      |  132 +-
 arch/powerpc/sysdev/fsl_msi.h      |   25 +-
 drivers/pci/msi.c                  |   35 ++
 drivers/vfio/Kconfig               |    6 +
 drivers/vfio/Makefile              |    5 +-
 drivers/vfio/vfio_iommu_common.c   |  227
 drivers/vfio/vfio_iommu_common.h   |   27 +
 drivers/vfio/vfio_iommu_fsl_pamu.c | 1003
 drivers/vfio/vfio_iommu_type1.c    |  206 +
 include/linux/msi.h                |   14 +
 include/linux/pci.h                |   21 +
 include/uapi/linux/vfio.h          |  100
 14 files changed, 1623 insertions(+), 216 deletions(-)
 create mode 100644 drivers/vfio/vfio_iommu_common.c
 create mode 100644 drivers/vfio/vfio_iommu_common.h
 create mode 100644 drivers/vfio/vfio_iommu_fsl_pamu.c



[PATCH 2/9 v2] pci: msi: expose msi region information functions

2013-11-18 Thread Bharat Bhushan

Now that all the interfaces for getting the msi region have been defined,
this patch exposes them to the rest of the kernel. They will be used by
the vfio subsystem to set up the iommu for MSI interrupts of directly
assigned devices.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v1->v2
 - None

 include/linux/pci.h |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index da172f9..c587034 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1142,6 +1142,7 @@ struct msix_entry {
u16 entry;  /* driver uses to specify entry, OS writes */
 };
 
+struct msi_region;
 
 #ifndef CONFIG_PCI_MSI
 static inline int pci_enable_msi_block(struct pci_dev *dev, unsigned int nvec)
@@ -1184,6 +1185,16 @@ static inline int pci_msi_enabled(void)
 {
return 0;
 }
+
+static inline int msi_get_region_count(void)
+{
+   return 0;
+}
+
+static inline int msi_get_region(int region_num, struct msi_region *region)
+{
+   return 0;
+}
 #else
 int pci_enable_msi_block(struct pci_dev *dev, unsigned int nvec);
 int pci_enable_msi_block_auto(struct pci_dev *dev, unsigned int *maxvec);
@@ -1196,6 +1207,8 @@ void pci_disable_msix(struct pci_dev *dev);
 void msi_remove_pci_irq_vectors(struct pci_dev *dev);
 void pci_restore_msi_state(struct pci_dev *dev);
 int pci_msi_enabled(void);
+int msi_get_region_count(void);
+int msi_get_region(int region_num, struct msi_region *region);
 #endif
 
 #ifdef CONFIG_PCIEPORTBUS
-- 
1.7.0.4



[PATCH 1/9 v2] pci:msi: add weak function for returning msi region info

2013-11-18 Thread Bharat Bhushan

In an aperture type of IOMMU (like FSL PAMU), the VFIO-iommu system needs to know
the MSI region in order to map its window in h/w. This patch only defines the
required weak functions, which will be used by follow-up patches.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v1->v2
 - Added description on struct msi_region 

 drivers/pci/msi.c   |   22 ++
 include/linux/msi.h |   14 ++
 2 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index d5f90d6..2643a29 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -67,6 +67,28 @@ int __weak arch_msi_check_device(struct pci_dev *dev, int nvec, int type)
 	return chip->check_device(chip, dev, nvec, type);
 }
 
+int __weak arch_msi_get_region_count(void)
+{
+   return 0;
+}
+
+int __weak arch_msi_get_region(int region_num, struct msi_region *region)
+{
+   return 0;
+}
+
+int msi_get_region_count(void)
+{
+   return arch_msi_get_region_count();
+}
+EXPORT_SYMBOL(msi_get_region_count);
+
+int msi_get_region(int region_num, struct msi_region *region)
+{
+   return arch_msi_get_region(region_num, region);
+}
+EXPORT_SYMBOL(msi_get_region);
+
 int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
struct msi_desc *entry;
diff --git a/include/linux/msi.h b/include/linux/msi.h
index b17ead8..ade1480 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -51,6 +51,18 @@ struct msi_desc {
 };
 
 /*
+ * This structure is used to get
+ * - physical address
+ * - size
+ * of a msi region
+ */
+struct msi_region {
+   int region_num; /* MSI region number */
+   dma_addr_t addr; /* Address of MSI region */
+   size_t size; /* Size of MSI region */
+};
+
+/*
  * The arch hooks to setup up msi irqs. Those functions are
  * implemented as weak symbols so that they /can/ be overriden by
  * architecture specific code if needed.
@@ -64,6 +76,8 @@ void arch_restore_msi_irqs(struct pci_dev *dev, int irq);
 
 void default_teardown_msi_irqs(struct pci_dev *dev);
 void default_restore_msi_irqs(struct pci_dev *dev, int irq);
+int arch_msi_get_region_count(void);
+int arch_msi_get_region(int region_num, struct msi_region *region);
 
 struct msi_chip {
struct module *owner;
-- 
1.7.0.4



[PATCH 3/9 v2] powerpc: pci: Add arch specific msi region interface

2013-11-18 Thread Bharat Bhushan

This patch adds the interface to get the msi region information from arch
specific code. The machine specific code is not yet defined.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v1->v2
 - None

 arch/powerpc/include/asm/machdep.h |    8
 arch/powerpc/kernel/msi.c          |   18 ++
 2 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 8b48090..8d1b787 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -30,6 +30,7 @@ struct file;
 struct pci_controller;
 struct kimage;
 struct pci_host_bridge;
+struct msi_region;
 
 struct machdep_calls {
char*name;
@@ -124,6 +125,13 @@ struct machdep_calls {
int (*setup_msi_irqs)(struct pci_dev *dev,
  int nvec, int type);
void(*teardown_msi_irqs)(struct pci_dev *dev);
+
+   /* Returns the number of MSI regions (banks) */
+   int (*msi_get_region_count)(void);
+
+   /* Returns the requested region's address and size */
+   int (*msi_get_region)(int region_num,
+ struct msi_region *region);
 #endif
 
void(*restart)(char *cmd);
diff --git a/arch/powerpc/kernel/msi.c b/arch/powerpc/kernel/msi.c
index 8bbc12d..1a67787 100644
--- a/arch/powerpc/kernel/msi.c
+++ b/arch/powerpc/kernel/msi.c
@@ -13,6 +13,24 @@
 
 #include <asm/machdep.h>
 
+int arch_msi_get_region_count(void)
+{
+	if (ppc_md.msi_get_region_count) {
+		pr_debug("msi: Using platform get_region_count routine.\n");
+		return ppc_md.msi_get_region_count();
+	}
+	return 0;
+}
+
+int arch_msi_get_region(int region_num, struct msi_region *region)
+{
+	if (ppc_md.msi_get_region) {
+		pr_debug("msi: Using platform get_region routine.\n");
+		return ppc_md.msi_get_region(region_num, region);
+	}
+	return 0;
+}
+
 int arch_msi_check_device(struct pci_dev* dev, int nvec, int type)
 {
if (!ppc_md.setup_msi_irqs || !ppc_md.teardown_msi_irqs) {
-- 
1.7.0.4
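
The dispatch pattern used by this patch (call the machine-specific hook if one is installed, otherwise fall back to a default) can be sketched in isolation as follows. `struct platform_ops`, `platform`, `arch_get_region_count` and `demo_region_count` are hypothetical stand-ins for `struct machdep_calls`, `ppc_md` and the real hooks.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down analogue of struct machdep_calls. */
struct platform_ops {
	int (*msi_get_region_count)(void); /* NULL until a platform sets it */
};

/* Hypothetical analogue of ppc_md; zero-initialised, so no hook at start. */
static struct platform_ops platform;

/* Arch-level wrapper: use the platform hook if one is installed. */
static int arch_get_region_count(void)
{
	if (platform.msi_get_region_count)
		return platform.msi_get_region_count();
	return 0; /* no platform backend registered */
}

/* Example platform backend, standing in for what a later patch installs. */
static int demo_region_count(void)
{
	return 3;
}
```

Before any platform code registers its hook the wrapper reports zero regions; once a backend assigns the function pointer, the same call returns the platform's value.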



[PATCH 4/9 v2] powerpc: msi: Extend the msi region interface to get info from fsl_msi

2013-11-18 Thread Bharat Bhushan

The FSL MSI will provide the interface to get:
  - Number of MSI regions (which is number of MSI banks for powerpc)
  - The region address range: the physical page containing the
    address/addresses used for generating MSI interrupts,
    and the size of that page.

These are required to create IOMMU (Freescale PAMU) mapping for
devices which are directly assigned using VFIO.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v1->v2
 - Atomic increment of bank index for parallel probe of msi node 

 arch/powerpc/sysdev/fsl_msi.c |   42 +++-
 arch/powerpc/sysdev/fsl_msi.h |   11 -
 2 files changed, 45 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/sysdev/fsl_msi.c b/arch/powerpc/sysdev/fsl_msi.c
index 77efbae..eeebbf0 100644
--- a/arch/powerpc/sysdev/fsl_msi.c
+++ b/arch/powerpc/sysdev/fsl_msi.c
@@ -109,6 +109,34 @@ static int fsl_msi_init_allocator(struct fsl_msi *msi_data)
return 0;
 }
 
+static int fsl_msi_get_region_count(void)
+{
+	int count = 0;
+	struct fsl_msi *msi_data;
+
+	list_for_each_entry(msi_data, &msi_head, list)
+		count++;
+
+	return count;
+}
+
+static int fsl_msi_get_region(int region_num, struct msi_region *region)
+{
+	struct fsl_msi *msi_data;
+
+	list_for_each_entry(msi_data, &msi_head, list) {
+		if (msi_data->bank_index == region_num) {
+			region->region_num = msi_data->bank_index;
+			/* Setting PAGE_SIZE as MSIIR is a 4 byte register */
+			region->size = PAGE_SIZE;
+			region->addr = msi_data->msiir & ~(region->size - 1);
+			return 0;
+		}
+	}
+
+	return -ENODEV;
+}
+
 static int fsl_msi_check_device(struct pci_dev *pdev, int nvec, int type)
 {
if (type == PCI_CAP_ID_MSIX)
@@ -150,7 +178,8 @@ static void fsl_compose_msi_msg(struct pci_dev *pdev, int hwirq,
 	if (reg && (len == sizeof(u64)))
 		address = be64_to_cpup(reg);
 	else
-		address = fsl_pci_immrbar_base(hose) + msi_data->msiir_offset;
+		address = fsl_pci_immrbar_base(hose) +
+			   (msi_data->msiir & 0xf);
 
 	msg->address_lo = lower_32_bits(address);
 	msg->address_hi = upper_32_bits(address);
@@ -393,6 +422,7 @@ static int fsl_of_msi_probe(struct platform_device *dev)
 	const struct fsl_msi_feature *features;
 	int len;
 	u32 offset;
+	static atomic_t bank_index = ATOMIC_INIT(-1);
 
 	match = of_match_device(fsl_of_msi_ids, &dev->dev);
 	if (!match)
@@ -436,18 +466,15 @@ static int fsl_of_msi_probe(struct platform_device *dev)
 				dev->dev.of_node->full_name);
 			goto error_out;
 		}
-		msi->msiir_offset =
-			features->msiir_offset + (res.start & 0xf);
 
 		/*
 		 * First read the MSIIR/MSIIR1 offset from dts
 		 * On failure use the hardcode MSIIR offset
 		 */
 		if (of_address_to_resource(dev->dev.of_node, 1, &msiir))
-			msi->msiir_offset = features->msiir_offset +
-					    (res.start & MSIIR_OFFSET_MASK);
+			msi->msiir = res.start + features->msiir_offset;
 		else
-			msi->msiir_offset = msiir.start & MSIIR_OFFSET_MASK;
+			msi->msiir = msiir.start;
 	}
 
 	msi->feature = features->fsl_pic_ip;
@@ -521,6 +548,7 @@ static int fsl_of_msi_probe(struct platform_device *dev)
 		}
 	}
 
+	msi->bank_index = atomic_inc_return(&bank_index);
 	list_add_tail(&msi->list, &msi_head);
 
 	/* The multiple setting ppc_md.setup_msi_irqs will not harm things */
@@ -528,6 +556,8 @@ static int fsl_of_msi_probe(struct platform_device *dev)
 		ppc_md.setup_msi_irqs = fsl_setup_msi_irqs;
 		ppc_md.teardown_msi_irqs = fsl_teardown_msi_irqs;
 		ppc_md.msi_check_device = fsl_msi_check_device;
+		ppc_md.msi_get_region_count = fsl_msi_get_region_count;
+		ppc_md.msi_get_region = fsl_msi_get_region;
 	} else if (ppc_md.setup_msi_irqs != fsl_setup_msi_irqs) {
 		dev_err(&dev->dev, "Different MSI driver already installed!\n");
 		err = -ENODEV;
diff --git a/arch/powerpc/sysdev/fsl_msi.h b/arch/powerpc/sysdev/fsl_msi.h
index df9aa9f..a2cc5a2 100644
--- a/arch/powerpc/sysdev/fsl_msi.h
+++ b/arch/powerpc/sysdev/fsl_msi.h
@@ -31,14 +31,21 @@ struct fsl_msi {
struct irq_domain *irqhost;
 
unsigned long cascade_irq;
-
-   u32 msiir_offset; /* Offset of MSIIR, relative to start of CCSR */
+   phys_addr_t msiir; /* MSIIR Address in CCSR */
u32 ibs_shift; /* Shift of interrupt bit select */
u32 srs_shift; /* Shift of the shared interrupt

[PATCH 5/9 v2] pci/msi: interface to set an iova for a msi region

2013-11-18 Thread Bharat Bhushan

This patch defines an interface by which a msi page
can be mapped to a specific iova page.

This is a requirement in aperture type of IOMMUs (like Freescale PAMU),
where we map msi iova page just after guest memory iova address.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v2
 - new patch

 drivers/pci/msi.c   |   13 +
 include/linux/pci.h |    8
 2 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 2643a29..040609f 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -77,6 +77,19 @@ int __weak arch_msi_get_region(int region_num, struct msi_region *region)
return 0;
 }
 
+int __weak arch_msi_set_iova(struct pci_dev *pdev, int region_num,
+dma_addr_t iova, bool set)
+{
+   return 0;
+}
+
+int msi_set_iova(struct pci_dev *pdev, int region_num,
+dma_addr_t iova, bool set)
+{
+   return arch_msi_set_iova(pdev, region_num, iova, set);
+}
+EXPORT_SYMBOL(msi_set_iova);
+
 int msi_get_region_count(void)
 {
return arch_msi_get_region_count();
diff --git a/include/linux/pci.h b/include/linux/pci.h
index c587034..c6d3e58 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1195,6 +1195,12 @@ static inline int msi_get_region(int region_num, struct msi_region *region)
 {
return 0;
 }
+
+static inline int msi_set_iova(struct pci_dev *pdev, int region_num,
+  dma_addr_t iova, bool set)
+{
+   return 0;
+}
 #else
 int pci_enable_msi_block(struct pci_dev *dev, unsigned int nvec);
 int pci_enable_msi_block_auto(struct pci_dev *dev, unsigned int *maxvec);
@@ -1209,6 +1215,8 @@ void pci_restore_msi_state(struct pci_dev *dev);
 int pci_msi_enabled(void);
 int msi_get_region_count(void);
 int msi_get_region(int region_num, struct msi_region *region);
+int msi_set_iova(struct pci_dev *pdev, int region_num,
+dma_addr_t iova, bool set);
 #endif
 
 #ifdef CONFIG_PCIEPORTBUS
-- 
1.7.0.4



[PATCH 6/9 v2] powerpc: pci: Extend msi iova page setup to arch specific

2013-11-18 Thread Bharat Bhushan

This patch extends the interface to arch specific code for setting the
msi iova address for a msi page. Machine specific code is not yet
implemented.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v2
 - new patch

 arch/powerpc/include/asm/machdep.h |    2 ++
 arch/powerpc/kernel/msi.c          |   10 ++
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 8d1b787..e87b806 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -132,6 +132,8 @@ struct machdep_calls {
/* Returns the requested region's address and size */
int (*msi_get_region)(int region_num,
  struct msi_region *region);
+   int (*msi_set_iova)(struct pci_dev *pdev, int region_num,
+   dma_addr_t iova, bool set);
 #endif
 
void(*restart)(char *cmd);
diff --git a/arch/powerpc/kernel/msi.c b/arch/powerpc/kernel/msi.c
index 1a67787..e2bd555 100644
--- a/arch/powerpc/kernel/msi.c
+++ b/arch/powerpc/kernel/msi.c
@@ -13,6 +13,16 @@
 
 #include <asm/machdep.h>
 
+int arch_msi_set_iova(struct pci_dev *pdev, int region_num,
+		      dma_addr_t iova, bool set)
+{
+	if (ppc_md.msi_set_iova) {
+		pr_debug("msi: Using platform get_region_count routine.\n");
+		return ppc_md.msi_set_iova(pdev, region_num, iova, set);
+	}
+	return 0;
+}
+
 int arch_msi_get_region_count(void)
 {
if (ppc_md.msi_get_region_count) {
-- 
1.7.0.4



[PATCH 7/9 v2] pci: msi: Extend msi iova setting interface to powerpc arch

2013-11-18 Thread Bharat Bhushan

We now keep track of the devices that have an msi page mapped to a specific
iova page, for each msi bank. This list is traversed when composing the MSI
address and data. If the device is found in the list, its configured iova page
is used; otherwise the iova page is taken as before.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v2
 - new patch

 arch/powerpc/sysdev/fsl_msi.c |   90 +
 arch/powerpc/sysdev/fsl_msi.h |   16 ++-
 2 files changed, 104 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/sysdev/fsl_msi.c b/arch/powerpc/sysdev/fsl_msi.c
index eeebbf0..52d2beb 100644
--- a/arch/powerpc/sysdev/fsl_msi.c
+++ b/arch/powerpc/sysdev/fsl_msi.c
@@ -137,6 +137,75 @@ static int fsl_msi_get_region(int region_num, struct msi_region *region)
 	return -ENODEV;
 }
 
+/* Add device to the list of devices which have an iova page mapping */
+static int fsl_msi_add_iova_device(struct fsl_msi *msi_data,
+				   struct pci_dev *pdev, dma_addr_t iova)
+{
+	struct fsl_msi_device *device;
+
+	mutex_lock(&msi_data->lock);
+	list_for_each_entry(device, &msi_data->device_list, list) {
+		/* If mapping already exists then update with new page mapping */
+		if (device->dev == pdev) {
+			device->iova = iova;
+			mutex_unlock(&msi_data->lock);
+			return 0;
+		}
+	}
+
+	device = kzalloc(sizeof(struct fsl_msi_device), GFP_KERNEL);
+	if (!device) {
+		pr_err("%s: Memory allocation failed\n", __func__);
+		mutex_unlock(&msi_data->lock);
+		return -ENOMEM;
+	}
+
+	device->dev = pdev;
+	device->iova = iova;
+	list_add_tail(&device->list, &msi_data->device_list);
+	mutex_unlock(&msi_data->lock);
+	return 0;
+}
+
+/* Remove device from the list of devices which have an iova page mapping */
+static int fsl_msi_del_iova_device(struct fsl_msi *msi_data,
+				   struct pci_dev *pdev)
+{
+	struct fsl_msi_device *device;
+
+	mutex_lock(&msi_data->lock);
+	list_for_each_entry(device, &msi_data->device_list, list) {
+		if (device->dev == pdev) {
+			list_del(&device->list);
+			kfree(device);
+			break;
+		}
+	}
+	mutex_unlock(&msi_data->lock);
+	return 0;
+}
+
+/* set/clear device iova mapping for the requested msi region */
+static int fsl_msi_set_iova(struct pci_dev *pdev, int region_num,
+			    dma_addr_t iova, bool set)
+{
+	struct fsl_msi *msi_data;
+	int ret = -EINVAL;
+
+	list_for_each_entry(msi_data, &msi_head, list) {
+		if (msi_data->bank_index != region_num)
+			continue;
+
+		if (set)
+			ret = fsl_msi_add_iova_device(msi_data, pdev, iova);
+		else
+			ret = fsl_msi_del_iova_device(msi_data, pdev);
+
+		break;
+	}
+	return ret;
+}
+
 static int fsl_msi_check_device(struct pci_dev *pdev, int nvec, int type)
 {
if (type == PCI_CAP_ID_MSIX)
@@ -167,6 +236,7 @@ static void fsl_compose_msi_msg(struct pci_dev *pdev, int hwirq,
 				struct msi_msg *msg,
 				struct fsl_msi *fsl_msi_data)
 {
+	struct fsl_msi_device *device;
 	struct fsl_msi *msi_data = fsl_msi_data;
 	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
 	u64 address; /* Physical address of the MSIIR */
@@ -181,6 +251,15 @@ static void fsl_compose_msi_msg(struct pci_dev *pdev, int hwirq,
 		address = fsl_pci_immrbar_base(hose) +
 			   (msi_data->msiir & 0xf);
 
+	mutex_lock(&msi_data->lock);
+	list_for_each_entry(device, &msi_data->device_list, list) {
+		if (device->dev == pdev) {
+			address = device->iova | (msi_data->msiir & 0xfff);
+			break;
+		}
+	}
+	mutex_unlock(&msi_data->lock);
+
 	msg->address_lo = lower_32_bits(address);
 	msg->address_hi = upper_32_bits(address);
 
@@ -356,6 +435,7 @@ static int fsl_of_msi_remove(struct platform_device *ofdev)
 	struct fsl_msi *msi = platform_get_drvdata(ofdev);
 	int virq, i;
 	struct fsl_msi_cascade_data *cascade_data;
+	struct fsl_msi_device *device;
 
 	if (msi->list.prev != NULL)
 		list_del(&msi->list);
@@ -371,6 +451,13 @@ static int fsl_of_msi_remove(struct platform_device *ofdev)
 		msi_bitmap_free(&msi->bitmap);
 	if ((msi->feature & FSL_PIC_IP_MASK) != FSL_PIC_IP_VMPIC)
 		iounmap(msi->msi_regs);
+
+	mutex_lock(&msi->lock);
+	list_for_each_entry(device, &msi->device_list, list) {
+		list_del(&device->list);
+		kfree(device);
+	}
+	mutex_unlock(&msi->lock

[PATCH 8/9 v2] vfio: moving some functions in common file

2013-11-18 Thread Bharat Bhushan

Some functions defined in vfio_iommu_type1.c are generic (not specific
to the type1 iommu) and we want to use them for the FSL IOMMU (PAMU) and,
going forward, in the iommu-none driver.
So I have created a new file named vfio_iommu_common.c and moved some
of the generic functions into it.

I agree (with Alex Williamson and myself :-)) that some more functions
can be moved to this new common file (with some changes in type1/fsl_pamu
and others). But in this patch I avoided making those changes and
just moved the functions that are straightforward and allow me to
get the fsl-powerpc vfio framework in place.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v1->v2
 - removed un-necessary header file inclusion
 - mark static function which are internal to *common.c

 drivers/vfio/Makefile            |    4 +-
 drivers/vfio/vfio_iommu_common.c |  227 ++
 drivers/vfio/vfio_iommu_common.h |   27 +
 drivers/vfio/vfio_iommu_type1.c  |  206 +--
 4 files changed, 257 insertions(+), 207 deletions(-)
 create mode 100644 drivers/vfio/vfio_iommu_common.c
 create mode 100644 drivers/vfio/vfio_iommu_common.h

diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index 72bfabc..c5792ec 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -1,4 +1,4 @@
 obj-$(CONFIG_VFIO) += vfio.o
-obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
-obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
+obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_common.o vfio_iommu_type1.o
+obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_common.o vfio_iommu_spapr_tce.o
 obj-$(CONFIG_VFIO_PCI) += pci/
diff --git a/drivers/vfio/vfio_iommu_common.c b/drivers/vfio/vfio_iommu_common.c
new file mode 100644
index 000..08eea71
--- /dev/null
+++ b/drivers/vfio/vfio_iommu_common.c
@@ -0,0 +1,227 @@
+/*
+ * VFIO: Common code for vfio IOMMU support
+ *
+ * Copyright (C) 2012 Red Hat, Inc.  All rights reserved.
+ * Author: Alex Williamson alex.william...@redhat.com
+ * Author: Bharat Bhushan bharat.bhus...@freescale.com
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Derived from original vfio:
+ * Copyright 2010 Cisco Systems, Inc.  All rights reserved.
+ * Author: Tom Lyon, p...@cisco.com
+ */
+
+#include linux/compat.h
+#include linux/iommu.h
+#include linux/module.h
+#include linux/mm.h
+#include linux/slab.h
+#include linux/workqueue.h
+
+static bool disable_hugepages;
+module_param_named(disable_hugepages,
+  disable_hugepages, bool, S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(disable_hugepages,
+Disable VFIO IOMMU support for IOMMU hugepages.);
+
+struct vwork {
+   struct mm_struct*mm;
+   longnpage;
+   struct work_struct  work;
+};
+
+/* delayed decrement/increment for locked_vm */
+static void vfio_lock_acct_bg(struct work_struct *work)
+{
+   struct vwork *vwork = container_of(work, struct vwork, work);
+   struct mm_struct *mm;
+
+   mm = vwork-mm;
+   down_write(mm-mmap_sem);
+   mm-locked_vm += vwork-npage;
+   up_write(mm-mmap_sem);
+   mmput(mm);
+   kfree(vwork);
+}
+
+void vfio_lock_acct(long npage)
+{
+   struct vwork *vwork;
+   struct mm_struct *mm;
+
+   if (!current->mm || !npage)
+   return; /* process exited or nothing to do */
+
+   if (down_write_trylock(&current->mm->mmap_sem)) {
+   current->mm->locked_vm += npage;
+   up_write(&current->mm->mmap_sem);
+   return;
+   }
+
+   /*
+* Couldn't get mmap_sem lock, so must setup to update
+* mm->locked_vm later. If locked_vm were atomic, we
+* wouldn't need this silliness
+*/
+   vwork = kmalloc(sizeof(struct vwork), GFP_KERNEL);
+   if (!vwork)
+   return;
+   mm = get_task_mm(current);
+   if (!mm) {
+   kfree(vwork);
+   return;
+   }
+   INIT_WORK(&vwork->work, vfio_lock_acct_bg);
+   vwork->mm = mm;
+   vwork->npage = npage;
+   schedule_work(&vwork->work);
+}
+
+/*
+ * Some mappings aren't backed by a struct page, for example an mmap'd
+ * MMIO range for our own or another device.  These use a different
+ * pfn conversion and shouldn't be tracked as locked pages.
+ */
+static bool is_invalid_reserved_pfn(unsigned long pfn)
+{
+   if (pfn_valid(pfn)) {
+   bool reserved;
+   struct page *tail = pfn_to_page(pfn);
+   struct page *head = compound_trans_head(tail);
+   reserved = !!(PageReserved(head));
+   if (head != tail) {
+   /*
+* head is not a dangling pointer
+* (compound_trans_head takes care

[PATCH 9/9 v2] vfio pci: Add vfio iommu implementation for FSL_PAMU

2013-11-18 Thread Bharat Bhushan

This patch adds vfio iommu support for Freescale IOMMU (PAMU -
Peripheral Access Management Unit).

The Freescale PAMU is an aperture-based IOMMU with the following
characteristics.  Each device has an entry in a table in memory
describing the iova->phys mapping. The mapping has:
   -an overall aperture that is power of 2 sized, and has a start iova that
is naturally aligned
   -has 1 or more windows within the aperture
   -number of windows must be power of 2, max is 256
   -size of each window is determined by aperture size / # of windows
   -iova of each window is determined by aperture start iova / # of windows
   -the mapped region in each window can differ from the window size,
but the mapping size must be a power of 2
   -physical address of the mapping must be naturally aligned
with the mapping size
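
The geometry rules above can be sketched in a few lines of C. This is only an
illustration, not driver code: the names (pamu_geom, pamu_window_size,
pamu_window_iova, pamu_mapping_valid) are hypothetical, and the sketch assumes
that the windows are laid out back to back, so window i starts at the aperture
start plus i times the window size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one aperture; not a kernel structure. */
struct pamu_geom {
	uint64_t aperture_start;  /* naturally aligned start iova */
	uint64_t aperture_size;   /* must be a power of 2 */
	uint32_t nr_windows;      /* must be a power of 2, at most 256 */
};

static bool is_pow2(uint64_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* Check the aperture constraints listed in the commit message. */
static bool pamu_geom_valid(const struct pamu_geom *g)
{
	if (!is_pow2(g->aperture_size) || !is_pow2(g->nr_windows))
		return false;
	if (g->nr_windows > 256 || g->nr_windows > g->aperture_size)
		return false;
	/* natural alignment: the start is a multiple of the size */
	return (g->aperture_start & (g->aperture_size - 1)) == 0;
}

/* Window size is the aperture size divided by the number of windows. */
static uint64_t pamu_window_size(const struct pamu_geom *g)
{
	return g->aperture_size / g->nr_windows;
}

/* Assumption: windows are contiguous, starting at the aperture start. */
static uint64_t pamu_window_iova(const struct pamu_geom *g, uint32_t win)
{
	return g->aperture_start + (uint64_t)win * pamu_window_size(g);
}

/* A mapping must be power-of-2 sized, fit its window, and be aligned. */
static bool pamu_mapping_valid(const struct pamu_geom *g,
			       uint64_t phys, uint64_t size)
{
	return is_pow2(size) && size <= pamu_window_size(g) &&
	       (phys & (size - 1)) == 0;
}
```

For a 1 GiB aperture at 0x40000000 split into 4 windows, each window is
256 MiB and, under the assumption above, window 2 starts at 0x60000000.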

Some of the code is derived from TYPE1 iommu (drivers/vfio/vfio_iommu_type1.c).

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v1->v2
 - Use lock around msi-dma list
 - check for overlap between dma and msi-dma pages
 - Some code cleanup as per various comments

 drivers/vfio/Kconfig   |6 +
 drivers/vfio/Makefile  |1 +
 drivers/vfio/vfio_iommu_fsl_pamu.c | 1003 
 include/uapi/linux/vfio.h  |  100 
 4 files changed, 1110 insertions(+), 0 deletions(-)
 create mode 100644 drivers/vfio/vfio_iommu_fsl_pamu.c

diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
index 26b3d9d..7d1da26 100644
--- a/drivers/vfio/Kconfig
+++ b/drivers/vfio/Kconfig
@@ -8,11 +8,17 @@ config VFIO_IOMMU_SPAPR_TCE
depends on VFIO && SPAPR_TCE_IOMMU
default n
 
+config VFIO_IOMMU_FSL_PAMU
+   tristate
+   depends on VFIO
+   default n
+
 menuconfig VFIO
tristate "VFIO Non-Privileged userspace driver framework"
depends on IOMMU_API
select VFIO_IOMMU_TYPE1 if X86
select VFIO_IOMMU_SPAPR_TCE if (PPC_POWERNV || PPC_PSERIES)
+   select VFIO_IOMMU_FSL_PAMU if FSL_PAMU
help
  VFIO provides a framework for secure userspace device drivers.
  See Documentation/vfio.txt for more details.
diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index c5792ec..7461350 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -1,4 +1,5 @@
 obj-$(CONFIG_VFIO) += vfio.o
 obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_common.o vfio_iommu_type1.o
 obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_common.o vfio_iommu_spapr_tce.o
+obj-$(CONFIG_VFIO_IOMMU_FSL_PAMU) += vfio_iommu_common.o vfio_iommu_fsl_pamu.o
 obj-$(CONFIG_VFIO_PCI) += pci/
diff --git a/drivers/vfio/vfio_iommu_fsl_pamu.c b/drivers/vfio/vfio_iommu_fsl_pamu.c
new file mode 100644
index 000..66efc84
--- /dev/null
+++ b/drivers/vfio/vfio_iommu_fsl_pamu.c
@@ -0,0 +1,1003 @@
+/*
+ * VFIO: IOMMU DMA mapping support for FSL PAMU IOMMU
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright (C) 2013 Freescale Semiconductor, Inc.
+ *
+ * Author: Bharat Bhushan bharat.bhus...@freescale.com
+ *
+ * This file is derived from drivers/vfio/vfio_iommu_type1.c
+ *
+ * The Freescale PAMU is an aperture-based IOMMU with the following
+ * characteristics.  Each device has an entry in a table in memory
+ * describing the iova->phys mapping. The mapping has:
+ *  -an overall aperture that is power of 2 sized, and has a start iova that
+ *   is naturally aligned
+ *  -has 1 or more windows within the aperture
+ * -number of windows must be power of 2, max is 256
+ * -size of each window is determined by aperture size / # of windows
+ * -iova of each window is determined by aperture start iova / # of windows
+ * -the mapped region in each window can differ from
+ *  the window size, but the mapping size must be a power of 2
+ * -physical address of the mapping must be naturally aligned
+ *  with the mapping size
+ */
+
+#include <linux/compat.h>
+#include <linux/device.h>
+#include <linux/fs.h>
+#include <linux/iommu.h>
+#include <linux/module.h>
+#include <linux/mm.h>
+#include <linux/pci.h> /* pci_bus_type */
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/vfio.h>
+#include <linux/hugetlb.h>
+#include <linux/msi.h>
+#include <asm/fsl_pamu_stash.h>
+
+#include "vfio_iommu_common.h"
+
+#define DRIVER_VERSION  "0.1"
+#define DRIVER_AUTHOR   "Bharat Bhushan <bharat.bhus...@freescale.com>"

RE: [PATCH v9] PPC: POWERNV: move iommu_add_device earlier

2013-11-13 Thread Bharat Bhushan



 -Original Message-
 From: Alexey Kardashevskiy [mailto:a...@ozlabs.ru]
 Sent: Wednesday, November 13, 2013 12:00 PM
 To: linuxppc-dev@lists.ozlabs.org
 Cc: Alexey Kardashevskiy; Benjamin Herrenschmidt; Bhushan Bharat-R65777; Alex
 Graf; linux-ker...@vger.kernel.org
 Subject: [PATCH v9] PPC: POWERNV: move iommu_add_device earlier
 
 The current implementation of IOMMU on sPAPR does not use iommu_ops
 and therefore does not call IOMMU API's bus_set_iommu() which
 1) sets iommu_ops for a bus
 2) registers a bus notifier
 Instead, PCI devices are added to IOMMU groups from
 subsys_initcall_sync(tce_iommu_init) which does basically the same
 thing without using iommu_ops callbacks.
 
 However Freescale PAMU driver (https://lkml.org/lkml/2013/7/1/158)
 implements iommu_ops and when tce_iommu_init is called, every PCI device
 is already added to some group so there is a conflict.
 
 This patch does 2 things:
 1. removes the loop in which PCI devices were added to groups and
 adds explicit iommu_add_device() calls to add devices as soon as they get
 the iommu_table pointer assigned to them.
 2. moves a bus notifier to powernv code in order to avoid conflict with
 the notifier from Freescale driver.
 
 iommu_add_device() and iommu_del_device() are public now.
 
 Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru

Tested-by: Bharat Bhushan bharat.bhus...@freescale.com

 ---
 Changes:
 v9:
 * removed KVM from the subject as it is not really a KVM patch so
 PPC maintainer (hi Ben!) can review/include it into his tree
 
 v8:
 * added the check for iommu_group!=NULL before removing device from a group
 as suggested by Wei Yang weiy...@linux.vnet.ibm.com
 
 v2:
 * added a helper - set_iommu_table_base_and_group - which does
 set_iommu_table_base() and iommu_add_device()
 ---
  arch/powerpc/include/asm/iommu.h|  9 +++
  arch/powerpc/kernel/iommu.c | 41 +++--
  arch/powerpc/platforms/powernv/pci-ioda.c   |  8 +++---
  arch/powerpc/platforms/powernv/pci-p5ioc2.c |  2 +-
  arch/powerpc/platforms/powernv/pci.c| 33 ++-
  arch/powerpc/platforms/pseries/iommu.c  |  8 +++---
  6 files changed, 55 insertions(+), 46 deletions(-)
 
 diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
 index c34656a..19ad77f 100644
 --- a/arch/powerpc/include/asm/iommu.h
 +++ b/arch/powerpc/include/asm/iommu.h
 @@ -103,6 +103,15 @@ extern struct iommu_table *iommu_init_table(struct iommu_table * tbl,
   int nid);
  extern void iommu_register_group(struct iommu_table *tbl,
int pci_domain_number, unsigned long pe_num);
 +extern int iommu_add_device(struct device *dev);
 +extern void iommu_del_device(struct device *dev);
 +
 +static inline void set_iommu_table_base_and_group(struct device *dev,
 +   void *base)
 +{
 + set_iommu_table_base(dev, base);
 + iommu_add_device(dev);
 +}
 
  extern int iommu_map_sg(struct device *dev, struct iommu_table *tbl,
   struct scatterlist *sglist, int nelems,
 diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
 index 572bb5b..ecbf468 100644
 --- a/arch/powerpc/kernel/iommu.c
 +++ b/arch/powerpc/kernel/iommu.c
 @@ -1105,7 +1105,7 @@ void iommu_release_ownership(struct iommu_table *tbl)
  }
  EXPORT_SYMBOL_GPL(iommu_release_ownership);
 
 -static int iommu_add_device(struct device *dev)
 +int iommu_add_device(struct device *dev)
  {
   struct iommu_table *tbl;
   int ret = 0;
 @@ -1134,46 +1134,13 @@ static int iommu_add_device(struct device *dev)
 
   return ret;
  }
 +EXPORT_SYMBOL_GPL(iommu_add_device);
 
 -static void iommu_del_device(struct device *dev)
 +void iommu_del_device(struct device *dev)
  {
   iommu_group_remove_device(dev);
  }
 -
 -static int iommu_bus_notifier(struct notifier_block *nb,
 -   unsigned long action, void *data)
 -{
 - struct device *dev = data;
 -
 - switch (action) {
 - case BUS_NOTIFY_ADD_DEVICE:
 - return iommu_add_device(dev);
 - case BUS_NOTIFY_DEL_DEVICE:
 - iommu_del_device(dev);
 - return 0;
 - default:
 - return 0;
 - }
 -}
 -
 -static struct notifier_block tce_iommu_bus_nb = {
 - .notifier_call = iommu_bus_notifier,
 -};
 -
 -static int __init tce_iommu_init(void)
 -{
 - struct pci_dev *pdev = NULL;
 -
 - BUILD_BUG_ON(PAGE_SIZE < IOMMU_PAGE_SIZE);
 -
 - for_each_pci_dev(pdev)
 - iommu_add_device(&pdev->dev);
 -
 - bus_register_notifier(&pci_bus_type, &tce_iommu_bus_nb);
 - return 0;
 -}
 -
 -subsys_initcall_sync(tce_iommu_init);
 +EXPORT_SYMBOL_GPL(iommu_del_device);
 
  #else
 
 diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
 index 084cdfa..614356c 100644
 --- a/arch/powerpc/platforms/powernv

RE: [PATCH v5 4/4] powerpc/85xx: add sysfs for pw20 state and altivec idle

2013-11-10 Thread Bharat Bhushan

  Those codes just for discuss with Bharat. He want to make one flow at
  show_pw20_wait_time/ show_altivec_idle_wait_time function. If we
  do that, we need to initialize pw20_wt/altivec_idle_wt.
 
 I will keep this stuff at show_pw20_wait_time/show_altivec_idle_wait_time
 and add a comment before our discussion.
 
 /*
  * If the value is less than 10, this calculation will overflow.
  * Benchmark testing shows the default wait bit is never set below 10.
  * A wait bit of 10 corresponds to a wait entry time of
  * 439375573401999609 ns, which is far too long to use as a default
  * wait-entry time. So the overflow cannot happen with the default, and
  * we use this calculation to get the wait-entry-idle time.
  */

Since the same calculation code will now be used for both the default value
and a user-set value, adding the comment is not sufficient. The code should
return an error if the value is less than 10. The default value is never less
than 10, so the default will always work, but if the user tries to set a value
below 10 we should fail and ask the user to try a value greater than 9.
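
A minimal sketch of the check suggested here, assuming the value being
validated is the wait bit and that -EINVAL is an acceptable error code.
MIN_WAIT_BIT, MAX_WAIT_BIT and check_wait_bit() are hypothetical names, not
taken from the patch.

```c
#include <errno.h>

/* Assumption: values below 10 overflow the wait time calculation. */
#define MIN_WAIT_BIT 10U
/* The time base is treated as 64 bits wide, so bit numbers are 0..63. */
#define MAX_WAIT_BIT 63U

/* Returns 0 if the wait bit is usable, -EINVAL otherwise. */
static int check_wait_bit(unsigned int bit)
{
	if (bit < MIN_WAIT_BIT || bit > MAX_WAIT_BIT)
		return -EINVAL;
	return 0;
}
```

A sysfs store handler could call such a helper before touching the register
and return the error to the writer unchanged.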

-Bharat
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

RE: [PATCH v4 1/4] dma: imx-sdma: Add sdma firmware version 2 support

2013-11-09 Thread Bharat Bhushan



 -Original Message-
 From: Linuxppc-dev [mailto:linuxppc-dev-bounces+bharat.bhushan=freescale@lists.ozlabs.org] On Behalf Of Nicolin Chen
 Sent: Friday, November 08, 2013 4:20 PM
 To: vinod.k...@intel.com; dan.j.willi...@intel.com; s.ha...@pengutronix.de;
 ti...@tabi.org; shawn@linaro.org; broo...@kernel.org
 Cc: mark.rutl...@arm.com; devicet...@vger.kernel.org; alsa-devel@alsa-
 project.org; pawel.m...@arm.com; linux-...@vger.kernel.org;
 swar...@wwwdotorg.org; linux-ker...@vger.kernel.org; rob.herr...@calxeda.com;
 dmaeng...@vger.kernel.org; ijc+devicet...@hellion.org.uk; linuxppc-
 d...@lists.ozlabs.org; linux-arm-ker...@lists.infradead.org
 Subject: [PATCH v4 1/4] dma: imx-sdma: Add sdma firmware version 2 support
 
 On i.MX5/6 series, SDMA is using new version firmware to support SSI dual FIFO
 feature and HDMI Audio (i.MX6Q/DL only). Thus add it.
 
 Signed-off-by: Nicolin Chen b42...@freescale.com
 ---
  drivers/dma/imx-sdma.c | 15 ++-
  include/linux/platform_data/dma-imx-sdma.h |  5 +
  2 files changed, 19 insertions(+), 1 deletion(-)
 
 diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
 index fc43603..c7ece8d 100644
 --- a/drivers/dma/imx-sdma.c
 +++ b/drivers/dma/imx-sdma.c
 @@ -323,6 +323,7 @@ struct sdma_engine {
   struct clk  *clk_ipg;
   struct clk  *clk_ahb;
   spinlock_t  channel_0_lock;
 + u32 script_number;
   struct sdma_script_start_addrs  *script_addrs;
   const struct sdma_driver_data   *drvdata;
  };
 @@ -1238,6 +1239,7 @@ static void sdma_issue_pending(struct dma_chan *chan)
  }
 
  #define SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V1  34
 +#define SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V2  38
 
  static void sdma_add_scripts(struct sdma_engine *sdma,
    const struct sdma_script_start_addrs *addr)
 @@ -1246,7 +1248,7 @@ static void sdma_add_scripts(struct sdma_engine *sdma,
    s32 *saddr_arr = (u32 *)sdma->script_addrs;
    int i;
 
 - for (i = 0; i < SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V1; i++)
 + for (i = 0; i < sdma->script_number; i++)
    if (addr_arr[i] > 0)
    saddr_arr[i] = addr_arr[i];
  }
 @@ -1272,6 +1274,17 @@ static void sdma_load_firmware(const struct firmware *fw, void *context)
    goto err_firmware;
    if (header->ram_code_start + header->ram_code_size > fw->size)
    goto err_firmware;
 + switch (header->version_major) {
 + case 1:
 + sdma->script_number = SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V1;
 + break;
 + case 2:
 + sdma->script_number = SDMA_SCRIPT_ADDRS_ARRAY_SIZE_V2;
 + break;
 + default:
 + dev_err(sdma->dev, "unknown firmware version\n");

Why return here rather than goto err_firmware?

-Bharat

 + }
 
   addr = (void *)header + header->script_addrs_start;
   ram_code = (void *)header + header->ram_code_start;
 diff --git a/include/linux/platform_data/dma-imx-sdma.h b/include/linux/platform_data/dma-imx-sdma.h
 index 3a39428..eabac4e 100644
 --- a/include/linux/platform_data/dma-imx-sdma.h
 +++ b/include/linux/platform_data/dma-imx-sdma.h
 @@ -43,6 +43,11 @@ struct sdma_script_start_addrs {
   s32 dptc_dvfs_addr;
   s32 utra_addr;
   s32 ram_code_start_addr;
 + /* End of v1 array */
 + s32 mcu_2_ssish_addr;
 + s32 ssish_2_mcu_addr;
 + s32 hdmi_dma_addr;
 + /* End of v2 array */
  };
 
  /**
 --
 1.8.4
 
 

RE: [PATCH v5 4/4] powerpc/85xx: add sysfs for pw20 state and altivec idle

2013-11-05 Thread Bharat Bhushan



 -Original Message-
 From: Wang Dongsheng-B40534
 Sent: Tuesday, November 05, 2013 8:40 AM
 To: Wood Scott-B07421
 Cc: Bhushan Bharat-R65777; linuxppc-dev@lists.ozlabs.org
 Subject: RE: [PATCH v5 4/4] powerpc/85xx: add sysfs for pw20 state and altivec
 idle
 
 
 
  -Original Message-
  From: Wood Scott-B07421
  Sent: Tuesday, November 05, 2013 5:52 AM
  To: Wang Dongsheng-B40534
  Cc: Wood Scott-B07421; Bhushan Bharat-R65777; linuxppc-
  d...@lists.ozlabs.org
  Subject: Re: [PATCH v5 4/4] powerpc/85xx: add sysfs for pw20 state and
  altivec idle
 
  On Sun, 2013-11-03 at 22:04 -0600, Wang Dongsheng-B40534 wrote:
-Original Message-
From: Wang Dongsheng-B40534
Sent: Monday, October 21, 2013 11:11 AM
To: Wood Scott-B07421
Cc: Bhushan Bharat-R65777; linuxppc-dev@lists.ozlabs.org
Subject: RE: [PATCH v5 4/4] powerpc/85xx: add sysfs for pw20 state
and altivec idle
   
   
   
 -Original Message-
 From: Wood Scott-B07421
 Sent: Saturday, October 19, 2013 3:22 AM
 To: Wang Dongsheng-B40534
 Cc: Bhushan Bharat-R65777; Wood Scott-B07421; linuxppc-
 d...@lists.ozlabs.org
 Subject: Re: [PATCH v5 4/4] powerpc/85xx: add sysfs for pw20
 state and altivec idle

 On Thu, 2013-10-17 at 22:02 -0500, Wang Dongsheng-B40534 wrote:
 
   -Original Message-
   From: Bhushan Bharat-R65777
   Sent: Thursday, October 17, 2013 2:46 PM
   To: Wang Dongsheng-B40534; Wood Scott-B07421
   Cc: linuxppc-dev@lists.ozlabs.org
   Subject: RE: [PATCH v5 4/4] powerpc/85xx: add sysfs for pw20
   state and altivec idle
  
  
  
  -Original Message-
  From: Wang Dongsheng-B40534
  Sent: Thursday, October 17, 2013 11:22 AM
  To: Bhushan Bharat-R65777; Wood Scott-B07421
  Cc: linuxppc-dev@lists.ozlabs.org
  Subject: RE: [PATCH v5 4/4] powerpc/85xx: add sysfs
  for
  pw20 state and altivec idle
 
 
 
   -Original Message-
   From: Bhushan Bharat-R65777
   Sent: Thursday, October 17, 2013 11:20 AM
   To: Wang Dongsheng-B40534; Wood Scott-B07421
   Cc: linuxppc-dev@lists.ozlabs.org
   Subject: RE: [PATCH v5 4/4] powerpc/85xx: add sysfs
   for
   pw20 state and altivec idle
  
  
  
-Original Message-
From: Wang Dongsheng-B40534
Sent: Thursday, October 17, 2013 8:16 AM
To: Bhushan Bharat-R65777; Wood Scott-B07421
Cc: linuxppc-dev@lists.ozlabs.org
Subject: RE: [PATCH v5 4/4] powerpc/85xx: add
sysfs for
pw20 state and altivec idle
   
   
   
 -Original Message-
 From: Bhushan Bharat-R65777
 Sent: Thursday, October 17, 2013 1:01 AM
 To: Wang Dongsheng-B40534; Wood Scott-B07421
 Cc: linuxppc-dev@lists.ozlabs.org
 Subject: RE: [PATCH v5 4/4] powerpc/85xx: add
 sysfs for
 pw20 state and altivec idle



  -Original Message-
  From: Wang Dongsheng-B40534
  Sent: Tuesday, October 15, 2013 2:51 PM
  To: Wood Scott-B07421
  Cc: Bhushan Bharat-R65777;
  linuxppc-dev@lists.ozlabs.org; Wang
 Dongsheng-B40534
  Subject: [PATCH v5 4/4] powerpc/85xx: add
  sysfs for
  pw20 state and
 altivec idle
 
  From: Wang Dongsheng
  dongsheng.w...@freescale.com
 
  Add a sysfs interface to enable/disable the pw20 state or altivec idle,
  and to control the wait entry time.
 
  Enable/Disable interface:
  0, disable. 1, enable.
  /sys/devices/system/cpu/cpuX/pw20_state
  /sys/devices/system/cpu/cpuX/altivec_idle
 
  Set wait time interface (nanoseconds):
  /sys/devices/system/cpu/cpuX/pw20_wait_time
  /sys/devices/system/cpu/cpuX/altivec_idle_wait_time
  Example, based on a TBfreq of 41 MHz:
  1~48(ns): TB[63]
  49~97(ns): TB[62]
  98~195(ns): TB[61]
  196~390(ns): TB[60]
  391~780(ns): TB[59]
  781~1560(ns): TB[58] ...
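
The table above can be reproduced with a short calculation. The sketch below is
not the patch's get_idle_ticks_bit() implementation. It is one hypothetical way
(wait_ns_to_tb_bit is an invented name) to map a wait time to a time base bit,
assuming the bit is 63 minus floor(log2) of the number of whole time base ticks
in the requested time.

```c
#include <stdint.h>

/*
 * Map a wait time in nanoseconds to a time base bit number.
 * Assumption: bit = 63 - floor(log2(ticks)), with ticks counted as whole
 * time base ticks and anything below 2 ticks mapping to TB[63].
 * Note: ns * tb_freq_hz can overflow 64 bits for very large wait times.
 */
static unsigned int wait_ns_to_tb_bit(uint64_t ns, uint64_t tb_freq_hz)
{
	uint64_t ticks = ns * tb_freq_hz / 1000000000ULL;
	unsigned int shift = 0;

	/* floor(log2(ticks)); stays 0 when ticks is 0 or 1 */
	while (ticks > 1) {
		ticks >>= 1;
		shift++;
	}
	return 63 - shift;
}
```

With tb_freq_hz = 41000000 this gives TB[63] for 1 to 48 ns, TB[62] for 49 to
97 ns, and so on down to TB[58] for 781 to 1560 ns, matching the listed ranges.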
 
  Signed-off-by: Wang Dongsheng
  dongsheng.w...@freescale.com
  ---
  *v5:
  Change get_idle_ticks_bit function implementation.
 
  *v4:
  Move code from 85xx/common.c to kernel/sysfs.c.
 
  Remove has_pw20_altivec_idle function.
 
  Change wait entry_bit to wait time.

[PATCH 1/5 RFC] pci:msi: add weak function for returning msi region info

2013-10-29 Thread Bharat Bhushan

For an aperture-type IOMMU (like FSL PAMU), the VFIO IOMMU layer needs to know
the MSI region so that it can map a window for it in hardware. This patch only
defines the required weak functions; follow-up patches will use them.
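
As a rough userspace illustration of the weak-function pattern, the sketch
below uses the GCC/Clang weak attribute directly. The _sketch suffixed names
are hypothetical stand-ins, not the kernel symbols, and the behaviour shown
assumes a toolchain that supports weak symbols.

```c
#include <stddef.h>

/* Stand-in for struct msi_region; field types are simplified. */
struct msi_region_sketch {
	int region_num;             /* MSI region number */
	unsigned long long addr;    /* address of the MSI region */
	size_t size;                /* size of the MSI region */
};

/*
 * Weak defaults: used only when no other object file provides a strong
 * definition with the same name. An architecture would override them.
 */
__attribute__((weak)) int arch_msi_get_region_count_sketch(void)
{
	return 0;
}

__attribute__((weak)) int arch_msi_get_region_sketch(int region_num,
					struct msi_region_sketch *region)
{
	(void)region_num;
	(void)region;
	return 0;
}

/* Generic entry points simply forward to the arch hooks. */
int msi_get_region_count_sketch(void)
{
	return arch_msi_get_region_count_sketch();
}

int msi_get_region_sketch(int region_num, struct msi_region_sketch *region)
{
	return arch_msi_get_region_sketch(region_num, region);
}
```

With no strong override linked in, the count is 0, so a caller that loops over
the regions does nothing on architectures that have not implemented the hooks.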

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 drivers/pci/msi.c   |   22 ++
 include/linux/msi.h |   14 ++
 2 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index d5f90d6..2643a29 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -67,6 +67,28 @@ int __weak arch_msi_check_device(struct pci_dev *dev, int nvec, int type)
return chip->check_device(chip, dev, nvec, type);
 }
 
+int __weak arch_msi_get_region_count(void)
+{
+   return 0;
+}
+
+int __weak arch_msi_get_region(int region_num, struct msi_region *region)
+{
+   return 0;
+}
+
+int msi_get_region_count(void)
+{
+   return arch_msi_get_region_count();
+}
+EXPORT_SYMBOL(msi_get_region_count);
+
+int msi_get_region(int region_num, struct msi_region *region)
+{
+   return arch_msi_get_region(region_num, region);
+}
+EXPORT_SYMBOL(msi_get_region);
+
 int __weak arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
struct msi_desc *entry;
diff --git a/include/linux/msi.h b/include/linux/msi.h
index b17ead8..0deedb4 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -51,6 +51,18 @@ struct msi_desc {
 };
 
 /*
+ * This structure is used to get
+ * - physical address
+ * - size
+ * of a msi region
+ */
+struct msi_region {
+   int region_num; /* MSI region number */
+   dma_addr_t addr; /* Address of MSI region */
+   size_t size; /* Size of MSI region */
+};
+
+/*
  * The arch hooks to setup up msi irqs. Those functions are
  * implemented as weak symbols so that they /can/ be overriden by
  * architecture specific code if needed.
@@ -64,6 +76,8 @@ void arch_restore_msi_irqs(struct pci_dev *dev, int irq);
 
 void default_teardown_msi_irqs(struct pci_dev *dev);
 void default_restore_msi_irqs(struct pci_dev *dev, int irq);
+int arch_msi_get_region_count(void);
+int arch_msi_get_region(int region_num, struct msi_region *region);
 
 struct msi_chip {
struct module *owner;
-- 
1.7.0.4



[PATCH 4/5 RFC] pci: msi: expose msi region information functions

2013-10-29 Thread Bharat Bhushan

All the interfaces for getting the MSI region are now defined, so this patch
exposes them to the rest of the kernel. The vfio subsystem will use them to
set up the IOMMU for MSI interrupts of directly assigned devices.
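
A consumer of this interface would typically ask for the region count and then
fetch each region in turn. The sketch below shows that loop in plain userspace
C. The two functions are local stand-ins with the same names and signatures as
in the patch, backed by a made-up table of two banks. The addresses, the
for_each_msi_region() helper and the sum_region_size() callback are
hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct msi_region {
	int region_num;   /* MSI region number */
	uint64_t addr;    /* address of the MSI region */
	size_t size;      /* size of the MSI region */
};

/* Made-up backend: two banks with invented addresses. */
static const struct msi_region fake_banks[] = {
	{ 0, 0xffe41000ULL, 4096 },
	{ 1, 0xffe42000ULL, 4096 },
};

static int msi_get_region_count(void)
{
	return (int)(sizeof(fake_banks) / sizeof(fake_banks[0]));
}

static int msi_get_region(int region_num, struct msi_region *region)
{
	if (region_num < 0 || region_num >= msi_get_region_count())
		return -ENODEV;
	*region = fake_banks[region_num];
	return 0;
}

/* Visit every region; returns the number visited or a negative error. */
static int for_each_msi_region(int (*fn)(const struct msi_region *, void *),
			       void *ctx)
{
	int i, ret, count = msi_get_region_count();

	for (i = 0; i < count; i++) {
		struct msi_region region;

		ret = msi_get_region(i, &region);
		if (ret)
			return ret;
		ret = fn(&region, ctx);
		if (ret)
			return ret;
	}
	return count;
}

/* Example callback: add up the sizes of all regions. */
static int sum_region_size(const struct msi_region *region, void *ctx)
{
	*(size_t *)ctx += region->size;
	return 0;
}
```

When the count is 0, as with the stubs for kernels built without
CONFIG_PCI_MSI, the loop body never runs and the caller maps nothing.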

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 include/linux/pci.h |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index da172f9..c587034 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1142,6 +1142,7 @@ struct msix_entry {
u16 entry;  /* driver uses to specify entry, OS writes */
 };
 
+struct msi_region;
 
 #ifndef CONFIG_PCI_MSI
 static inline int pci_enable_msi_block(struct pci_dev *dev, unsigned int nvec)
@@ -1184,6 +1185,16 @@ static inline int pci_msi_enabled(void)
 {
return 0;
 }
+
+static inline int msi_get_region_count(void)
+{
+   return 0;
+}
+
+static inline int msi_get_region(int region_num, struct msi_region *region)
+{
+   return 0;
+}
 #else
 int pci_enable_msi_block(struct pci_dev *dev, unsigned int nvec);
 int pci_enable_msi_block_auto(struct pci_dev *dev, unsigned int *maxvec);
@@ -1196,6 +1207,8 @@ void pci_disable_msix(struct pci_dev *dev);
 void msi_remove_pci_irq_vectors(struct pci_dev *dev);
 void pci_restore_msi_state(struct pci_dev *dev);
 int pci_msi_enabled(void);
+int msi_get_region_count(void);
+int msi_get_region(int region_num, struct msi_region *region);
 #endif
 
 #ifdef CONFIG_PCIEPORTBUS
-- 
1.7.0.4



[PATCH 2/5 RFC] powerpc: pci: Add arch specific msi region interface

2013-10-29 Thread Bharat Bhushan

This patch adds the interface to get the MSI region information from
arch-specific code. The machine-specific code is not yet defined.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/include/asm/machdep.h |8 
 arch/powerpc/kernel/msi.c  |   18 ++
 2 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 8b48090..8d1b787 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -30,6 +30,7 @@ struct file;
 struct pci_controller;
 struct kimage;
 struct pci_host_bridge;
+struct msi_region;
 
 struct machdep_calls {
char*name;
@@ -124,6 +125,13 @@ struct machdep_calls {
int (*setup_msi_irqs)(struct pci_dev *dev,
  int nvec, int type);
void(*teardown_msi_irqs)(struct pci_dev *dev);
+
+   /* Returns the number of MSI regions (banks) */
+   int (*msi_get_region_count)(void);
+
+   /* Returns the requested region's address and size */
+   int (*msi_get_region)(int region_num,
+ struct msi_region *region);
 #endif
 
void(*restart)(char *cmd);
diff --git a/arch/powerpc/kernel/msi.c b/arch/powerpc/kernel/msi.c
index 8bbc12d..1a67787 100644
--- a/arch/powerpc/kernel/msi.c
+++ b/arch/powerpc/kernel/msi.c
@@ -13,6 +13,24 @@
 
 #include <asm/machdep.h>
 
+int arch_msi_get_region_count(void)
+{
+   if (ppc_md.msi_get_region_count) {
+   pr_debug("msi: Using platform get_region_count routine.\n");
+   return ppc_md.msi_get_region_count();
+   }
+   return 0;
+}
+
+int arch_msi_get_region(int region_num, struct msi_region *region)
+{
+   if (ppc_md.msi_get_region) {
+   pr_debug("msi: Using platform get_region routine.\n");
+   return ppc_md.msi_get_region(region_num, region);
+   }
+   return 0;
+}
+
 int arch_msi_check_device(struct pci_dev* dev, int nvec, int type)
 {
if (!ppc_md.setup_msi_irqs || !ppc_md.teardown_msi_irqs) {
-- 
1.7.0.4



[PATCH 3/5 RFC] powerpc: msi: Extend the msi region interface to get info from fsl_msi

2013-10-29 Thread Bharat Bhushan

The FSL MSI driver provides an interface to get:
  - the number of MSI regions (which is the number of MSI banks on powerpc)
  - the address range of a region: the physical page that contains the
    address(es) used for generating MSI interrupts, and the size of
    that page.

These are required to create the IOMMU (Freescale PAMU) mapping for
devices that are directly assigned using VFIO.
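
The region address is the page that contains the MSIIR register, so it can be
derived by masking off the low bits of the register's physical address. A
minimal sketch, assuming 4 KiB pages; SKETCH_PAGE_SIZE, msi_region_sketch,
msiir_to_region() and msiir_page_offset() are hypothetical names, and the
register address used in the example is made up:

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL   /* assumption: 4 KiB pages */

/* Stand-in for struct msi_region with fixed-width fields. */
struct msi_region_sketch {
	int region_num;
	uint64_t addr;
	uint64_t size;
};

/* Describe the page that holds the MSIIR register of one bank. */
static void msiir_to_region(int bank, uint64_t msiir,
			    struct msi_region_sketch *region)
{
	region->region_num = bank;
	region->size = SKETCH_PAGE_SIZE;
	/* round the register address down to its page boundary */
	region->addr = msiir & ~(region->size - 1);
}

/* Offset of the register inside that page. */
static uint64_t msiir_page_offset(uint64_t msiir)
{
	return msiir & (SKETCH_PAGE_SIZE - 1);
}
```

For a register at 0xffe41140 the region is the page at 0xffe41000 and the
register sits at offset 0x140 within it. An IOMMU mapping of that one page is
enough for a device to reach the register.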

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/sysdev/fsl_msi.c |   42 +++-
 arch/powerpc/sysdev/fsl_msi.h |   11 -
 2 files changed, 45 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/sysdev/fsl_msi.c b/arch/powerpc/sysdev/fsl_msi.c
index 77efbae..eeebbf0 100644
--- a/arch/powerpc/sysdev/fsl_msi.c
+++ b/arch/powerpc/sysdev/fsl_msi.c
@@ -109,6 +109,34 @@ static int fsl_msi_init_allocator(struct fsl_msi *msi_data)
return 0;
 }
 
+static int fsl_msi_get_region_count(void)
+{
+   int count = 0;
+   struct fsl_msi *msi_data;
+
+   list_for_each_entry(msi_data, &msi_head, list)
+   count++;
+
+   return count;
+}
+
+static int fsl_msi_get_region(int region_num, struct msi_region *region)
+{
+   struct fsl_msi *msi_data;
+
+   list_for_each_entry(msi_data, &msi_head, list) {
+   if (msi_data->bank_index == region_num) {
+   region->region_num = msi_data->bank_index;
+   /* Setting PAGE_SIZE as MSIIR is a 4 byte register */
+   region->size = PAGE_SIZE;
+   region->addr = msi_data->msiir & ~(region->size - 1);
+   return 0;
+   }
+   }
+
+   return -ENODEV;
+}
+
 static int fsl_msi_check_device(struct pci_dev *pdev, int nvec, int type)
 {
if (type == PCI_CAP_ID_MSIX)
@@ -150,7 +178,8 @@ static void fsl_compose_msi_msg(struct pci_dev *pdev, int hwirq,
if (reg && (len == sizeof(u64)))
address = be64_to_cpup(reg);
else
-   address = fsl_pci_immrbar_base(hose) + msi_data->msiir_offset;
+   address = fsl_pci_immrbar_base(hose) +
+  (msi_data->msiir & 0xf);
 
msg->address_lo = lower_32_bits(address);
msg->address_hi = upper_32_bits(address);
@@ -393,6 +422,7 @@ static int fsl_of_msi_probe(struct platform_device *dev)
const struct fsl_msi_feature *features;
int len;
u32 offset;
+   static atomic_t bank_index = ATOMIC_INIT(-1);
 
match = of_match_device(fsl_of_msi_ids, &dev->dev);
if (!match)
@@ -436,18 +466,15 @@ static int fsl_of_msi_probe(struct platform_device *dev)
dev->dev.of_node->full_name);
goto error_out;
}
-   msi->msiir_offset =
-   features->msiir_offset + (res.start & 0xf);
 
/*
 * First read the MSIIR/MSIIR1 offset from dts
 * On failure use the hardcode MSIIR offset
 */
if (of_address_to_resource(dev->dev.of_node, 1, &msiir))
-   msi->msiir_offset = features->msiir_offset +
-   (res.start & MSIIR_OFFSET_MASK);
+   msi->msiir = res.start + features->msiir_offset;
else
-   msi->msiir_offset = msiir.start & MSIIR_OFFSET_MASK;
+   msi->msiir = msiir.start;
}
 
msi->feature = features->fsl_pic_ip;
@@ -521,6 +548,7 @@ static int fsl_of_msi_probe(struct platform_device *dev)
}
}
 
+   msi->bank_index = atomic_inc_return(&bank_index);
list_add_tail(&msi->list, &msi_head);
 
/* The multiple setting ppc_md.setup_msi_irqs will not harm things */
@@ -528,6 +556,8 @@ static int fsl_of_msi_probe(struct platform_device *dev)
ppc_md.setup_msi_irqs = fsl_setup_msi_irqs;
ppc_md.teardown_msi_irqs = fsl_teardown_msi_irqs;
ppc_md.msi_check_device = fsl_msi_check_device;
+   ppc_md.msi_get_region_count = fsl_msi_get_region_count;
+   ppc_md.msi_get_region = fsl_msi_get_region;
} else if (ppc_md.setup_msi_irqs != fsl_setup_msi_irqs) {
dev_err(&dev->dev, "Different MSI driver already installed!\n");
err = -ENODEV;
diff --git a/arch/powerpc/sysdev/fsl_msi.h b/arch/powerpc/sysdev/fsl_msi.h
index df9aa9f..a2cc5a2 100644
--- a/arch/powerpc/sysdev/fsl_msi.h
+++ b/arch/powerpc/sysdev/fsl_msi.h
@@ -31,14 +31,21 @@ struct fsl_msi {
struct irq_domain *irqhost;
 
unsigned long cascade_irq;
-
-   u32 msiir_offset; /* Offset of MSIIR, relative to start of CCSR */
+   phys_addr_t msiir; /* MSIIR Address in CCSR */
u32 ibs_shift; /* Shift of interrupt bit select */
u32 srs_shift; /* Shift of the shared interrupt register select */
void __iomem *msi_regs;
u32 feature

[PATCH 5/5 RFC] vfio: setup iova-base for msi interrupts for vfio assigned device

2013-10-29 Thread Bharat Bhushan

PAMU (the FSL IOMMU) has a concept of a primary window and subwindows.
The primary window corresponds to the complete guest iova address space
(including the MSI space); in IOMMU API terms this is the geometry.
The iova base of each subwindow is determined by the number of
subwindows (configurable using the IOMMU API).
The MSI I/O page must lie within the geometry and within the maximum
number of supported subwindows, so the MSI I/O page is set up just
after the guest memory iova space.

This patch sets up the MSI iova base for vfio-assigned devices in the
MSI subsystem, so that the configured iova is used when the MSI message
is composed.

With this design, vfio makes the msi_set_iova() MSI API call to set up
the iova for a device. The MSI code keeps track of the iova base of
every device under an MSI bank. When the MSI address and data are
composed, this list is traversed: if the device is found in the list,
it is in use by vfio and its iova base is taken from the list;
otherwise the iova base is taken as before.
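
The bookkeeping described above can be sketched in userspace C as a per-bank
list with a fallback. This is only an illustration of the design, not the
patch's code: msi_bank, msi_dev_iova and the bank_*() helpers are hypothetical
names, devices are identified by an opaque pointer, and the locking that the
kernel version needs is left out.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One vfio-assigned device and the iova configured for it. */
struct msi_dev_iova {
	const void *dev;
	uint64_t iova;
	struct msi_dev_iova *next;
};

/* One MSI bank: its normal MSI address plus the per-device overrides. */
struct msi_bank {
	uint64_t default_addr;
	struct msi_dev_iova *devices;
};

/* Set (or update) the iova used for a device under this bank. */
static int bank_set_iova(struct msi_bank *bank, const void *dev, uint64_t iova)
{
	struct msi_dev_iova *d;

	for (d = bank->devices; d; d = d->next) {
		if (d->dev == dev) {
			d->iova = iova;
			return 0;
		}
	}

	d = malloc(sizeof(*d));
	if (!d)
		return -ENOMEM;
	d->dev = dev;
	d->iova = iova;
	d->next = bank->devices;
	bank->devices = d;
	return 0;
}

/* Remove the override for a device, if there is one. */
static void bank_clear_iova(struct msi_bank *bank, const void *dev)
{
	struct msi_dev_iova **pp = &bank->devices;

	while (*pp) {
		if ((*pp)->dev == dev) {
			struct msi_dev_iova *gone = *pp;

			*pp = gone->next;
			free(gone);
			return;
		}
		pp = &(*pp)->next;
	}
}

/* Address to use when composing the MSI message for a device. */
static uint64_t bank_msi_address(const struct msi_bank *bank, const void *dev)
{
	const struct msi_dev_iova *d;

	for (d = bank->devices; d; d = d->next)
		if (d->dev == dev)
			return d->iova;
	return bank->default_addr;   /* not assigned via vfio: as before */
}
```

A device that was never registered, or whose entry has been cleared, falls
back to the bank's default address, which mirrors the "as before" case.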

This is a draft patch to describe the interface for setting up the iova in
MSI (what Alex Williamson proposed earlier on a related patchset).
For now I have bundled all the changes in one patch to get initial
review comments on the design. I will split it into multiple logical
patches once the design is accepted.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/include/asm/machdep.h |2 +
 arch/powerpc/kernel/msi.c  |   10 ++
 arch/powerpc/sysdev/fsl_msi.c  |   64 
 arch/powerpc/sysdev/fsl_msi.h  |   10 -
 drivers/pci/msi.c  |   12 +++
 include/linux/pci.h|8 
 6 files changed, 104 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 8d1b787..e87b806 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -132,6 +132,8 @@ struct machdep_calls {
/* Returns the requested region's address and size */
int (*msi_get_region)(int region_num,
  struct msi_region *region);
+   int (*msi_set_iova)(struct pci_dev *pdev, int region_num,
+   dma_addr_t iova, bool set);
 #endif
 
void(*restart)(char *cmd);
diff --git a/arch/powerpc/kernel/msi.c b/arch/powerpc/kernel/msi.c
index 1a67787..e2bd555 100644
--- a/arch/powerpc/kernel/msi.c
+++ b/arch/powerpc/kernel/msi.c
@@ -13,6 +13,16 @@
 
 #include <asm/machdep.h>
 
+int arch_msi_set_iova(struct pci_dev *pdev, int region_num,
+ dma_addr_t iova, bool set)
+{
+   if (ppc_md.msi_set_iova) {
+   pr_debug("msi: Using platform get_region_count routine.\n");
+   return ppc_md.msi_set_iova(pdev, region_num, iova, set);
+   }
+   return 0;
+}
+
 int arch_msi_get_region_count(void)
 {
if (ppc_md.msi_get_region_count) {
diff --git a/arch/powerpc/sysdev/fsl_msi.c b/arch/powerpc/sysdev/fsl_msi.c
index eeebbf0..ad22d74 100644
--- a/arch/powerpc/sysdev/fsl_msi.c
+++ b/arch/powerpc/sysdev/fsl_msi.c
@@ -137,6 +137,46 @@ static int fsl_msi_get_region(int region_num, struct 
msi_region *region)
return -ENODEV;
 }
 
+static int fsl_msi_set_iova(struct pci_dev *pdev, int region_num,
+   dma_addr_t iova, bool set)
+{
+   struct fsl_msi *msi_data;
+   struct fsl_msi_device *device;
+
+   list_for_each_entry(msi_data, &msi_head, list) {
+   if (msi_data->bank_index != region_num)
+   continue;
+   mutex_lock(&msi_data->lock);
+   if (set) {
+   list_for_each_entry(device, &msi_data->device_list, list) {
+   if (device->dev == pdev) {
+   device->iova = iova;
+   mutex_unlock(&msi_data->lock);
+   return 0;
+   }
+   }
+
+   device = kzalloc(sizeof(struct fsl_msi_device), GFP_KERNEL);
+   device->dev = pdev;
+   device->iova = iova;
+   list_add_tail(&device->list, &msi_data->device_list);
+   } else {
+   list_for_each_entry(device, &msi_data->device_list, list) {
+   if (device->dev == pdev) {
+   list_del(&device->list);
+   kfree(device);
+   mutex_unlock(&msi_data->lock);
+   return 0;
+   }
+   }
+   }
+
+   mutex_unlock(&msi_data->lock);
+   break;
+   }
+   return 0;
+}
+
 static int fsl_msi_check_device(struct pci_dev *pdev, int nvec, int type
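
The per-bank bookkeeping described in this RFC (remember an iova per assigned device, fall back to the old address otherwise) can be sketched in plain userspace C. This is only an illustration under assumptions: the names msi_bank_model, iova_entry, bank_set_iova and bank_msi_address are hypothetical, a singly linked list stands in for the kernel list_head, and locking is left out.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace model of one MSI bank; not the kernel structures. */
struct iova_entry {
    const void *dev;          /* stands in for struct pci_dev * */
    uint64_t iova;            /* iova-base configured for this device */
    struct iova_entry *next;
};

struct msi_bank_model {
    struct iova_entry *devices;   /* devices that had an iova set */
    uint64_t default_addr;        /* address used when no iova is set */
};

/* set == true: add or update the iova of dev; set == false: forget dev. */
int bank_set_iova(struct msi_bank_model *bank, const void *dev,
                  uint64_t iova, bool set)
{
    struct iova_entry **pp;
    struct iova_entry *entry;

    for (pp = &bank->devices; *pp; pp = &(*pp)->next) {
        if ((*pp)->dev != dev)
            continue;
        if (set) {
            (*pp)->iova = iova;
        } else {
            entry = *pp;
            *pp = entry->next;
            free(entry);
        }
        return 0;
    }

    if (!set)
        return 0;             /* nothing to remove */

    entry = calloc(1, sizeof(*entry));
    if (!entry)
        return -1;            /* allocation failure is reported here */
    entry->dev = dev;
    entry->iova = iova;
    *pp = entry;              /* pp points at the tail's next pointer */
    return 0;
}

/* Address to use when composing an MSI message for dev. */
uint64_t bank_msi_address(const struct msi_bank_model *bank, const void *dev)
{
    const struct iova_entry *entry;

    for (entry = bank->devices; entry; entry = entry->next)
        if (entry->dev == dev)
            return entry->iova;
    return bank->default_addr;
}
```

A device that was never passed to bank_set_iova() keeps getting default_addr, which mirrors the "otherwise iova-base will be taken as before" behaviour in the description.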

[PATCH 0/5 RFC] vfio/pci: add interface to for MSI support with FSL PAMU

2013-10-29 Thread Bharat Bhushan

From: Bharat Bhushan bharat.bhus...@freescale.com

PAMU (FSL IOMMU) has a concept of primary window and subwindows.
Primary window corresponds to the complete guest iova address space
(including MSI space), with respect to IOMMU_API this is termed as
geometry. IOVA Base of subwindow is determined from the number of
subwindows (configurable using iommu API).
MSI I/O page must be within the geometry and maximum supported
subwindows, so MSI IO-page is setup just after guest memory iova space.

So first four patches are for defining the interface to get:
  - Number of MSI regions (which is number of MSI banks for powerpc)
  - MSI-region address range: Physical page which have the
address/addresses used for generating MSI interrupt
and size of the page.

The last patch sets up the MSI iova-base for assigned vfio devices
in the msi subsystem, so that this configured iova is used when the
msi-message is composed. Earlier we were using the iommu interface
to get the configured iova, which was not correct, and
Alex Williamson suggested this type of interface.

Bharat Bhushan (5):
  pci:msi: add weak function for returning msi region info
  powerpc: pci: Add arch specific msi region interface
  powerpc: msi: Extend the msi region interface to get info from
fsl_msi
  pci: msi: expose msi region information functions
  vfio: setup iova-base for msi interrupts for vfio assigned device

 arch/powerpc/include/asm/machdep.h |   10 
 arch/powerpc/kernel/msi.c  |   28 ++
 arch/powerpc/sysdev/fsl_msi.c  |  106 ++--
 arch/powerpc/sysdev/fsl_msi.h  |   19 ++-
 drivers/pci/msi.c  |   34 
 include/linux/msi.h|   14 +
 include/linux/pci.h|   21 +++
 7 files changed, 223 insertions(+), 9 deletions(-)


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH 1/4] powerpc: Added __cmpdi2 for signed 64bit comparision

2013-10-08 Thread Bharat Bhushan

This was missing on powerpc and I am getting compilation error
drivers/vfio/pci/vfio_pci_rdwr.c:193: undefined reference to `__cmpdi2'
drivers/vfio/pci/vfio_pci_rdwr.c:193: undefined reference to `__cmpdi2'

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/kernel/misc_32.S   |   14 ++
 arch/powerpc/kernel/ppc_ksyms.c |2 ++
 2 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 777d999..7c0eec2 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -644,6 +644,20 @@ _GLOBAL(__lshrdi3)
blr
 
 /*
+ * 64-bit comparison: __cmpdi2(s64 a, s64 b)
+ * Returns 0 if a < b, 1 if a == b, 2 if a > b.
+ */
+_GLOBAL(__cmpdi2)
+   cmpw    r3,r5
+   li  r3,1
+   bne 1f
+   cmplw   r4,r6
+   beqlr
+1: li  r3,0
+   bltlr
+   li  r3,2
+   blr
+/*
  * 64-bit comparison: __ucmpdi2(u64 a, u64 b)
  * Returns 0 if a < b, 1 if a == b, 2 if a > b.
  */
diff --git a/arch/powerpc/kernel/ppc_ksyms.c b/arch/powerpc/kernel/ppc_ksyms.c
index 21646db..5674c00 100644
--- a/arch/powerpc/kernel/ppc_ksyms.c
+++ b/arch/powerpc/kernel/ppc_ksyms.c
@@ -143,6 +143,8 @@ EXPORT_SYMBOL(__ashldi3);
 EXPORT_SYMBOL(__lshrdi3);
 int __ucmpdi2(unsigned long long, unsigned long long);
 EXPORT_SYMBOL(__ucmpdi2);
+int __cmpdi2(long long, long long);
+EXPORT_SYMBOL(__cmpdi2);
 #endif
 long long __bswapdi2(long long);
 EXPORT_SYMBOL(__bswapdi2);
-- 
1.7.0.4
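
For readers who do not read PowerPC assembly, the comparison above can be modelled in portable C. This is a sketch, not the kernel routine: the name cmpdi2_model is made up here, and the code assumes two's complement integers. It compares the high words as signed values first and only then the low words as unsigned values, which is the same order the assembly uses (cmpw on r3/r5, then cmplw on r4/r6).

```c
#include <stdint.h>

/* Returns 0 if a < b, 1 if a == b, 2 if a > b (same contract as __cmpdi2). */
int cmpdi2_model(int64_t a, int64_t b)
{
    /* High halves carry the sign, so they are compared as signed values. */
    int32_t a_hi = (int32_t)(uint32_t)((uint64_t)a >> 32);
    int32_t b_hi = (int32_t)(uint32_t)((uint64_t)b >> 32);
    /* Low halves are plain magnitude bits, compared as unsigned values. */
    uint32_t a_lo = (uint32_t)a;
    uint32_t b_lo = (uint32_t)b;

    if (a_hi != b_hi)
        return a_hi < b_hi ? 0 : 2;
    if (a_lo != b_lo)
        return a_lo < b_lo ? 0 : 2;
    return 1;
}
```

The unsigned variant __ucmpdi2 differs only in comparing the high words as unsigned values too.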



[PATCH] powerpc: Added __cmpdi2 for signed 64bit comparision

2013-10-08 Thread Bharat Bhushan

This was missing on powerpc and I am getting compilation error
drivers/vfio/pci/vfio_pci_rdwr.c:193: undefined reference to `__cmpdi2'
drivers/vfio/pci/vfio_pci_rdwr.c:193: undefined reference to `__cmpdi2'

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/kernel/misc_32.S   |   14 ++
 arch/powerpc/kernel/ppc_ksyms.c |2 ++
 2 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 777d999..7c0eec2 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -644,6 +644,20 @@ _GLOBAL(__lshrdi3)
blr
 
 /*
+ * 64-bit comparison: __cmpdi2(s64 a, s64 b)
+ * Returns 0 if a < b, 1 if a == b, 2 if a > b.
+ */
+_GLOBAL(__cmpdi2)
+   cmpw    r3,r5
+   li  r3,1
+   bne 1f
+   cmplw   r4,r6
+   beqlr
+1: li  r3,0
+   bltlr
+   li  r3,2
+   blr
+/*
  * 64-bit comparison: __ucmpdi2(u64 a, u64 b)
  * Returns 0 if a < b, 1 if a == b, 2 if a > b.
  */
diff --git a/arch/powerpc/kernel/ppc_ksyms.c b/arch/powerpc/kernel/ppc_ksyms.c
index 21646db..5674c00 100644
--- a/arch/powerpc/kernel/ppc_ksyms.c
+++ b/arch/powerpc/kernel/ppc_ksyms.c
@@ -143,6 +143,8 @@ EXPORT_SYMBOL(__ashldi3);
 EXPORT_SYMBOL(__lshrdi3);
 int __ucmpdi2(unsigned long long, unsigned long long);
 EXPORT_SYMBOL(__ucmpdi2);
+int __cmpdi2(long long, long long);
+EXPORT_SYMBOL(__cmpdi2);
 #endif
 long long __bswapdi2(long long);
 EXPORT_SYMBOL(__bswapdi2);
-- 
1.7.0.4



[PATCH 5/6 v7] kvm: booke: clear host tlb reference flag on guest tlb invalidation

2013-09-22 Thread Bharat Bhushan

On booke, struct tlbe_ref holds the host tlb mapping information
(pfn: the guest-pfn to pfn translation, flags: attributes associated with
this mapping) for a guest tlb entry. When a guest creates a TLB entry,
struct tlbe_ref is set to point to a valid pfn and the attributes are set in
the flags field of that structure. When a guest TLB entry is
invalidated, the flags field of the corresponding struct tlbe_ref is
updated to show that it is no longer valid, and we also selectively clear
some other attribute bits, for example: if E500_TLB_BITMAP was set then we
clear E500_TLB_BITMAP, and if E500_TLB_TLB0 is set then we clear that.

Ideally we should clear the complete flags field, as the entry is invalid and
has nothing left to be reused. The other part of the problem is that when we
use the same entry again we do not clear the flags either (we only OR bits in).

So far this worked because the selective clearing mentioned above
happens to clear exactly the flags that were set during TLB mapping. The
problem starts when we add more attributes, because each of them would then
also have to be cleared selectively, which should not be needed.

This patch does both:
- Clear flags when invalidating;
- Clear flags when reusing the same entry later

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v6->v7
 - Comment re-phrased

v5->v6
 - Reordered the flag clearing steps as per comment on v5

v4->v5
 - New change

 arch/powerpc/kvm/e500_mmu_host.c |   16 
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..7a41a93 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -230,15 +230,15 @@ void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 
*vcpu_e500, int tlbsel,
ref->flags &= ~(E500_TLB_TLB0 | E500_TLB_VALID);
}
 
-   /* Already invalidated in between */
-   if (!(ref->flags & E500_TLB_VALID))
-   return;
-
-   /* Guest tlbe is backed by at most one host tlbe per shadow pid. */
-   kvmppc_e500_tlbil_one(vcpu_e500, gtlbe);
+   /*
+* If TLB entry is still valid then it's a TLB0 entry, and thus
+* backed by at most one host tlbe per shadow pid
+*/
+   if (ref->flags & E500_TLB_VALID)
+   kvmppc_e500_tlbil_one(vcpu_e500, gtlbe);
 
/* Mark the TLB as not backed by the host anymore */
-   ref->flags &= ~E500_TLB_VALID;
+   ref->flags = 0;
 }
 
 static inline int tlbe_is_writable(struct kvm_book3e_206_tlb_entry *tlbe)
@@ -251,7 +251,7 @@ static inline void kvmppc_e500_ref_setup(struct tlbe_ref 
*ref,
 pfn_t pfn)
 {
ref->pfn = pfn;
-   ref->flags |= E500_TLB_VALID;
+   ref->flags = E500_TLB_VALID;
 
if (tlbe_is_writable(gtlbe))
kvm_set_pfn_dirty(pfn);
-- 
1.7.0.4
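
To see why selective clearing stops being safe once more attribute bits live in the flags field, here is a small userspace sketch. Everything in it is illustrative: ref_model and the REF_* names are hypothetical, and REF_EXTRA stands for any attribute bit added later that the old invalidation path does not know about.

```c
#include <stdint.h>

#define REF_VALID  (1u << 0)
#define REF_BITMAP (1u << 1)
#define REF_TLB0   (1u << 2)
#define REF_EXTRA  (1u << 3)   /* hypothetical attribute added later */

/* Hypothetical stand-in for struct tlbe_ref. */
struct ref_model {
    uint64_t pfn;
    uint32_t flags;
};

/* Old behaviour: clear only the bits this path knows about. */
void ref_invalidate_selective(struct ref_model *ref)
{
    ref->flags &= ~(REF_TLB0 | REF_VALID);
}

/* New behaviour: the entry is invalid, so nothing is worth keeping. */
void ref_invalidate_full(struct ref_model *ref)
{
    ref->flags = 0;
}

/* Old setup: OR-ing keeps whatever stale bits were left behind. */
void ref_setup_or(struct ref_model *ref, uint64_t pfn)
{
    ref->pfn = pfn;
    ref->flags |= REF_VALID;
}

/* New setup: assignment starts the entry from a clean state. */
void ref_setup_assign(struct ref_model *ref, uint64_t pfn)
{
    ref->pfn = pfn;
    ref->flags = REF_VALID;
}
```

With the old pair of functions, an entry that once had REF_EXTRA set still carries it after an invalidate and re-setup cycle; with the new pair it does not.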



[PATCH 0/6 v5] kvm: powerpc: use cache attributes from linux pte

2013-09-19 Thread Bharat Bhushan

From: Bharat Bhushan bharat.bhus...@freescale.com

The first patch is a typo fix: book3e defines _PAGE_LENDIAN where it
should define _PAGE_ENDIAN. This seems to show that it is never
exercised :-)

The second and third patches allow the guest to control the E (Endian) and
G (Guarded) TLB attributes respectively.

The fourth patch moves functions/logic into common code so that they can also
be used on booke.

The fifth patch clears the host tlb reference flags on guest tlb invalidation,
and the sixth patch sets the caching attributes (TLB.WIMGE) from the
corresponding Linux pte.

v3->v5
 - Fix tlb-reference-flag clearing issue (patch 5/6)
 - The last series of this patchset had a patch (4/6 powerpc: move linux
   pte/hugepte search to more generic file) which moved the pte/hugepte
   search functions to a generic file. That patch is no longer needed, as
   another patch that fixes this has already been applied :)

v2->v3
 - lookup_linux_pte() now only has the pte search logic and does not
   set any access flags in the pte. There is already a function for setting
   the access flags, which is called explicitly where needed.
   On booke we only need to search for the pte to get WIMG.

v1->v2
 - Earlier the caching attributes (WIMGE) were set based on whether the
   page is RAM or not. Now we get these attributes from the corresponding
   Linux PTE.

Bharat Bhushan (6):
  powerpc: book3e: _PAGE_LENDIAN must be _PAGE_ENDIAN
  kvm: powerpc: allow guest control E attribute in mas2
  kvm: powerpc: allow guest control G attribute in mas2
  kvm: powerpc: keep only pte search logic in lookup_linux_pte
  kvm: booke: clear host tlb reference flag on guest tlb invalidation
  kvm: powerpc: use caching attributes as per linux pte

 arch/powerpc/include/asm/kvm_host.h   |2 +-
 arch/powerpc/include/asm/pgtable.h|   24 
 arch/powerpc/include/asm/pte-book3e.h |2 +-
 arch/powerpc/kvm/book3s_hv_rm_mmu.c   |   36 
 arch/powerpc/kvm/booke.c  |2 +-
 arch/powerpc/kvm/e500.h   |   10 --
 arch/powerpc/kvm/e500_mmu_host.c  |   50 +++--
 7 files changed, 74 insertions(+), 52 deletions(-)



[PATCH 1/6 v5] powerpc: book3e: _PAGE_LENDIAN must be _PAGE_ENDIAN

2013-09-19 Thread Bharat Bhushan

For book3e, _PAGE_ENDIAN is not defined. In fact what is defined
is _PAGE_LENDIAN, which is wrong; it should be _PAGE_ENDIAN.
There are no compilation errors because
arch/powerpc/include/asm/pte-common.h defines _PAGE_ENDIAN to 0
when it is not defined anywhere else.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v1->v5
 - no change

 arch/powerpc/include/asm/pte-book3e.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/pte-book3e.h 
b/arch/powerpc/include/asm/pte-book3e.h
index 0156702..576ad88 100644
--- a/arch/powerpc/include/asm/pte-book3e.h
+++ b/arch/powerpc/include/asm/pte-book3e.h
@@ -40,7 +40,7 @@
 #define _PAGE_U1   0x01
 #define _PAGE_U0   0x02
 #define _PAGE_ACCESSED 0x04
-#define _PAGE_LENDIAN  0x08
+#define _PAGE_ENDIAN   0x08
 #define _PAGE_GUARDED  0x10
 #define _PAGE_COHERENT 0x20 /* M: enforce memory coherence */
 #define _PAGE_NO_CACHE 0x40 /* I: cache inhibit */
-- 
1.7.0.4



[PATCH 3/6 v5] kvm: powerpc: allow guest control G attribute in mas2

2013-09-19 Thread Bharat Bhushan

The G bit in MAS2 indicates whether the page is Guarded.
There is no reason to stop the guest from setting G, so allow it.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v1->v5
 - no change
 arch/powerpc/kvm/e500.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index 277cb18..4fd9650 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct 
kvm_vcpu *vcpu)
 #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW)
 #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW)
 #define MAS2_ATTRIB_MASK \
- (MAS2_X0 | MAS2_X1 | MAS2_E)
+ (MAS2_X0 | MAS2_X1 | MAS2_E | MAS2_G)
 #define MAS3_ATTRIB_MASK \
  (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \
   | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK)
-- 
1.7.0.4



[PATCH 2/6 v5] kvm: powerpc: allow guest control E attribute in mas2

2013-09-19 Thread Bharat Bhushan

The E bit in MAS2 indicates whether the page is accessed
in Little-Endian or Big-Endian byte order.
There is no reason to stop the guest from setting E, so allow it.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v1->v5
 - no change
 arch/powerpc/kvm/e500.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index c2e5e98..277cb18 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct 
kvm_vcpu *vcpu)
 #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW)
 #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW)
 #define MAS2_ATTRIB_MASK \
- (MAS2_X0 | MAS2_X1)
+ (MAS2_X0 | MAS2_X1 | MAS2_E)
 #define MAS3_ATTRIB_MASK \
  (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \
   | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK)
-- 
1.7.0.4



[PATCH 5/6 v5] kvm: booke: clear host tlb reference flag on guest tlb invalidation

2013-09-19 Thread Bharat Bhushan

On booke, struct tlbe_ref holds the host tlb mapping information
(pfn: the guest-pfn to pfn translation, flags: attributes associated with
this mapping) for a guest tlb entry. When a guest creates a TLB entry,
struct tlbe_ref is set to point to a valid pfn and the attributes are set in
the flags field of that structure. When a guest TLB entry is
invalidated, the flags field of the corresponding struct tlbe_ref is
updated to show that it is no longer valid, and we also selectively clear
some other attribute bits, for example: if E500_TLB_BITMAP was set then we
clear E500_TLB_BITMAP, and if E500_TLB_TLB0 is set then we clear that.

Ideally we should clear the complete flags field, as the entry is invalid and
has nothing left to be reused. The other part of the problem is that when we
use the same entry again we do not clear the flags either (we only OR bits in).

So far this worked because the selective clearing mentioned above
happens to clear exactly the flags that were set during TLB mapping. The
problem starts when we add more attributes, because each of them would then
also have to be cleared selectively, which should not be needed.

This patch does both:
- Clear flags when invalidating;
- Clear flags when reusing the same entry later

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v3->v5
 - New patch (found this issue when doing vfio-pci development)

 arch/powerpc/kvm/e500_mmu_host.c |   12 +++-
 1 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..60f5a3c 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -217,7 +217,8 @@ void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 
*vcpu_e500, int tlbsel,
}
mb();
vcpu_e500->g2h_tlb1_map[esel] = 0;
-   ref->flags &= ~(E500_TLB_BITMAP | E500_TLB_VALID);
+   /* Clear flags as TLB is not backed by the host anymore */
+   ref->flags = 0;
local_irq_restore(flags);
}
 
@@ -227,7 +228,8 @@ void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 
*vcpu_e500, int tlbsel,
 * rarely and is not worth optimizing. Invalidate everything.
 */
kvmppc_e500_tlbil_all(vcpu_e500);
-   ref->flags &= ~(E500_TLB_TLB0 | E500_TLB_VALID);
+   /* Clear flags as TLB is not backed by the host anymore */
+   ref->flags = 0;
}
 
/* Already invalidated in between */
@@ -237,8 +239,8 @@ void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 
*vcpu_e500, int tlbsel,
/* Guest tlbe is backed by at most one host tlbe per shadow pid. */
kvmppc_e500_tlbil_one(vcpu_e500, gtlbe);
 
-   /* Mark the TLB as not backed by the host anymore */
-   ref->flags &= ~E500_TLB_VALID;
+   /* Clear flags as TLB is not backed by the host anymore */
+   ref->flags = 0;
 }
 
 static inline int tlbe_is_writable(struct kvm_book3e_206_tlb_entry *tlbe)
@@ -251,7 +253,7 @@ static inline void kvmppc_e500_ref_setup(struct tlbe_ref 
*ref,
 pfn_t pfn)
 {
ref->pfn = pfn;
-   ref->flags |= E500_TLB_VALID;
+   ref->flags = E500_TLB_VALID;
 
if (tlbe_is_writable(gtlbe))
kvm_set_pfn_dirty(pfn);
-- 
1.7.0.4



[PATCH 4/6 v5] kvm: powerpc: keep only pte search logic in lookup_linux_pte

2013-09-19 Thread Bharat Bhushan

lookup_linux_pte() used to search for a pte and also set the access
flags if the page is writable. This function now only searches for the pte,
while access flag setting is done explicitly.

This pte lookup is not kvm specific, so it is moved to common code
(asm/pgtable.h). My follow-up patch will use this on booke.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v4->v5
 - No change

 arch/powerpc/include/asm/pgtable.h  |   24 +++
 arch/powerpc/kvm/book3s_hv_rm_mmu.c |   36 +++---
 2 files changed, 36 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/include/asm/pgtable.h 
b/arch/powerpc/include/asm/pgtable.h
index 7d6eacf..3a5de5c 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -223,6 +223,30 @@ extern int gup_hugepte(pte_t *ptep, unsigned long sz, 
unsigned long addr,
 #endif
 pte_t *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
 unsigned *shift);
+
+static inline pte_t *lookup_linux_pte(pgd_t *pgdir, unsigned long hva,
+unsigned long *pte_sizep)
+{
+   pte_t *ptep;
+   unsigned long ps = *pte_sizep;
+   unsigned int shift;
+
+   ptep = find_linux_pte_or_hugepte(pgdir, hva, &shift);
+   if (!ptep)
+   return __pte(0);
+   if (shift)
+   *pte_sizep = 1ul << shift;
+   else
+   *pte_sizep = PAGE_SIZE;
+
+   if (ps > *pte_sizep)
+   return __pte(0);
+
+   if (!pte_present(*ptep))
+   return __pte(0);
+
+   return ptep;
+}
 #endif /* __ASSEMBLY__ */
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c 
b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index 45e30d6..74fa7f8 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -134,25 +134,6 @@ static void remove_revmap_chain(struct kvm *kvm, long 
pte_index,
unlock_rmap(rmap);
 }
 
-static pte_t lookup_linux_pte(pgd_t *pgdir, unsigned long hva,
- int writing, unsigned long *pte_sizep)
-{
-   pte_t *ptep;
-   unsigned long ps = *pte_sizep;
-   unsigned int hugepage_shift;
-
-   ptep = find_linux_pte_or_hugepte(pgdir, hva, &hugepage_shift);
-   if (!ptep)
-   return __pte(0);
-   if (hugepage_shift)
-   *pte_sizep = 1ul << hugepage_shift;
-   else
-   *pte_sizep = PAGE_SIZE;
-   if (ps > *pte_sizep)
-   return __pte(0);
-   return kvmppc_read_update_linux_pte(ptep, writing, hugepage_shift);
-}
-
 static inline void unlock_hpte(unsigned long *hpte, unsigned long hpte_v)
 {
asm volatile(PPC_RELEASE_BARRIER "" : : : "memory");
@@ -173,6 +154,7 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
unsigned long is_io;
unsigned long *rmap;
pte_t pte;
+   pte_t *ptep;
unsigned int writing;
unsigned long mmu_seq;
unsigned long rcbits;
@@ -231,8 +213,9 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
 
/* Look up the Linux PTE for the backing page */
pte_size = psize;
-   pte = lookup_linux_pte(pgdir, hva, writing, &pte_size);
-   if (pte_present(pte)) {
+   ptep = lookup_linux_pte(pgdir, hva, &pte_size);
+   if (pte_present(pte_val(*ptep))) {
+   pte = kvmppc_read_update_linux_pte(ptep, writing);
if (writing && !pte_write(pte))
/* make the actual HPTE be read-only */
ptel = hpte_make_readonly(ptel);
@@ -661,15 +644,20 @@ long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned 
long flags,
struct kvm_memory_slot *memslot;
pgd_t *pgdir = vcpu->arch.pgdir;
pte_t pte;
+   pte_t *ptep;
 
psize = hpte_page_size(v, r);
gfn = ((r & HPTE_R_RPN) & ~(psize - 1)) >> PAGE_SHIFT;
memslot = __gfn_to_memslot(kvm_memslots(kvm), gfn);
if (memslot) {
hva = __gfn_to_hva_memslot(memslot, gfn);
-   pte = lookup_linux_pte(pgdir, hva, 1, &psize);
-   if (pte_present(pte) && !pte_write(pte))
-   r = hpte_make_readonly(r);
+   ptep = lookup_linux_pte(pgdir, hva, &psize);
+   if (pte_present(pte_val(*ptep))) {
+   pte = kvmppc_read_update_linux_pte(ptep,
+  1);
+   if (pte_present(pte) && !pte_write(pte))
+   r = hpte_make_readonly(r
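
The page-size check that lookup_linux_pte() keeps after this change can be shown on its own. The sketch below is a userspace model with made-up names (backing_page_fits, MODEL_PAGE_SIZE) and an assumed 4K base page: the caller passes the page size it wants in *pte_sizep, the function overwrites it with the size of the backing page, and the lookup is rejected when the backing page is smaller than what was asked for.

```c
#include <stdbool.h>

#define MODEL_PAGE_SIZE 4096ul   /* assumed base page size */

/*
 * shift is the huge page shift reported by the page table walk,
 * or 0 for a normal page. Returns true when the backing page is
 * at least as large as the requested size.
 */
bool backing_page_fits(unsigned int shift, unsigned long *pte_sizep)
{
    unsigned long requested = *pte_sizep;

    if (shift)
        *pte_sizep = 1ul << shift;
    else
        *pte_sizep = MODEL_PAGE_SIZE;

    return requested <= *pte_sizep;
}
```

For example, asking for a 16M mapping that is backed by a normal 4K page fails, while asking for 4K on top of a 16M huge page succeeds and reports the larger size back to the caller.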

[PATCH 6/6 v5] kvm: powerpc: use caching attributes as per linux pte

2013-09-19 Thread Bharat Bhushan

KVM uses the same WIM tlb attributes as the corresponding qemu pte.
For this we now search the linux pte for the requested page and
get the caching/coherency attributes from that pte.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v4->v5
 - No change

 arch/powerpc/include/asm/kvm_host.h |2 +-
 arch/powerpc/kvm/booke.c|2 +-
 arch/powerpc/kvm/e500.h |8 --
 arch/powerpc/kvm/e500_mmu_host.c|   38 --
 4 files changed, 29 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 9741bf0..775f0e8 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -538,6 +538,7 @@ struct kvm_vcpu_arch {
 #endif
gpa_t paddr_accessed;
gva_t vaddr_accessed;
+   pgd_t *pgdir;
 
u8 io_gpr; /* GPR used as IO source/target */
u8 mmio_is_bigendian;
@@ -595,7 +596,6 @@ struct kvm_vcpu_arch {
struct list_head run_list;
struct task_struct *run_task;
struct kvm_run *kvm_run;
-   pgd_t *pgdir;
 
spinlock_t vpa_update_lock;
struct kvmppc_vpa vpa;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 17722d8..4171c7d 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -695,7 +695,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
 
kvmppc_load_guest_fp(vcpu);
 #endif
-
+   vcpu->arch.pgdir = current->mm->pgd;
kvmppc_fix_ee_before_entry();
 
ret = __kvmppc_vcpu_run(kvm_run, vcpu);
diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index 4fd9650..a326178 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -31,11 +31,13 @@ enum vcpu_ftr {
 #define E500_TLB_NUM   2
 
 /* entry is mapped somewhere in host TLB */
-#define E500_TLB_VALID (1 << 0)
+#define E500_TLB_VALID (1 << 31)
 /* TLB1 entry is mapped by host TLB1, tracked by bitmaps */
-#define E500_TLB_BITMAP (1 << 1)
+#define E500_TLB_BITMAP (1 << 30)
 /* TLB1 entry is mapped by host TLB0 */
-#define E500_TLB_TLB0  (1 << 2)
+#define E500_TLB_TLB0  (1 << 29)
+/* bits [6-5] MAS2_X1 and MAS2_X0 and [4-0] bits for WIMGE */
+#define E500_TLB_MAS2_ATTR (0x7f)
 
 struct tlbe_ref {
pfn_t pfn;  /* valid only for TLB0, except briefly */
diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 60f5a3c..654c368 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,15 +64,6 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int 
usermode)
return mas3;
 }
 
-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
-{
-#ifdef CONFIG_SMP
-   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
-#else
-   return mas2 & MAS2_ATTRIB_MASK;
-#endif
-}
-
 /*
  * writing shadow tlb entry to host TLB
  */
@@ -250,10 +241,12 @@ static inline int tlbe_is_writable(struct 
kvm_book3e_206_tlb_entry *tlbe)
 
 static inline void kvmppc_e500_ref_setup(struct tlbe_ref *ref,
 struct kvm_book3e_206_tlb_entry *gtlbe,
-pfn_t pfn)
+pfn_t pfn, unsigned int wimg)
 {
ref->pfn = pfn;
ref->flags = E500_TLB_VALID;
+   /* Use guest supplied MAS2_G and MAS2_E */
+   ref->flags |= (gtlbe->mas2 & MAS2_ATTRIB_MASK) | wimg;
 
if (tlbe_is_writable(gtlbe))
kvm_set_pfn_dirty(pfn);
@@ -314,8 +307,7 @@ static void kvmppc_e500_setup_stlbe(
 
/* Force IPROT=0 for all guest mappings. */
stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
-   stlbe->mas2 = (gvaddr & MAS2_EPN) |
- e500_shadow_mas2_attrib(gtlbe->mas2, pr);
+   stlbe->mas2 = (gvaddr & MAS2_EPN) | (ref->flags & E500_TLB_MAS2_ATTR);
stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
@@ -334,6 +326,10 @@ static inline int kvmppc_e500_shadow_map(struct 
kvmppc_vcpu_e500 *vcpu_e500,
unsigned long hva;
int pfnmap = 0;
int tsize = BOOK3E_PAGESZ_4K;
+   unsigned long tsize_pages = 0;
+   pte_t *ptep;
+   unsigned int wimg = 0;
+   pgd_t *pgdir;
 
/*
 * Translate guest physical to true physical, acquiring
@@ -396,7 +392,7 @@ static inline int kvmppc_e500_shadow_map(struct 
kvmppc_vcpu_e500 *vcpu_e500,
 */
 
for (; tsize > BOOK3E_PAGESZ_4K; tsize -= 2) {
-   unsigned long gfn_start, gfn_end, tsize_pages;
+   unsigned long gfn_start, gfn_end;
tsize_pages = 1 << (tsize - 2);
 
gfn_start = gfn & ~(tsize_pages - 1);
@@ -438,9 +434,10 @@ static inline int
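
The flags layout introduced by this patch (state bits moved to the top of the word, the low seven bits reused for the MAS2 attributes) can be sketched as follows. This is an illustration only: the MODEL_* names are hypothetical, and the individual MAS2 bit values are assumptions derived from the comment in the patch (bits [6-5] for X0/X1, bits [4-0] for WIMGE), not copied from a header.

```c
#include <stdint.h>

/* Assumed MAS2 attribute bit values; check the real headers before relying on them. */
#define MODEL_MAS2_X0  0x40u
#define MODEL_MAS2_X1  0x20u
#define MODEL_MAS2_M   0x04u
#define MODEL_MAS2_G   0x02u
#define MODEL_MAS2_E   0x01u
#define MODEL_MAS2_EPN (~0xfffu)          /* effective page number bits */

#define MODEL_TLB_VALID     (1u << 31)     /* state bit kept out of the low bits */
#define MODEL_TLB_MAS2_ATTR 0x7fu          /* low 7 bits carry MAS2 attributes */

/* Attributes the guest is allowed to choose itself. */
#define MODEL_GUEST_ATTRIB_MASK \
    (MODEL_MAS2_X0 | MODEL_MAS2_X1 | MODEL_MAS2_E | MODEL_MAS2_G)

/* Build the reference flags from guest MAS2 and the WIMG bits of the Linux pte. */
uint32_t ref_flags_pack(uint32_t guest_mas2, uint32_t wimg)
{
    return MODEL_TLB_VALID | (guest_mas2 & MODEL_GUEST_ATTRIB_MASK) | wimg;
}

/* Build the shadow MAS2 value from the guest virtual address and the flags. */
uint32_t shadow_mas2(uint32_t gvaddr, uint32_t flags)
{
    return (gvaddr & MODEL_MAS2_EPN) | (flags & MODEL_TLB_MAS2_ATTR);
}
```

Because the state bits sit at bit 29 and above, masking with the 0x7f attribute mask never leaks them into the shadow MAS2 value.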

[PATCH 1/7] powerpc: Add interface to get msi region information

2013-09-19 Thread Bharat Bhushan

This patch adds an interface to get the following information:
  - Number of MSI regions (which is the number of MSI banks for powerpc).
  - The region address range: the physical page which has the
     address/addresses used for generating MSI interrupts,
     and the size of the page.

These are required to create IOMMU (Freescale PAMU) mapping for
devices which are directly assigned using VFIO.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/include/asm/machdep.h |8 +++
 arch/powerpc/include/asm/pci.h |2 +
 arch/powerpc/kernel/msi.c  |   18 
 arch/powerpc/sysdev/fsl_msi.c  |   39 +--
 arch/powerpc/sysdev/fsl_msi.h  |   11 -
 drivers/pci/msi.c  |   26 
 include/linux/msi.h|8 +++
 include/linux/pci.h|   13 
 8 files changed, 120 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h 
b/arch/powerpc/include/asm/machdep.h
index 8b48090..8d1b787 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -30,6 +30,7 @@ struct file;
 struct pci_controller;
 struct kimage;
 struct pci_host_bridge;
+struct msi_region;
 
 struct machdep_calls {
char*name;
@@ -124,6 +125,13 @@ struct machdep_calls {
int (*setup_msi_irqs)(struct pci_dev *dev,
  int nvec, int type);
void(*teardown_msi_irqs)(struct pci_dev *dev);
+
+   /* Returns the number of MSI regions (banks) */
+   int (*msi_get_region_count)(void);
+
+   /* Returns the requested region's address and size */
+   int (*msi_get_region)(int region_num,
+ struct msi_region *region);
 #endif
 
void(*restart)(char *cmd);
diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h
index 6653f27..e575349 100644
--- a/arch/powerpc/include/asm/pci.h
+++ b/arch/powerpc/include/asm/pci.h
@@ -117,6 +117,8 @@ extern int pci_proc_domain(struct pci_bus *bus);
 #define arch_setup_msi_irqs arch_setup_msi_irqs
 #define arch_teardown_msi_irqs arch_teardown_msi_irqs
 #define arch_msi_check_device arch_msi_check_device
+#define arch_msi_get_region_count arch_msi_get_region_count
+#define arch_msi_get_region arch_msi_get_region
 
 struct vm_area_struct;
 /* Map a range of PCI memory or I/O space for a device into user space */
diff --git a/arch/powerpc/kernel/msi.c b/arch/powerpc/kernel/msi.c
index 8bbc12d..1a67787 100644
--- a/arch/powerpc/kernel/msi.c
+++ b/arch/powerpc/kernel/msi.c
@@ -13,6 +13,24 @@
 
 #include <asm/machdep.h>
 
+int arch_msi_get_region_count(void)
+{
+   if (ppc_md.msi_get_region_count) {
+   pr_debug("msi: Using platform get_region_count routine.\n");
+   return ppc_md.msi_get_region_count();
+   }
+   return 0;
+}
+
+int arch_msi_get_region(int region_num, struct msi_region *region)
+{
+   if (ppc_md.msi_get_region) {
+   pr_debug("msi: Using platform get_region routine.\n");
+   return ppc_md.msi_get_region(region_num, region);
+   }
+   return 0;
+}
+
 int arch_msi_check_device(struct pci_dev* dev, int nvec, int type)
 {
if (!ppc_md.setup_msi_irqs || !ppc_md.teardown_msi_irqs) {
diff --git a/arch/powerpc/sysdev/fsl_msi.c b/arch/powerpc/sysdev/fsl_msi.c
index ab02db3..ed045cb 100644
--- a/arch/powerpc/sysdev/fsl_msi.c
+++ b/arch/powerpc/sysdev/fsl_msi.c
@@ -96,6 +96,34 @@ static int fsl_msi_init_allocator(struct fsl_msi *msi_data)
return 0;
 }
 
+static int fsl_msi_get_region_count(void)
+{
+   int count = 0;
+   struct fsl_msi *msi_data;
+
+   list_for_each_entry(msi_data, &msi_head, list)
+   count++;
+
+   return count;
+}
+
+static int fsl_msi_get_region(int region_num, struct msi_region *region)
+{
+   struct fsl_msi *msi_data;
+
+   list_for_each_entry(msi_data, &msi_head, list) {
+   if (msi_data->bank_index == region_num) {
+   region->region_num = msi_data->bank_index;
+   /* Setting PAGE_SIZE as MSIIR is a 4 byte register */
+   region->size = PAGE_SIZE;
+   region->addr = msi_data->msiir & ~(region->size - 1);
+   return 0;
+   }
+   }
+
+   return -ENODEV;
+}
+
 static int fsl_msi_check_device(struct pci_dev *pdev, int nvec, int type)
 {
if (type == PCI_CAP_ID_MSIX)
@@ -137,7 +165,8 @@ static void fsl_compose_msi_msg(struct pci_dev *pdev, int 
hwirq,
if (reg && (len == sizeof(u64)))
address = be64_to_cpup(reg);
else
-   address = fsl_pci_immrbar_base(hose) + msi_data->msiir_offset;
+   address = fsl_pci_immrbar_base(hose) +
+  (msi_data->msiir & 0xf);
 
msg
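
The region computation in fsl_msi_get_region() boils down to rounding the MSIIR register address down to its page. A minimal userspace sketch, assuming a 4K page and using hypothetical names (msi_region_model, msi_region_from_msiir), looks like this:

```c
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u   /* assumed page size */

/* Hypothetical stand-in for struct msi_region. */
struct msi_region_model {
    int region_num;
    uint64_t addr;   /* page-aligned physical address of the region */
    uint64_t size;   /* region size in bytes */
};

/* Describe the page that contains the MSIIR register of one bank. */
void msi_region_from_msiir(int bank_index, uint64_t msiir,
                           struct msi_region_model *region)
{
    region->region_num = bank_index;
    /* MSIIR is only a 4 byte register, but mappings are page sized. */
    region->size = MODEL_PAGE_SIZE;
    region->addr = msiir & ~(region->size - 1);
}
```

A whole page is reported because an IOMMU mapping cannot be smaller than a page, even though the register itself is only four bytes wide.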

[PATCH 3/7] fsl iommu: add get_dev_iommu_domain

2013-09-19 Thread Bharat Bhushan

From: Bharat Bhushan bharat.bhus...@freescale.com

Return the iommu_domain of the requested device for fsl pamu.

Use the PCI controller dev struct for pci devices, as the current LIODN
schema assigns the LIODN to the PCI controller, not to the PCI device. This
will be corrected with a proper LIODN schema.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 drivers/iommu/fsl_pamu_domain.c |   30 ++
 1 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/drivers/iommu/fsl_pamu_domain.c b/drivers/iommu/fsl_pamu_domain.c
index 14d803a..1d0dfe3 100644
--- a/drivers/iommu/fsl_pamu_domain.c
+++ b/drivers/iommu/fsl_pamu_domain.c
@@ -1140,6 +1140,35 @@ static u32 fsl_pamu_get_windows(struct iommu_domain *domain)
return dma_domain->win_cnt;
 }
 
+static struct iommu_domain *fsl_get_dev_domain(struct device *dev)
+{
+   struct pci_controller *pci_ctl;
+   struct device_domain_info *info;
+   struct pci_dev *pdev;
+
+   /*
+* Use PCI controller dev struct for pci devices as current
+* LIODN schema assign LIODN to PCI controller not PCI device
+* This should get corrected with proper LIODN schema.
+*/
+   if (dev->bus == &pci_bus_type) {
+   pdev = to_pci_dev(dev);
+   pci_ctl = pci_bus_to_host(pdev->bus);
+   /*
+* make dev point to pci controller device
+* so we can get the LIODN programmed by
+* u-boot.
+*/
+   dev = pci_ctl->parent;
+   }
+
+   info = dev->archdata.iommu_domain;
+   if (info && info->domain)
+   return info->domain->iommu_domain;
+
+   return NULL;
+}
+
 static struct iommu_ops fsl_pamu_ops = {
.domain_init= fsl_pamu_domain_init,
.domain_destroy = fsl_pamu_domain_destroy,
@@ -1155,6 +1184,7 @@ static struct iommu_ops fsl_pamu_ops = {
.domain_get_attr = fsl_pamu_get_domain_attr,
.add_device = fsl_pamu_add_device,
.remove_device  = fsl_pamu_remove_device,
+   .get_dev_iommu_domain = fsl_get_dev_domain,
 };
 
 int pamu_domain_init()
-- 
1.7.0.4


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[PATCH 2/7] iommu: add api to get iommu_domain of a device

2013-09-19 Thread Bharat Bhushan

This API returns the iommu domain to which the device is attached.
The iommu_domain is required for making IOMMU-related API calls.
Follow-up patches use this API to look up the IOMMU mapping.
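
For readers who want to see the dispatch pattern in isolation, here is a
minimal user-space sketch. It is not kernel code: the structures are trimmed
mock-ups, the "attached" field and mock_get_domain() are hypothetical, and
get_dev_domain() only mirrors the shape of the helper added by this patch
(an optional per-bus callback, with NULL meaning "no domain known").

```c
#include <assert.h>
#include <stddef.h>

/* Mock-ups for illustration only; the kernel structures have many more fields. */
struct iommu_domain { int id; };
struct device;

struct iommu_ops {
	/* Optional callback: NULL when the IOMMU driver does not implement it. */
	struct iommu_domain *(*get_dev_iommu_domain)(struct device *dev);
};

struct bus_type { struct iommu_ops *iommu_ops; };

struct device {
	struct bus_type *bus;
	struct iommu_domain *attached;	/* hypothetical stand-in for driver-private state */
};

/* Same shape as the helper in the patch: returns NULL when no callback exists. */
struct iommu_domain *get_dev_domain(struct device *dev)
{
	struct iommu_ops *ops;

	assert(dev != NULL && dev->bus != NULL);
	ops = dev->bus->iommu_ops;
	if (ops == NULL || ops->get_dev_iommu_domain == NULL)
		return NULL;

	return ops->get_dev_iommu_domain(dev);
}

/* Hypothetical driver callback: report whatever domain the device carries. */
struct iommu_domain *mock_get_domain(struct device *dev)
{
	return dev->attached;
}
```

A caller therefore has to treat NULL as "not attached or not supported" and
fall back to its non-IOMMU path.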

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 drivers/iommu/iommu.c |   10 ++
 include/linux/iommu.h |7 +++
 2 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index fbe9ca7..6ac5f50 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -696,6 +696,16 @@ void iommu_detach_device(struct iommu_domain *domain, struct device *dev)
 }
 EXPORT_SYMBOL_GPL(iommu_detach_device);
 
+struct iommu_domain *iommu_get_dev_domain(struct device *dev)
+{
+   struct iommu_ops *ops = dev->bus->iommu_ops;
+
+   if (unlikely(ops == NULL || ops->get_dev_iommu_domain == NULL))
+   return NULL;
+
+   return ops->get_dev_iommu_domain(dev);
+}
+EXPORT_SYMBOL_GPL(iommu_get_dev_domain);
 /*
  * IOMMU groups are really the natrual working unit of the IOMMU, but
  * the IOMMU API works on domains and devices.  Bridge that gap by
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 7ea319e..fa046bd 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -127,6 +127,7 @@ struct iommu_ops {
int (*domain_set_windows)(struct iommu_domain *domain, u32 w_count);
/* Get the numer of window per domain */
u32 (*domain_get_windows)(struct iommu_domain *domain);
+   struct iommu_domain *(*get_dev_iommu_domain)(struct device *dev);
 
unsigned long pgsize_bitmap;
 };
@@ -190,6 +191,7 @@ extern int iommu_domain_window_enable(struct iommu_domain *domain, u32 wnd_nr,
  phys_addr_t offset, u64 size,
  int prot);
 extern void iommu_domain_window_disable(struct iommu_domain *domain, u32 wnd_nr);
+extern struct iommu_domain *iommu_get_dev_domain(struct device *dev);
 /**
  * report_iommu_fault() - report about an IOMMU fault to the IOMMU framework
  * @domain: the iommu domain where the fault has happened
@@ -284,6 +286,11 @@ static inline void iommu_domain_window_disable(struct iommu_domain *domain,
 {
 }
 
+static inline struct iommu_domain *iommu_get_dev_domain(struct device *dev)
+{
+   return NULL;
+}
+
 static inline phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova)
 {
return 0;
-- 
1.7.0.4



[PATCH 0/7] vfio-pci: add support for Freescale IOMMU (PAMU)

2013-09-19 Thread Bharat Bhushan

From: Bharat Bhushan bharat.bhus...@freescale.com

This patchset adds vfio-pci support for the Freescale
IOMMU (PAMU - Peripheral Access Management Unit).

The Freescale PAMU is an aperture-based IOMMU with the following
characteristics.  Each device has an entry in a table in memory
describing the iova-phys mapping. The mapping has:
  -an overall aperture that is power of 2 sized, and has a start iova that
   is naturally aligned
  -has 1 or more windows within the aperture
  -number of windows must be power of 2, max is 256
  -size of each window is determined by aperture size / # of windows
  -iova of each window is determined by aperture start iova / # of windows
  -the mapped region in each window can be different than
   the window size, but the mapping size must be a power of 2
  -physical address of the mapping must be naturally aligned
   with the mapping size
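
To make the constraints above concrete, here is a small stand-alone sketch
of the window arithmetic. It is not taken from the PAMU driver: the function
names are hypothetical, the back-to-back window layout follows the loop in
fsl_iommu_get_iova() from patch 4/7, and "mapping no larger than its window"
is my assumption about what "different than the window size" means.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero when v is a power of two. */
int is_pow2(uint64_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* Each window gets an equal share of the aperture. */
uint64_t pamu_window_size(uint64_t aperture_size, uint32_t nr_windows)
{
	assert(is_pow2(aperture_size));
	assert(is_pow2(nr_windows) && nr_windows <= 256);
	return aperture_size / nr_windows;
}

/*
 * Start iova of window idx, assuming windows are laid out back to back
 * from a naturally aligned aperture start.
 */
uint64_t pamu_window_iova(uint64_t aperture_start, uint64_t aperture_size,
			  uint32_t nr_windows, uint32_t idx)
{
	assert(idx < nr_windows);
	assert((aperture_start & (aperture_size - 1)) == 0);
	return aperture_start +
	       (uint64_t)idx * pamu_window_size(aperture_size, nr_windows);
}

/*
 * A mapping is acceptable when its size is a power of two, it fits in the
 * window (assumption), and its physical address is aligned to its size.
 */
int pamu_mapping_ok(uint64_t phys, uint64_t map_size, uint64_t window_size)
{
	return is_pow2(map_size) && map_size <= window_size &&
	       (phys & (map_size - 1)) == 0;
}
```

For example, a 1 GiB aperture at 0x40000000 split into 4 windows yields
256 MiB windows, and window 2 starts at 0x60000000.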

Because of some of the limitations above, we need to set up a limited aperture
window that still has space for the MSI address mapping. So we create space
for MSI windows just after the IOVA (guest memory).
The first four patches in this patchset set up the MSI window and program the
MSI address at the device accordingly.

The fifth patch resolves a compilation error.
The sixth patch moves some common functions into a separate file so that they
can be used by the FSL_PAMU implementation (the next patch uses this). These
will also be used later for the iommu-none implementation. I believe we can do
more of this, but will take it step by step.

Finally, the seventh patch actually adds the support for FSL-PAMU :)

Bharat Bhushan (7):
  powerpc: Add interface to get msi region information
  iommu: add api to get iommu_domain of a device
  fsl iommu: add get_dev_iommu_domain
  powerpc: translate msi addr to iova if iommu is in use
  iommu: supress loff_t compilation error on powerpc
  vfio: moving some functions in common file
  vfio pci: Add vfio iommu implementation for FSL_PAMU

 arch/powerpc/include/asm/machdep.h |8 +
 arch/powerpc/include/asm/pci.h |2 +
 arch/powerpc/kernel/msi.c  |   18 +
 arch/powerpc/sysdev/fsl_msi.c  |   95 -
 arch/powerpc/sysdev/fsl_msi.h  |   11 +-
 drivers/iommu/fsl_pamu_domain.c|   30 ++
 drivers/iommu/iommu.c  |   10 +
 drivers/pci/msi.c  |   26 +
 drivers/vfio/Kconfig   |6 +
 drivers/vfio/Makefile  |5 +-
 drivers/vfio/pci/vfio_pci_rdwr.c   |3 +-
 drivers/vfio/vfio_iommu_common.c   |  235 +
 drivers/vfio/vfio_iommu_common.h   |   30 ++
 drivers/vfio/vfio_iommu_fsl_pamu.c |  952 
 drivers/vfio/vfio_iommu_type1.c|  206 +
 include/linux/iommu.h  |7 +
 include/linux/msi.h|8 +
 include/linux/pci.h|   13 +
 include/uapi/linux/vfio.h  |  100 
 19 files changed, 1550 insertions(+), 215 deletions(-)
 create mode 100644 drivers/vfio/vfio_iommu_common.c
 create mode 100644 drivers/vfio/vfio_iommu_common.h
 create mode 100644 drivers/vfio/vfio_iommu_fsl_pamu.c



[PATCH 5/7] iommu: supress loff_t compilation error on powerpc

2013-09-19 Thread Bharat Bhushan

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 drivers/vfio/pci/vfio_pci_rdwr.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
index 210db24..8a8156a 100644
--- a/drivers/vfio/pci/vfio_pci_rdwr.c
+++ b/drivers/vfio/pci/vfio_pci_rdwr.c
@@ -181,7 +181,8 @@ ssize_t vfio_pci_vga_rw(struct vfio_pci_device *vdev, char __user *buf,
   size_t count, loff_t *ppos, bool iswrite)
 {
int ret;
-   loff_t off, pos = *ppos & VFIO_PCI_OFFSET_MASK;
+   loff_t off;
+   u64 pos = (u64 )(*ppos & VFIO_PCI_OFFSET_MASK);
void __iomem *iomem = NULL;
unsigned int rsrc;
bool is_ioport;
-- 
1.7.0.4



[PATCH 4/7] powerpc: translate msi addr to iova if iommu is in use

2013-09-19 Thread Bharat Bhushan

If the device is attached to an IOMMU domain, set the MSI address
to the iova configured in PAMU.
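
As a rough illustration of the lookup this implies, here is a user-space
sketch. It is not the kernel implementation: struct window_map and
msi_phys_to_iova() are hypothetical stand-ins for the per-window
iommu_iova_to_phys() queries, and the 4 KiB page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumed page size for this sketch */

/* Hypothetical stand-in for what the IOMMU reports for one window. */
struct window_map {
	uint64_t iova;	/* start iova of the window */
	uint64_t phys;	/* physical address the window maps to */
	int mapped;	/* nonzero when the window has a mapping */
};

/*
 * Scan the windows for the one that maps the page containing msi_phys and
 * return the matching iova plus the in-page offset. Returns 0 when no
 * window maps the MSI page.
 */
uint64_t msi_phys_to_iova(const struct window_map *wins, unsigned int nr_wins,
			  uint64_t msi_phys)
{
	uint64_t page = msi_phys & ~(SKETCH_PAGE_SIZE - 1);
	uint64_t offset = msi_phys & (SKETCH_PAGE_SIZE - 1);
	unsigned int i;

	assert(wins != NULL || nr_wins == 0);
	for (i = 0; i < nr_wins; i++) {
		if (wins[i].mapped && wins[i].phys == page)
			return wins[i].iova + offset;
	}
	return 0;
}
```

Using 0 as the "not found" value matches the patch, which treats a zero
address as an error and returns -ENODEV.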

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/sysdev/fsl_msi.c |   56 +++-
 1 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/sysdev/fsl_msi.c b/arch/powerpc/sysdev/fsl_msi.c
index ed045cb..c7cf018 100644
--- a/arch/powerpc/sysdev/fsl_msi.c
+++ b/arch/powerpc/sysdev/fsl_msi.c
@@ -18,6 +18,7 @@
 #include <linux/pci.h>
 #include <linux/slab.h>
 #include <linux/of_platform.h>
+#include <linux/iommu.h>
 #include <sysdev/fsl_soc.h>
 #include <asm/prom.h>
 #include <asm/hw_irq.h>
@@ -150,7 +151,40 @@ static void fsl_teardown_msi_irqs(struct pci_dev *pdev)
return;
 }
 
-static void fsl_compose_msi_msg(struct pci_dev *pdev, int hwirq,
+static uint64_t fsl_iommu_get_iova(struct pci_dev *pdev, dma_addr_t msi_phys)
+{
+   struct iommu_domain *domain;
+   struct iommu_domain_geometry geometry;
+   u32 wins = 0;
+   uint64_t iova, size;
+   int ret, i;
+
+   domain = iommu_get_dev_domain(&pdev->dev);
+   if (!domain)
+   return 0;
+
+   ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_WINDOWS, &wins);
+   if (ret)
+   return 0;
+
+   ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_GEOMETRY, &geometry);
+   if (ret)
+   return 0;
+
+   iova = geometry.aperture_start;
+   size = geometry.aperture_end - geometry.aperture_start + 1;
+   do_div(size, wins);
+   for (i = 0; i < wins; i++) {
+   phys_addr_t phys;
+   phys = iommu_iova_to_phys(domain, iova);
+   if (phys == (msi_phys & ~(PAGE_SIZE - 1)))
+   return (iova + (msi_phys & (PAGE_SIZE - 1)));
+   iova += size;
+   }
+   return 0;
+}
+
+static int fsl_compose_msi_msg(struct pci_dev *pdev, int hwirq,
struct msi_msg *msg,
struct fsl_msi *fsl_msi_data)
 {
@@ -168,6 +202,16 @@ static void fsl_compose_msi_msg(struct pci_dev *pdev, int hwirq,
address = fsl_pci_immrbar_base(hose) +
   (msi_data->msiir & 0xf);
 
+   /*
+* If the device is attached with iommu domain then set MSI address
+* to the iova configured in PAMU.
+*/
+   if (iommu_get_dev_domain(&pdev->dev)) {
+   address = fsl_iommu_get_iova(pdev, msi_data->msiir);
+   if (!address)
+   return -ENODEV;
+   }
+
msg->address_lo = lower_32_bits(address);
msg->address_hi = upper_32_bits(address);
 
@@ -175,6 +219,8 @@ static void fsl_compose_msi_msg(struct pci_dev *pdev, int hwirq,
 
pr_debug("%s: allocated srs: %d, ibs: %d\n",
__func__, hwirq / IRQS_PER_MSI_REG, hwirq % IRQS_PER_MSI_REG);
+
+   return 0;
 }
 
 static int fsl_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
@@ -244,7 +290,13 @@ static int fsl_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
/* chip_data is msi_data via host->hostdata in host->map() */
irq_set_msi_desc(virq, entry);
 
-   fsl_compose_msi_msg(pdev, hwirq, &msg, msi_data);
+   if (fsl_compose_msi_msg(pdev, hwirq, &msg, msi_data)) {
+   dev_err(&pdev->dev, "Fail to set MSI for hwirq %i\n",
+   hwirq);
+   msi_bitmap_free_hwirqs(&msi_data->bitmap, hwirq, 1);
+   rc = -ENODEV;
+   goto out_free;
+   }
write_msi_msg(virq, &msg);
}
return 0;
-- 
1.7.0.4



[PATCH 6/7] vfio: moving some functions in common file

2013-09-19 Thread Bharat Bhushan

Some functions defined in vfio_iommu_type1.c are common, and we want to
use them for the FSL IOMMU (PAMU) and iommu-none drivers.
So some of them are moved to vfio_iommu_common.c.

I think we can do more of that, but we will take this step by step.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 drivers/vfio/Makefile|4 +-
 drivers/vfio/vfio_iommu_common.c |  235 ++
 drivers/vfio/vfio_iommu_common.h |   30 +
 drivers/vfio/vfio_iommu_type1.c  |  206 +-
 4 files changed, 268 insertions(+), 207 deletions(-)
 create mode 100644 drivers/vfio/vfio_iommu_common.c
 create mode 100644 drivers/vfio/vfio_iommu_common.h

diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index 72bfabc..c5792ec 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -1,4 +1,4 @@
 obj-$(CONFIG_VFIO) += vfio.o
-obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_type1.o
-obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
+obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_common.o vfio_iommu_type1.o
+obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_common.o vfio_iommu_spapr_tce.o
 obj-$(CONFIG_VFIO_PCI) += pci/
diff --git a/drivers/vfio/vfio_iommu_common.c b/drivers/vfio/vfio_iommu_common.c
new file mode 100644
index 000..8bdc0ea
--- /dev/null
+++ b/drivers/vfio/vfio_iommu_common.c
@@ -0,0 +1,235 @@
+/*
+ * VFIO: Common code for vfio IOMMU support
+ *
+ * Copyright (C) 2012 Red Hat, Inc.  All rights reserved.
+ * Author: Alex Williamson alex.william...@redhat.com
+ * Author: Bharat Bhushan bharat.bhus...@freescale.com
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Derived from original vfio:
+ * Copyright 2010 Cisco Systems, Inc.  All rights reserved.
+ * Author: Tom Lyon, p...@cisco.com
+ */
+
+#include <linux/compat.h>
+#include <linux/device.h>
+#include <linux/fs.h>
+#include <linux/iommu.h>
+#include <linux/module.h>
+#include <linux/mm.h>
+#include <linux/pci.h> /* pci_bus_type */
+#include <linux/rbtree.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+#include <linux/vfio.h>
+#include <linux/workqueue.h>
+
+static bool disable_hugepages;
+module_param_named(disable_hugepages,
+  disable_hugepages, bool, S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(disable_hugepages,
+"Disable VFIO IOMMU support for IOMMU hugepages.");
+
+struct vwork {
+   struct mm_struct *mm;
+   long npage;
+   struct work_struct  work;
+};
+
+/* delayed decrement/increment for locked_vm */
+void vfio_lock_acct_bg(struct work_struct *work)
+{
+   struct vwork *vwork = container_of(work, struct vwork, work);
+   struct mm_struct *mm;
+
+   mm = vwork->mm;
+   down_write(&mm->mmap_sem);
+   mm->locked_vm += vwork->npage;
+   up_write(&mm->mmap_sem);
+   mmput(mm);
+   kfree(vwork);
+}
+
+void vfio_lock_acct(long npage)
+{
+   struct vwork *vwork;
+   struct mm_struct *mm;
+
+   if (!current->mm || !npage)
+   return; /* process exited or nothing to do */
+
+   if (down_write_trylock(&current->mm->mmap_sem)) {
+   current->mm->locked_vm += npage;
+   up_write(&current->mm->mmap_sem);
+   return;
+   }
+
+   /*
+* Couldn't get mmap_sem lock, so must setup to update
+* mm->locked_vm later. If locked_vm were atomic, we
+* wouldn't need this silliness
+*/
+   vwork = kmalloc(sizeof(struct vwork), GFP_KERNEL);
+   if (!vwork)
+   return;
+   mm = get_task_mm(current);
+   if (!mm) {
+   kfree(vwork);
+   return;
+   }
+   INIT_WORK(&vwork->work, vfio_lock_acct_bg);
+   vwork->mm = mm;
+   vwork->npage = npage;
+   schedule_work(&vwork->work);
+}
+
+/*
+ * Some mappings aren't backed by a struct page, for example an mmap'd
+ * MMIO range for our own or another device.  These use a different
+ * pfn conversion and shouldn't be tracked as locked pages.
+ */
+bool is_invalid_reserved_pfn(unsigned long pfn)
+{
+   if (pfn_valid(pfn)) {
+   bool reserved;
+   struct page *tail = pfn_to_page(pfn);
+   struct page *head = compound_trans_head(tail);
+   reserved = !!(PageReserved(head));
+   if (head != tail) {
+   /*
+* head is not a dangling pointer
+* (compound_trans_head takes care of that)
+* but the hugepage may have been split
+* from under us (and we may not hold a
+* reference count on the head page so it can
+* be reused before we run PageReferenced), so
+* we've to check PageTail

[PATCH 7/7] vfio pci: Add vfio iommu implementation for FSL_PAMU

2013-09-19 Thread Bharat Bhushan

This patch adds vfio iommu support for Freescale IOMMU
(PAMU - Peripheral Access Management Unit).

The Freescale PAMU is an aperture-based IOMMU with the following
characteristics.  Each device has an entry in a table in memory
describing the iova-phys mapping. The mapping has:
  -an overall aperture that is power of 2 sized, and has a start iova that
   is naturally aligned
  -has 1 or more windows within the aperture
  -number of windows must be power of 2, max is 256
  -size of each window is determined by aperture size / # of windows
  -iova of each window is determined by aperture start iova / # of windows
  -the mapped region in each window can be different than
   the window size, but the mapping size must be a power of 2
  -physical address of the mapping must be naturally aligned
   with the mapping size

Some of the code is derived from TYPE1 iommu (driver/vfio/vfio_iommu_type1.c).

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 drivers/vfio/Kconfig   |6 +
 drivers/vfio/Makefile  |1 +
 drivers/vfio/vfio_iommu_fsl_pamu.c |  952 
 include/uapi/linux/vfio.h  |  100 
 4 files changed, 1059 insertions(+), 0 deletions(-)
 create mode 100644 drivers/vfio/vfio_iommu_fsl_pamu.c

diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
index 26b3d9d..7d1da26 100644
--- a/drivers/vfio/Kconfig
+++ b/drivers/vfio/Kconfig
@@ -8,11 +8,17 @@ config VFIO_IOMMU_SPAPR_TCE
depends on VFIO && SPAPR_TCE_IOMMU
default n
 
+config VFIO_IOMMU_FSL_PAMU
+   tristate
+   depends on VFIO
+   default n
+
 menuconfig VFIO
tristate "VFIO Non-Privileged userspace driver framework"
depends on IOMMU_API
select VFIO_IOMMU_TYPE1 if X86
select VFIO_IOMMU_SPAPR_TCE if (PPC_POWERNV || PPC_PSERIES)
+   select VFIO_IOMMU_FSL_PAMU if FSL_PAMU
help
  VFIO provides a framework for secure userspace device drivers.
  See Documentation/vfio.txt for more details.
diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index c5792ec..7461350 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -1,4 +1,5 @@
 obj-$(CONFIG_VFIO) += vfio.o
 obj-$(CONFIG_VFIO_IOMMU_TYPE1) += vfio_iommu_common.o vfio_iommu_type1.o
 obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_common.o vfio_iommu_spapr_tce.o
+obj-$(CONFIG_VFIO_IOMMU_FSL_PAMU) += vfio_iommu_common.o vfio_iommu_fsl_pamu.o
 obj-$(CONFIG_VFIO_PCI) += pci/
diff --git a/drivers/vfio/vfio_iommu_fsl_pamu.c b/drivers/vfio/vfio_iommu_fsl_pamu.c
new file mode 100644
index 000..b29365f
--- /dev/null
+++ b/drivers/vfio/vfio_iommu_fsl_pamu.c
@@ -0,0 +1,952 @@
+/*
+ * VFIO: IOMMU DMA mapping support for FSL PAMU IOMMU
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright (C) 2013 Freescale Semiconductor, Inc.
+ *
+ * Author: Bharat Bhushan bharat.bhus...@freescale.com
+ *
+ * This file is derived from driver/vfio/vfio_iommu_type1.c
+ *
+ * The Freescale PAMU is an aperture-based IOMMU with the following
+ * characteristics.  Each device has an entry in a table in memory
+ * describing the iova-phys mapping. The mapping has:
+ *  -an overall aperture that is power of 2 sized, and has a start iova that
+ *   is naturally aligned
+ *  -has 1 or more windows within the aperture
+ * -number of windows must be power of 2, max is 256
+ * -size of each window is determined by aperture size / # of windows
+ * -iova of each window is determined by aperture start iova / # of windows
+ * -the mapped region in each window can be different than
+ *  the window size...mapping must power of 2
+ * -physical address of the mapping must be naturally aligned
+ *  with the mapping size
+ */
+
+#include <linux/compat.h>
+#include <linux/device.h>
+#include <linux/fs.h>
+#include <linux/iommu.h>
+#include <linux/module.h>
+#include <linux/mm.h>
+#include <linux/pci.h> /* pci_bus_type */
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+#include <linux/vfio.h>
+#include <linux/workqueue.h>
+#include <linux/hugetlb.h>
+#include <linux/msi.h>
+#include <asm/fsl_pamu_stash.h>
+
+#include "vfio_iommu_common.h"
+
+#define DRIVER_VERSION  "0.1"
+#define DRIVER_AUTHOR   "Bharat Bhushan bharat.bhus...@freescale.com"
+#define DRIVER_DESC "FSL PAMU IOMMU driver for VFIO"
+
+struct vfio_iommu {
+   struct

[PATCH 5/6 v6] kvm: booke: clear host tlb reference flag on guest tlb invalidation

2013-09-19 Thread Bharat Bhushan

On booke, struct tlbe_ref contains host TLB mapping information
(pfn: the guest-pfn to pfn mapping; flags: attributes associated with this
mapping) for a guest TLB entry. When a guest creates a TLB entry,
struct tlbe_ref is set to point to a valid pfn and the attributes are set in
the flags field of that structure. When a guest TLB entry is
invalidated, the flags field of the corresponding struct tlbe_ref is
updated to indicate that it is no longer valid, and we also selectively clear
some other attribute bits; for example, if E500_TLB_BITMAP was set we clear
E500_TLB_BITMAP, and if E500_TLB_TLB0 is set we clear that.

Ideally we should clear the flags completely, as the entry is invalid and
holds nothing that can be reused. The other part of the problem is that when
we use the same entry again, we still do not clear the flags (we just OR in
new bits).

So far this has worked because the selective clearing mentioned above
happens to clear exactly the flags that were set during TLB mapping. But the
problem appears when we add more attributes: each of them would then have to
be selectively cleared as well, which should not be necessary.

This patch does both:
- Clear flags when invalidating;
- Clear flags when reusing the same entry later
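
The difference can be shown with a small stand-alone sketch. This is not
the KVM code: struct tlbe_ref_sketch and the three functions are simplified
stand-ins, the bit positions follow the e500.h definitions in use before
this series renumbers them, and SKETCH_TLB_EXTRA is a hypothetical attribute
standing in for "more attributes added later".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions as in e500.h before patch 6/6 of this series. */
#define E500_TLB_VALID		(1u << 0)
#define E500_TLB_BITMAP		(1u << 1)
#define E500_TLB_TLB0		(1u << 2)
/* Hypothetical attribute added later, for illustration only. */
#define SKETCH_TLB_EXTRA	(1u << 3)

struct tlbe_ref_sketch { uint32_t flags; };

/* Old behaviour: clear only the bits the invalidation path knows about. */
void invalidate_selective(struct tlbe_ref_sketch *ref)
{
	assert(ref != NULL);
	ref->flags &= ~(E500_TLB_BITMAP | E500_TLB_TLB0 | E500_TLB_VALID);
}

/* New behaviour: the entry is invalid, so drop every flag. */
void invalidate_all(struct tlbe_ref_sketch *ref)
{
	assert(ref != NULL);
	ref->flags = 0;
}

/* New behaviour on reuse: assign instead of OR-ing, so stale bits cannot survive. */
void ref_setup(struct tlbe_ref_sketch *ref)
{
	assert(ref != NULL);
	ref->flags = E500_TLB_VALID;
}
```

With the selective variant, any bit the invalidation path does not list
survives and leaks into the next use of the entry.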

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v5->v6
 - Fix flag clearing comment

 arch/powerpc/kvm/e500_mmu_host.c |   16 
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..7370e1c 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -230,15 +230,15 @@ void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel,
ref->flags &= ~(E500_TLB_TLB0 | E500_TLB_VALID);
}
 
-   /* Already invalidated in between */
-   if (!(ref->flags & E500_TLB_VALID))
-   return;
-
-   /* Guest tlbe is backed by at most one host tlbe per shadow pid. */
-   kvmppc_e500_tlbil_one(vcpu_e500, gtlbe);
+   /*
+* Check whether TLB entry is already invalidated in between
+* Guest tlbe is backed by at most one host tlbe per shadow pid.
+*/
+   if (ref->flags & E500_TLB_VALID)
+   kvmppc_e500_tlbil_one(vcpu_e500, gtlbe);
 
/* Mark the TLB as not backed by the host anymore */
-   ref->flags &= ~E500_TLB_VALID;
+   ref->flags = 0;
 }
 
 static inline int tlbe_is_writable(struct kvm_book3e_206_tlb_entry *tlbe)
@@ -251,7 +251,7 @@ static inline void kvmppc_e500_ref_setup(struct tlbe_ref *ref,
 pfn_t pfn)
 {
ref->pfn = pfn;
-   ref->flags |= E500_TLB_VALID;
+   ref->flags = E500_TLB_VALID;
 
if (tlbe_is_writable(gtlbe))
kvm_set_pfn_dirty(pfn);
-- 
1.7.0.4



[PATCH 6/6 v4] kvm: powerpc: use caching attributes as per linux pte

2013-08-15 Thread Bharat Bhushan

KVM uses the same WIM TLB attributes as the corresponding qemu pte.
For this we now search the Linux pte for the requested page and
get the caching/coherency attributes from that pte.
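
To show how the attributes end up stored next to the status bits, here is a
stand-alone sketch of the flags layout this patch introduces. It is an
illustration only: ref_flags_pack() and ref_flags_wimge() are hypothetical
helpers, and the constants are written as unsigned (the patch uses plain
int shifts) so the sketch avoids shifting into the sign bit.

```c
#include <assert.h>
#include <stdint.h>

/* Status bits, moved to the top of the word by this patch. */
#define E500_TLB_VALID		(1u << 31)
#define E500_TLB_BITMAP		(1u << 30)
#define E500_TLB_TLB0		(1u << 29)
/* Lower 5 bits hold the WIMGE value. */
#define E500_TLB_WIMGE_MASK	0x1fu

/* Combine status bits and a WIMGE value into one flags word. */
uint32_t ref_flags_pack(uint32_t status, uint32_t wimge)
{
	/* The two fields must not overlap. */
	assert((status & E500_TLB_WIMGE_MASK) == 0);
	assert((wimge & ~E500_TLB_WIMGE_MASK) == 0);
	return status | wimge;
}

/* Extract the WIMGE value, as done when building the shadow MAS2. */
uint32_t ref_flags_wimge(uint32_t flags)
{
	return flags & E500_TLB_WIMGE_MASK;
}
```

Moving the status bits to bits 31..29 is what frees the low five bits for
the attributes, so the shadow MAS2 can be built with a single mask.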

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v3->v4
 - s/printk/printk_ratelimited till we return machine check in mmu setup

v2->v3
 - setting pgdir before kvmppc_fix_ee_before_entry() on vcpu_run
 - Aligned as per changes in patch 5/6
 - setting WIMG for pfnmap pages also
 
v1->v2
 - Use Linux pte for wimge rather than RAM/no-RAM mechanism

 arch/powerpc/include/asm/kvm_host.h |2 +-
 arch/powerpc/kvm/booke.c|2 +-
 arch/powerpc/kvm/e500.h |8 --
 arch/powerpc/kvm/e500_mmu_host.c|   38 --
 4 files changed, 29 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 3328353..583d405 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -535,6 +535,7 @@ struct kvm_vcpu_arch {
 #endif
gpa_t paddr_accessed;
gva_t vaddr_accessed;
+   pgd_t *pgdir;
 
u8 io_gpr; /* GPR used as IO source/target */
u8 mmio_is_bigendian;
@@ -592,7 +593,6 @@ struct kvm_vcpu_arch {
struct list_head run_list;
struct task_struct *run_task;
struct kvm_run *kvm_run;
-   pgd_t *pgdir;
 
spinlock_t vpa_update_lock;
struct kvmppc_vpa vpa;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 17722d8..0d96d50 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -696,8 +696,8 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
kvmppc_load_guest_fp(vcpu);
 #endif
 
+   vcpu->arch.pgdir = current->mm->pgd;
kvmppc_fix_ee_before_entry();
-
ret = __kvmppc_vcpu_run(kvm_run, vcpu);
 
/* No need for kvm_guest_exit. It's done in handle_exit.
diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index 4fd9650..fc4b2f6 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -31,11 +31,13 @@ enum vcpu_ftr {
 #define E500_TLB_NUM   2
 
 /* entry is mapped somewhere in host TLB */
-#define E500_TLB_VALID (1 << 0)
+#define E500_TLB_VALID (1 << 31)
 /* TLB1 entry is mapped by host TLB1, tracked by bitmaps */
-#define E500_TLB_BITMAP (1 << 1)
+#define E500_TLB_BITMAP (1 << 30)
 /* TLB1 entry is mapped by host TLB0 */
-#define E500_TLB_TLB0  (1 << 2)
+#define E500_TLB_TLB0  (1 << 29)
+/* Lower 5 bits have WIMGE value */
+#define E500_TLB_WIMGE_MASK (0x1f)
 
 struct tlbe_ref {
pfn_t pfn;  /* valid only for TLB0, except briefly */
diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..603f5ba 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,15 +64,6 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode)
return mas3;
 }
 
-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
-{
-#ifdef CONFIG_SMP
-   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
-#else
-   return mas2 & MAS2_ATTRIB_MASK;
-#endif
-}
-
 /*
  * writing shadow tlb entry to host TLB
  */
@@ -248,10 +239,12 @@ static inline int tlbe_is_writable(struct kvm_book3e_206_tlb_entry *tlbe)
 
 static inline void kvmppc_e500_ref_setup(struct tlbe_ref *ref,
 struct kvm_book3e_206_tlb_entry *gtlbe,
-pfn_t pfn)
+pfn_t pfn, int wimg)
 {
ref->pfn = pfn;
ref->flags |= E500_TLB_VALID;
+   /* Use guest supplied MAS2_G and MAS2_E */
+   ref->flags |= (gtlbe->mas2 & MAS2_ATTRIB_MASK) | wimg;
 
if (tlbe_is_writable(gtlbe))
kvm_set_pfn_dirty(pfn);
@@ -312,8 +305,7 @@ static void kvmppc_e500_setup_stlbe(
 
/* Force IPROT=0 for all guest mappings. */
stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
-   stlbe->mas2 = (gvaddr & MAS2_EPN) |
- e500_shadow_mas2_attrib(gtlbe->mas2, pr);
+   stlbe->mas2 = (gvaddr & MAS2_EPN) | (ref->flags & E500_TLB_WIMGE_MASK);
stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
@@ -332,6 +324,10 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,
unsigned long hva;
int pfnmap = 0;
int tsize = BOOK3E_PAGESZ_4K;
+   unsigned long tsize_pages = 0;
+   pte_t *ptep;
+   int wimg = 0;
+   pgd_t *pgdir;
 
/*
 * Translate guest physical to true physical, acquiring
@@ -394,7 +390,7 @@ static inline int kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,
 */
 
for (; tsize > BOOK3E_PAGESZ_4K; tsize -= 2

[PATCH 2/6 v3] kvm: powerpc: allow guest control E attribute in mas2

2013-08-06 Thread Bharat Bhushan

The E bit in MAS2 indicates whether the page is accessed
in little-endian or big-endian byte order.
There is no reason to stop the guest from setting E, so allow it.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v2->v3
 - no change
v1->v2
 - no change
 arch/powerpc/kvm/e500.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index c2e5e98..277cb18 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct kvm_vcpu *vcpu)
 #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW)
 #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW)
 #define MAS2_ATTRIB_MASK \
- (MAS2_X0 | MAS2_X1)
+ (MAS2_X0 | MAS2_X1 | MAS2_E)
 #define MAS3_ATTRIB_MASK \
  (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \
   | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK)
-- 
1.7.0.4



[PATCH 1/6 v3] powerpc: book3e: _PAGE_LENDIAN must be _PAGE_ENDIAN

2013-08-06 Thread Bharat Bhushan

For book3e, _PAGE_ENDIAN is not defined. In fact, what is defined
is _PAGE_LENDIAN, which is wrong and should be _PAGE_ENDIAN.
There are no compilation errors because
arch/powerpc/include/asm/pte-common.h defines _PAGE_ENDIAN to 0
when it is not defined anywhere else.
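
Why the typo went unnoticed can be reproduced in a few lines of user-space C.
This is only a sketch of the preprocessor pattern: the names drop the
leading underscore (to stay out of the reserved identifier space), and
pte_is_endian_flagged() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>

/* What the book3e header defined before the fix: the misspelled name. */
#define PAGE_LENDIAN	0x08

/* Fallback in the style of pte-common.h: an undefined flag silently becomes 0. */
#ifndef PAGE_ENDIAN
#define PAGE_ENDIAN	0
#endif

/* Hypothetical helper: nonzero if the endian flag is set in a pte value. */
int pte_is_endian_flagged(unsigned long pte)
{
	return (pte & PAGE_ENDIAN) != 0;
}

/* The build succeeds, yet the flag bit is never actually tested. */
void endian_fallback_demo(void)
{
	assert(PAGE_ENDIAN == 0);			/* fallback kicked in */
	assert(!pte_is_endian_flagged(PAGE_LENDIAN));	/* bit set, not detected */
}
```

Once the header defines the correctly spelled name as 0x08, the fallback is
skipped and the same check starts seeing the bit.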

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v2->v3
 - no change
v1->v2
 - no change

 arch/powerpc/include/asm/pte-book3e.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/pte-book3e.h b/arch/powerpc/include/asm/pte-book3e.h
index 0156702..576ad88 100644
--- a/arch/powerpc/include/asm/pte-book3e.h
+++ b/arch/powerpc/include/asm/pte-book3e.h
@@ -40,7 +40,7 @@
 #define _PAGE_U1   0x01
 #define _PAGE_U0   0x02
 #define _PAGE_ACCESSED 0x04
-#define _PAGE_LENDIAN  0x08
+#define _PAGE_ENDIAN   0x08
 #define _PAGE_GUARDED  0x10
 #define _PAGE_COHERENT 0x20 /* M: enforce memory coherence */
 #define _PAGE_NO_CACHE 0x40 /* I: cache inhibit */
-- 
1.7.0.4



[PATCH 0/6 v3] kvm: powerpc: use cache attributes from linux pte

2013-08-06 Thread Bharat Bhushan

From: Bharat Bhushan bharat.bhus...@freescale.com

The first patch is a typo fix: book3e defines _PAGE_LENDIAN where it should
define _PAGE_ENDIAN. This seems to show that it is never exercised :-)

The second and third patches allow the guest to control the E (Endian) and
G (Guarded) TLB attributes respectively.

The fourth and fifth patches move functions/logic into common code
so they can be used on booke also.

The sixth patch actually sets the caching attributes (TLB.WIMGE) using the
corresponding Linux pte.

v2->v3
 - lookup_linux_pte() now only contains the pte search logic and does not
   set any access flags in the pte. There is already a function for setting
   the access flags, which is called explicitly where needed.
   On booke we only need to search for the pte to get WIMG.

v1->v2
 - Earlier, the caching attributes (WIMGE) were set based on whether the page
   is RAM or not. Now we get these attributes from the corresponding Linux PTE.

Bharat Bhushan (6):
  powerpc: book3e: _PAGE_LENDIAN must be _PAGE_ENDIAN
  kvm: powerpc: allow guest control E attribute in mas2
  kvm: powerpc: allow guest control G attribute in mas2
  powerpc: move linux pte/hugepte search to more generic file
  kvm: powerpc: keep only pte search logic in lookup_linux_pte
  kvm: powerpc: use caching attributes as per linux pte

 arch/powerpc/include/asm/kvm_host.h  |2 +-
 arch/powerpc/include/asm/pgtable-ppc64.h |   36 --
 arch/powerpc/include/asm/pgtable.h   |   60 ++
 arch/powerpc/include/asm/pte-book3e.h|2 +-
 arch/powerpc/kvm/book3s_hv_rm_mmu.c  |   38 ++-
 arch/powerpc/kvm/booke.c |2 +-
 arch/powerpc/kvm/e500.h  |   10 +++--
 arch/powerpc/kvm/e500_mmu_host.c |   36 ++---
 8 files changed, 102 insertions(+), 84 deletions(-)



[PATCH 3/6 v3] kvm: powerpc: allow guest control G attribute in mas2

2013-08-06 Thread Bharat Bhushan

G bit in MAS2 indicates whether the page is Guarded.
There is no reason to stop the guest from setting G, so allow it.
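
As a rough illustration of what widening MAS2_ATTRIB_MASK does, the sketch
below filters a guest-supplied MAS2 value through the mask. The numeric
values of the MAS2_* bits and the helper name guest_mas2_attrib() are
assumptions made for this example; only the mask expression itself comes
from the patch.

```c
#include <stdint.h>

/* Assumed bit values, for illustration only (not taken from this thread). */
#define MAS2_X0 0x40u
#define MAS2_X1 0x20u
#define MAS2_I  0x08u
#define MAS2_G  0x02u
#define MAS2_E  0x01u

/* Mask as it reads after this patch: E and G are now guest controllable. */
#define MAS2_ATTRIB_MASK (MAS2_X0 | MAS2_X1 | MAS2_E | MAS2_G)

/* Hypothetical helper: keep only the attribute bits the guest may set. */
static inline uint32_t guest_mas2_attrib(uint32_t guest_mas2)
{
    return guest_mas2 & MAS2_ATTRIB_MASK;
}
```

With this mask a guest request for G survives the filter, while a bit that
is not in the mask (I in this sketch) is dropped.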

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v2-v3
 - no change
v1-v2
 - no change
 arch/powerpc/kvm/e500.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index 277cb18..4fd9650 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct 
kvm_vcpu *vcpu)
 #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW)
 #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW)
 #define MAS2_ATTRIB_MASK \
- (MAS2_X0 | MAS2_X1 | MAS2_E)
+ (MAS2_X0 | MAS2_X1 | MAS2_E | MAS2_G)
 #define MAS3_ATTRIB_MASK \
  (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \
   | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK)
-- 
1.7.0.4



[PATCH 4/6 v3] powerpc: move linux pte/hugepte search to more generic file

2013-08-06 Thread Bharat Bhushan

Linux pte search functions find_linux_pte_or_hugepte() and
find_linux_pte() have nothing specific to 64-bit anymore,
so they are moved from pgtable-ppc64.h to asm/pgtable.h.
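
To make the shape of such a pte search concrete, here is a toy two-level
table walk. It is not the kernel code: the toy_* types, the index widths
and toy_find_pte() are all invented for this sketch, and real page tables
have more levels and arch-specific accessors. It only shows the pattern of
indexing each level by a slice of the effective address and returning NULL
when an intermediate level is missing.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy geometry, chosen only for the example. */
#define TOY_PAGE_SHIFT 12
#define TOY_L2_BITS    4
#define TOY_L1_BITS    4

typedef uint64_t toy_pte_t;

struct toy_l2 {
    toy_pte_t pte[1 << TOY_L2_BITS];
};

struct toy_pgdir {
    struct toy_l2 *l2[1 << TOY_L1_BITS];
};

/* Return the address of the pte for 'ea', or NULL if the level is absent. */
static inline toy_pte_t *toy_find_pte(struct toy_pgdir *pgdir, unsigned long ea)
{
    unsigned long l2_idx = (ea >> TOY_PAGE_SHIFT) &
                           ((1UL << TOY_L2_BITS) - 1);
    unsigned long l1_idx = (ea >> (TOY_PAGE_SHIFT + TOY_L2_BITS)) &
                           ((1UL << TOY_L1_BITS) - 1);
    struct toy_l2 *l2 = pgdir->l2[l1_idx];

    if (!l2)
        return NULL;
    return &l2->pte[l2_idx];
}
```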

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v2-v3
 - no change
v1-v2
 - This is a new change in this version
 
 arch/powerpc/include/asm/pgtable-ppc64.h |   36 -
 arch/powerpc/include/asm/pgtable.h   |   37 ++
 2 files changed, 37 insertions(+), 36 deletions(-)

diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h 
b/arch/powerpc/include/asm/pgtable-ppc64.h
index e3d55f6..d257d98 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -340,42 +340,6 @@ static inline void __ptep_set_access_flags(pte_t *ptep, 
pte_t entry)
 void pgtable_cache_add(unsigned shift, void (*ctor)(void *));
 void pgtable_cache_init(void);
 
-/*
- * find_linux_pte returns the address of a linux pte for a given
- * effective address and directory.  If not found, it returns zero.
- */
-static inline pte_t *find_linux_pte(pgd_t *pgdir, unsigned long ea)
-{
-   pgd_t *pg;
-   pud_t *pu;
-   pmd_t *pm;
-   pte_t *pt = NULL;
-
-   pg = pgdir + pgd_index(ea);
-   if (!pgd_none(*pg)) {
-   pu = pud_offset(pg, ea);
-   if (!pud_none(*pu)) {
-   pm = pmd_offset(pu, ea);
-   if (pmd_present(*pm))
-   pt = pte_offset_kernel(pm, ea);
-   }
-   }
-   return pt;
-}
-
-#ifdef CONFIG_HUGETLB_PAGE
-pte_t *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
-unsigned *shift);
-#else
-static inline pte_t *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
-  unsigned *shift)
-{
-   if (shift)
-   *shift = 0;
-   return find_linux_pte(pgdir, ea);
-}
-#endif /* !CONFIG_HUGETLB_PAGE */
-
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_PGTABLE_PPC64_H_ */
diff --git a/arch/powerpc/include/asm/pgtable.h 
b/arch/powerpc/include/asm/pgtable.h
index b6293d2..690c8c2 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -217,6 +217,43 @@ extern int gup_hugepd(hugepd_t *hugepd, unsigned pdshift, 
unsigned long addr,
 
 extern int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
   unsigned long end, int write, struct page **pages, int 
*nr);
+
+/*
+ * find_linux_pte returns the address of a linux pte for a given
+ * effective address and directory.  If not found, it returns zero.
+ */
+static inline pte_t *find_linux_pte(pgd_t *pgdir, unsigned long ea)
+{
+   pgd_t *pg;
+   pud_t *pu;
+   pmd_t *pm;
+   pte_t *pt = NULL;
+
+   pg = pgdir + pgd_index(ea);
+   if (!pgd_none(*pg)) {
+   pu = pud_offset(pg, ea);
+   if (!pud_none(*pu)) {
+   pm = pmd_offset(pu, ea);
+   if (pmd_present(*pm))
+   pt = pte_offset_kernel(pm, ea);
+   }
+   }
+   return pt;
+}
+
+#ifdef CONFIG_HUGETLB_PAGE
+pte_t *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
+unsigned *shift);
+#else
+static inline pte_t *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
+  unsigned *shift)
+{
+   if (shift)
+   *shift = 0;
+   return find_linux_pte(pgdir, ea);
+}
+#endif /* !CONFIG_HUGETLB_PAGE */
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __KERNEL__ */
-- 
1.7.0.4



[PATCH 6/6 v3] kvm: powerpc: use caching attributes as per linux pte

2013-08-06 Thread Bharat Bhushan

KVM now uses the same WIM tlb attributes as the corresponding qemu pte.
For this we search the linux pte for the requested page and
get the caching/coherency attributes from that pte.
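
A minimal sketch of the idea, assuming the bit layouts below: translate the
attribute bits of a Linux pte value into a WIMGE value, then merge the low
five bits of the per-entry flags into the shadow MAS2. The _PAGE_ENDIAN,
_PAGE_GUARDED, _PAGE_COHERENT and _PAGE_NO_CACHE values and the 0x1f mask
are the ones shown in this series; _PAGE_WRITETHRU, the MAS2_* values,
MAS2_EPN and both helper names are assumptions for illustration.

```c
#include <stdint.h>

/* pte bits as shown in pte-book3e.h by patch 1/6 */
#define _PAGE_ENDIAN    0x08u
#define _PAGE_GUARDED   0x10u
#define _PAGE_COHERENT  0x20u
#define _PAGE_NO_CACHE  0x40u
/* Assumed value, not shown in this thread */
#define _PAGE_WRITETHRU 0x80u

/* Assumed MAS2 attribute bits and page-number mask */
#define MAS2_E   0x01u
#define MAS2_G   0x02u
#define MAS2_M   0x04u
#define MAS2_I   0x08u
#define MAS2_W   0x10u
#define MAS2_EPN 0xfffff000u

/* From this patch: lower 5 bits of the ref flags hold WIMGE */
#define E500_TLB_WIMGE_MASK 0x1fu

/* Hypothetical helper: derive WIMGE from a Linux pte value. */
static inline uint32_t pte_to_wimge(uint64_t pte)
{
    uint32_t wimge = 0;

    if (pte & _PAGE_WRITETHRU)
        wimge |= MAS2_W;
    if (pte & _PAGE_NO_CACHE)
        wimge |= MAS2_I;
    if (pte & _PAGE_COHERENT)
        wimge |= MAS2_M;
    if (pte & _PAGE_GUARDED)
        wimge |= MAS2_G;
    if (pte & _PAGE_ENDIAN)
        wimge |= MAS2_E;
    return wimge & E500_TLB_WIMGE_MASK;
}

/* Hypothetical helper: build shadow MAS2 from address and ref flags. */
static inline uint32_t shadow_mas2(uint32_t gvaddr, uint32_t ref_flags)
{
    return (gvaddr & MAS2_EPN) | (ref_flags & E500_TLB_WIMGE_MASK);
}
```

Keeping the state bits in the high end of the flags word is what lets the
low five bits be masked straight into MAS2 without shifting.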

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v2-v3
 - setting pgdir before kvmppc_fix_ee_before_entry() on vcpu_run
 - Aligned as per changes in patch 5/6
 - setting WIMG for pfnmap pages also
 
v1-v2
 - Use Linux pte for wimge rather than RAM/no-RAM mechanism

 arch/powerpc/include/asm/kvm_host.h |2 +-
 arch/powerpc/kvm/booke.c|2 +-
 arch/powerpc/kvm/e500.h |8 --
 arch/powerpc/kvm/e500_mmu_host.c|   36 --
 4 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 3328353..583d405 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -535,6 +535,7 @@ struct kvm_vcpu_arch {
 #endif
gpa_t paddr_accessed;
gva_t vaddr_accessed;
+   pgd_t *pgdir;
 
u8 io_gpr; /* GPR used as IO source/target */
u8 mmio_is_bigendian;
@@ -592,7 +593,6 @@ struct kvm_vcpu_arch {
struct list_head run_list;
struct task_struct *run_task;
struct kvm_run *kvm_run;
-   pgd_t *pgdir;
 
spinlock_t vpa_update_lock;
struct kvmppc_vpa vpa;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 17722d8..0d96d50 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -696,8 +696,8 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
kvmppc_load_guest_fp(vcpu);
 #endif
 
+   vcpu->arch.pgdir = current->mm->pgd;
kvmppc_fix_ee_before_entry();
-
ret = __kvmppc_vcpu_run(kvm_run, vcpu);
 
/* No need for kvm_guest_exit. It's done in handle_exit.
diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index 4fd9650..fc4b2f6 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -31,11 +31,13 @@ enum vcpu_ftr {
 #define E500_TLB_NUM   2
 
 /* entry is mapped somewhere in host TLB */
-#define E500_TLB_VALID (1 << 0)
+#define E500_TLB_VALID (1 << 31)
 /* TLB1 entry is mapped by host TLB1, tracked by bitmaps */
-#define E500_TLB_BITMAP (1 << 1)
+#define E500_TLB_BITMAP (1 << 30)
 /* TLB1 entry is mapped by host TLB0 */
-#define E500_TLB_TLB0  (1 << 2)
+#define E500_TLB_TLB0  (1 << 29)
+/* Lower 5 bits have WIMGE value */
+#define E500_TLB_WIMGE_MASK (0x1f)
 
 struct tlbe_ref {
pfn_t pfn;  /* valid only for TLB0, except briefly */
diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..001a2b0 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,15 +64,6 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int 
usermode)
return mas3;
 }
 
-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
-{
-#ifdef CONFIG_SMP
-   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
-#else
-   return mas2 & MAS2_ATTRIB_MASK;
-#endif
-}
-
 /*
  * writing shadow tlb entry to host TLB
  */
@@ -248,10 +239,12 @@ static inline int tlbe_is_writable(struct 
kvm_book3e_206_tlb_entry *tlbe)
 
 static inline void kvmppc_e500_ref_setup(struct tlbe_ref *ref,
 struct kvm_book3e_206_tlb_entry *gtlbe,
-pfn_t pfn)
+pfn_t pfn, int wimg)
 {
ref->pfn = pfn;
ref->flags |= E500_TLB_VALID;
+   /* Use guest supplied MAS2_G and MAS2_E */
+   ref->flags |= (gtlbe->mas2 & MAS2_ATTRIB_MASK) | wimg;
 
if (tlbe_is_writable(gtlbe))
kvm_set_pfn_dirty(pfn);
@@ -312,8 +305,7 @@ static void kvmppc_e500_setup_stlbe(
 
/* Force IPROT=0 for all guest mappings. */
stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
-   stlbe->mas2 = (gvaddr & MAS2_EPN) |
- e500_shadow_mas2_attrib(gtlbe->mas2, pr);
+   stlbe->mas2 = (gvaddr & MAS2_EPN) | (ref->flags & E500_TLB_WIMGE_MASK);
stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
@@ -332,6 +324,10 @@ static inline int kvmppc_e500_shadow_map(struct 
kvmppc_vcpu_e500 *vcpu_e500,
unsigned long hva;
int pfnmap = 0;
int tsize = BOOK3E_PAGESZ_4K;
+   unsigned long tsize_pages = 0;
+   pte_t *ptep;
+   int wimg = 0;
+   pgd_t *pgdir;
 
/*
 * Translate guest physical to true physical, acquiring
@@ -394,7 +390,7 @@ static inline int kvmppc_e500_shadow_map(struct 
kvmppc_vcpu_e500 *vcpu_e500,
 */
 
for (; tsize > BOOK3E_PAGESZ_4K; tsize -= 2) {
-   unsigned long gfn_start, gfn_end, tsize_pages

[PATCH 5/6 v3] kvm: powerpc: keep only pte search logic in lookup_linux_pte

2013-08-06 Thread Bharat Bhushan

lookup_linux_pte() used to search for a pte and also set the access
flags when writable. The function now only searches for the pte, and
the access flags are set explicitly by the caller.

This pte lookup is not kvm specific, so it is moved to common code
(asm/pgtable.h). A follow-up patch will use it on booke.
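
The split described above can be sketched with toy types: one function
that only validates and returns the pte pointer, and a second one the
caller invokes explicitly when it wants the accessed/dirty bits updated.
Everything here (the toy_* names, flag values, and the choice to return
NULL on failure) is an assumption of the sketch, not the code in the diff
below; the pte pointer and shift stand in for the result of the real
page-table search.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE    4096UL
#define TOY_PTE_PRESENT  0x1u
#define TOY_PTE_ACCESSED 0x2u
#define TOY_PTE_DIRTY    0x4u
#define TOY_PTE_WRITE    0x8u

typedef uint32_t toy_pte_t;

/*
 * Search/validate only: reports the mapping size through pte_sizep and
 * rejects entries that are too small or not present. Never modifies *ptep.
 */
static inline toy_pte_t *toy_lookup_pte(toy_pte_t *ptep, unsigned int shift,
                                        unsigned long *pte_sizep)
{
    unsigned long wanted = *pte_sizep;

    if (!ptep)
        return NULL;
    *pte_sizep = shift ? (1UL << shift) : TOY_PAGE_SIZE;
    if (wanted > *pte_sizep)
        return NULL;
    if (!(*ptep & TOY_PTE_PRESENT))
        return NULL;
    return ptep;
}

/* Explicit update step: mark accessed, and dirty if writing to a writable pte. */
static inline toy_pte_t toy_read_update_pte(toy_pte_t *ptep, int writing)
{
    toy_pte_t pte = *ptep;

    if (pte & TOY_PTE_PRESENT) {
        pte |= TOY_PTE_ACCESSED;
        if (writing && (pte & TOY_PTE_WRITE))
            pte |= TOY_PTE_DIRTY;
    }
    *ptep = pte;
    return pte;
}
```

A caller that only needs to read attributes (as booke does for WIMG) would
call the first function alone; a caller that maps the page for access
would call both.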

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v2-v3
 - New change

 arch/powerpc/include/asm/pgtable.h  |   23 +
 arch/powerpc/kvm/book3s_hv_rm_mmu.c |   38 +++---
 2 files changed, 35 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/include/asm/pgtable.h 
b/arch/powerpc/include/asm/pgtable.h
index 690c8c2..d4d16ab 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -254,6 +254,29 @@ static inline pte_t *find_linux_pte_or_hugepte(pgd_t 
*pgdir, unsigned long ea,
 }
 #endif /* !CONFIG_HUGETLB_PAGE */
 
+static inline pte_t *lookup_linux_pte(pgd_t *pgdir, unsigned long hva,
+unsigned long *pte_sizep)
+{
+   pte_t *ptep;
+   unsigned long ps = *pte_sizep;
+   unsigned int shift;
+
+   ptep = find_linux_pte_or_hugepte(pgdir, hva, &shift);
+   if (!ptep)
+   return __pte(0);
+   if (shift)
+   *pte_sizep = 1ul << shift;
+   else
+   *pte_sizep = PAGE_SIZE;
+
+   if (ps > *pte_sizep)
+   return __pte(0);
+
+   if (!pte_present(*ptep))
+   return __pte(0);
+
+   return ptep;
+}
 #endif /* __ASSEMBLY__ */
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c 
b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index 105b00f..7e6200c 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -134,27 +134,6 @@ static void remove_revmap_chain(struct kvm *kvm, long 
pte_index,
unlock_rmap(rmap);
 }
 
-static pte_t lookup_linux_pte(pgd_t *pgdir, unsigned long hva,
- int writing, unsigned long *pte_sizep)
-{
-   pte_t *ptep;
-   unsigned long ps = *pte_sizep;
-   unsigned int shift;
-
-   ptep = find_linux_pte_or_hugepte(pgdir, hva, &shift);
-   if (!ptep)
-   return __pte(0);
-   if (shift)
-   *pte_sizep = 1ul << shift;
-   else
-   *pte_sizep = PAGE_SIZE;
-   if (ps > *pte_sizep)
-   return __pte(0);
-   if (!pte_present(*ptep))
-   return __pte(0);
-   return kvmppc_read_update_linux_pte(ptep, writing);
-}
-
 static inline void unlock_hpte(unsigned long *hpte, unsigned long hpte_v)
 {
asm volatile(PPC_RELEASE_BARRIER "" : : : "memory");
@@ -174,6 +153,7 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
unsigned long *physp, pte_size;
unsigned long is_io;
unsigned long *rmap;
+   pte_t *ptep;
pte_t pte;
unsigned int writing;
unsigned long mmu_seq;
@@ -233,8 +213,9 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
 
/* Look up the Linux PTE for the backing page */
pte_size = psize;
-   pte = lookup_linux_pte(pgdir, hva, writing, &pte_size);
-   if (pte_present(pte)) {
+   ptep = lookup_linux_pte(pgdir, hva, &pte_size);
+   if (pte_present(pte_val(*ptep))) {
+   pte = kvmppc_read_update_linux_pte(ptep, writing);
if (writing && !pte_write(pte))
/* make the actual HPTE be read-only */
ptel = hpte_make_readonly(ptel);
@@ -662,6 +643,7 @@ long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned long 
flags,
unsigned long psize, gfn, hva;
struct kvm_memory_slot *memslot;
pgd_t *pgdir = vcpu->arch.pgdir;
+   pte_t *ptep;
pte_t pte;
 
psize = hpte_page_size(v, r);
@@ -669,9 +651,13 @@ long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned long 
flags,
memslot = __gfn_to_memslot(kvm_memslots(kvm), gfn);
if (memslot) {
hva = __gfn_to_hva_memslot(memslot, gfn);
-   pte = lookup_linux_pte(pgdir, hva, 1, &psize);
-   if (pte_present(pte) && !pte_write(pte))
-   r = hpte_make_readonly(r);
+   ptep = lookup_linux_pte(pgdir, hva, &psize);
+   if (pte_present(pte_val(*ptep))) {
+   pte = kvmppc_read_update_linux_pte(ptep,
+  1);
+   if (pte_present(pte) && !pte_write(pte))
+   r = hpte_make_readonly(r

[PATCH 0/6 v2] kvm: powerpc: use cache attributes from linux pte

2013-08-01 Thread Bharat Bhushan

From: Bharat Bhushan bharat.bhus...@freescale.com

The first patch is a typo fix: book3e defines _PAGE_LENDIAN where it should
define _PAGE_ENDIAN. This seems to show that this is never exercised :-)

The second and third patches allow the guest to control the E (Endian) and
G (Guarded) TLB attributes respectively.

The rest of the patches are about setting caching attributes (TLB.WIMGE)
using the corresponding Linux pte.

v1-v2
 - Earlier, caching attributes (WIMGE) were set based on whether the page is
   RAM or not. Now we get these attributes from the corresponding Linux PTE.

Bharat Bhushan (6):
  powerpc: book3e: _PAGE_LENDIAN must be _PAGE_ENDIAN
  kvm: powerpc: allow guest control E attribute in mas2
  kvm: powerpc: allow guest control G attribute in mas2
  powerpc: move linux pte/hugepte search to more generic file
  kvm: powerpc: booke: Add linux pte lookup like booke3s
  kvm: powerpc: use caching attributes as per linux pte

 arch/powerpc/include/asm/kvm_booke.h |   73 ++
 arch/powerpc/include/asm/kvm_host.h  |2 +-
 arch/powerpc/include/asm/pgtable-ppc64.h |   36 ---
 arch/powerpc/include/asm/pgtable.h   |   37 +++
 arch/powerpc/include/asm/pte-book3e.h|2 +-
 arch/powerpc/kvm/booke.c |2 +-
 arch/powerpc/kvm/e500.h  |   10 +++--
 arch/powerpc/kvm/e500_mmu_host.c |   31 +++-
 8 files changed, 137 insertions(+), 56 deletions(-)



[PATCH 1/6 v2] powerpc: book3e: _PAGE_LENDIAN must be _PAGE_ENDIAN

2013-08-01 Thread Bharat Bhushan

For book3e, _PAGE_ENDIAN is not defined. In fact, what is defined
is _PAGE_LENDIAN, which is wrong and should be _PAGE_ENDIAN.
There are no compilation errors because
arch/powerpc/include/asm/pte-common.h defines _PAGE_ENDIAN to 0
when it is not defined anywhere else.
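
The reason the typo went unnoticed can be reproduced in a few lines. The
sketch below assumes the fallback is an #ifndef guard of the kind the
message above attributes to pte-common.h; the helper pte_endian_bits() is
invented for the example.

```c
#include <stdint.h>

/* The misspelled name, with the value shown in the diff below. */
#define _PAGE_LENDIAN 0x08

/* Assumed form of the generic fallback: define to 0 if nobody defined it. */
#ifndef _PAGE_ENDIAN
#define _PAGE_ENDIAN 0
#endif

/* Hypothetical helper: extract the endian attribute from a pte value. */
static inline uint32_t pte_endian_bits(uint32_t pte)
{
    /* Compiles fine, but always yields 0 because the fallback won. */
    return pte & _PAGE_ENDIAN;
}
```

Because the fallback silently supplies 0, code testing _PAGE_ENDIAN builds
without warnings and simply never sees the bit.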

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v1-v2
 - no change

 arch/powerpc/include/asm/pte-book3e.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/pte-book3e.h 
b/arch/powerpc/include/asm/pte-book3e.h
index 0156702..576ad88 100644
--- a/arch/powerpc/include/asm/pte-book3e.h
+++ b/arch/powerpc/include/asm/pte-book3e.h
@@ -40,7 +40,7 @@
 #define _PAGE_U1   0x01
 #define _PAGE_U0   0x02
 #define _PAGE_ACCESSED 0x04
-#define _PAGE_LENDIAN  0x08
+#define _PAGE_ENDIAN   0x08
 #define _PAGE_GUARDED  0x10
 #define _PAGE_COHERENT 0x20 /* M: enforce memory coherence */
 #define _PAGE_NO_CACHE 0x40 /* I: cache inhibit */
-- 
1.7.0.4



[PATCH 2/6 v2] kvm: powerpc: allow guest control E attribute in mas2

2013-08-01 Thread Bharat Bhushan

The E bit in MAS2 indicates whether the page is accessed
in little-endian or big-endian byte order.
There is no reason to stop the guest from setting E, so allow it.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v1-v2
 - no change

 arch/powerpc/kvm/e500.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index c2e5e98..277cb18 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct 
kvm_vcpu *vcpu)
 #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW)
 #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW)
 #define MAS2_ATTRIB_MASK \
- (MAS2_X0 | MAS2_X1)
+ (MAS2_X0 | MAS2_X1 | MAS2_E)
 #define MAS3_ATTRIB_MASK \
  (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \
   | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK)
-- 
1.7.0.4



[PATCH 3/6 v2] kvm: powerpc: allow guest control G attribute in mas2

2013-08-01 Thread Bharat Bhushan

The G bit in MAS2 indicates whether the page is Guarded.
There is no reason to stop the guest from setting G, so allow it.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v1-v2
 - no change

 arch/powerpc/kvm/e500.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index 277cb18..4fd9650 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct 
kvm_vcpu *vcpu)
 #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW)
 #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW)
 #define MAS2_ATTRIB_MASK \
- (MAS2_X0 | MAS2_X1 | MAS2_E)
+ (MAS2_X0 | MAS2_X1 | MAS2_E | MAS2_G)
 #define MAS3_ATTRIB_MASK \
  (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \
   | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK)
-- 
1.7.0.4



[PATCH 4/6 v2] powerpc: move linux pte/hugepte search to more generic file

2013-08-01 Thread Bharat Bhushan

Linux pte search functions find_linux_pte_or_hugepte() and
find_linux_pte() have nothing specific to 64-bit anymore,
so they are moved from pgtable-ppc64.h to asm/pgtable.h.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v1-v2
 - This is a new change in this version

 arch/powerpc/include/asm/pgtable-ppc64.h |   36 -
 arch/powerpc/include/asm/pgtable.h   |   37 ++
 2 files changed, 37 insertions(+), 36 deletions(-)

diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h 
b/arch/powerpc/include/asm/pgtable-ppc64.h
index e3d55f6..d257d98 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -340,42 +340,6 @@ static inline void __ptep_set_access_flags(pte_t *ptep, 
pte_t entry)
 void pgtable_cache_add(unsigned shift, void (*ctor)(void *));
 void pgtable_cache_init(void);
 
-/*
- * find_linux_pte returns the address of a linux pte for a given
- * effective address and directory.  If not found, it returns zero.
- */
-static inline pte_t *find_linux_pte(pgd_t *pgdir, unsigned long ea)
-{
-   pgd_t *pg;
-   pud_t *pu;
-   pmd_t *pm;
-   pte_t *pt = NULL;
-
-   pg = pgdir + pgd_index(ea);
-   if (!pgd_none(*pg)) {
-   pu = pud_offset(pg, ea);
-   if (!pud_none(*pu)) {
-   pm = pmd_offset(pu, ea);
-   if (pmd_present(*pm))
-   pt = pte_offset_kernel(pm, ea);
-   }
-   }
-   return pt;
-}
-
-#ifdef CONFIG_HUGETLB_PAGE
-pte_t *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
-unsigned *shift);
-#else
-static inline pte_t *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
-  unsigned *shift)
-{
-   if (shift)
-   *shift = 0;
-   return find_linux_pte(pgdir, ea);
-}
-#endif /* !CONFIG_HUGETLB_PAGE */
-
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_PGTABLE_PPC64_H_ */
diff --git a/arch/powerpc/include/asm/pgtable.h 
b/arch/powerpc/include/asm/pgtable.h
index b6293d2..690c8c2 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -217,6 +217,43 @@ extern int gup_hugepd(hugepd_t *hugepd, unsigned pdshift, 
unsigned long addr,
 
 extern int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
   unsigned long end, int write, struct page **pages, int 
*nr);
+
+/*
+ * find_linux_pte returns the address of a linux pte for a given
+ * effective address and directory.  If not found, it returns zero.
+ */
+static inline pte_t *find_linux_pte(pgd_t *pgdir, unsigned long ea)
+{
+   pgd_t *pg;
+   pud_t *pu;
+   pmd_t *pm;
+   pte_t *pt = NULL;
+
+   pg = pgdir + pgd_index(ea);
+   if (!pgd_none(*pg)) {
+   pu = pud_offset(pg, ea);
+   if (!pud_none(*pu)) {
+   pm = pmd_offset(pu, ea);
+   if (pmd_present(*pm))
+   pt = pte_offset_kernel(pm, ea);
+   }
+   }
+   return pt;
+}
+
+#ifdef CONFIG_HUGETLB_PAGE
+pte_t *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
+unsigned *shift);
+#else
+static inline pte_t *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
+  unsigned *shift)
+{
+   if (shift)
+   *shift = 0;
+   return find_linux_pte(pgdir, ea);
+}
+#endif /* !CONFIG_HUGETLB_PAGE */
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __KERNEL__ */
-- 
1.7.0.4



[PATCH 5/6 v2] kvm: powerpc: booke: Add linux pte lookup like booke3s

2013-08-01 Thread Bharat Bhushan

KVM needs to look up the linux pte to get the TLB attributes (WIMGE).
This is similar to what book3s does.
This will be used in follow-up patches.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v1-v2
 - This is a new change in this version

 arch/powerpc/include/asm/kvm_booke.h |   73 ++
 1 files changed, 73 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_booke.h 
b/arch/powerpc/include/asm/kvm_booke.h
index d3c1eb3..903624d 100644
--- a/arch/powerpc/include/asm/kvm_booke.h
+++ b/arch/powerpc/include/asm/kvm_booke.h
@@ -102,4 +102,77 @@ static inline ulong kvmppc_get_msr(struct kvm_vcpu *vcpu)
 {
return vcpu->arch.shared->msr;
 }
+
+/*
+ * Lock and read a linux PTE.  If it's present and writable, atomically
+ * set dirty and referenced bits and return the PTE, otherwise return 0.
+ */
+static inline pte_t kvmppc_read_update_linux_pte(pte_t *p, int writing)
+{
+   pte_t pte;
+
+#ifdef PTE_ATOMIC_UPDATES
+   pte_t tmp;
+/* wait until _PAGE_BUSY is clear then set it atomically */
+#ifdef CONFIG_PPC64
+   __asm__ __volatile__ (
+   "1: ldarx   %0,0,%3\n"
+   "   andi.   %1,%0,%4\n"
+   "   bne-    1b\n"
+   "   ori     %1,%0,%4\n"
+   "   stdcx.  %1,0,%3\n"
+   "   bne-    1b"
+   : "=&r" (pte), "=&r" (tmp), "=m" (*p)
+   : "r" (p), "i" (_PAGE_BUSY)
+   : "cc");
+#else
+__asm__ __volatile__ (
+"1: lwarx   %0,0,%3\n"
+"   andi.   %1,%0,%4\n"
+"   bne-    1b\n"
+"   ori     %1,%0,%4\n"
+"   stwcx.  %1,0,%3\n"
+"   bne-    1b"
+: "=&r" (pte), "=&r" (tmp), "=m" (*p)
+: "r" (p), "i" (_PAGE_BUSY)
+: "cc");
+#endif
+#else
+   pte = pte_val(*p);
+#endif
+
+   if (pte_present(pte)) {
+   pte = pte_mkyoung(pte);
+   if (writing && pte_write(pte))
+   pte = pte_mkdirty(pte);
+   }
+
+   *p = pte;   /* clears _PAGE_BUSY */
+
+   return pte;
+}
+
+static inline pte_t lookup_linux_pte(pgd_t *pgdir, unsigned long hva,
+ int writing, unsigned long *pte_sizep)
+{
+   pte_t *ptep;
+   unsigned long ps = *pte_sizep;
+   unsigned int shift;
+
+   ptep = find_linux_pte_or_hugepte(pgdir, hva, &shift);
+   if (!ptep)
+   return __pte(0);
+   if (shift)
+   *pte_sizep = 1ul << shift;
+   else
+   *pte_sizep = PAGE_SIZE;
+
+   if (ps > *pte_sizep)
+   return __pte(0);
+   if (!pte_present(*ptep))
+   return __pte(0);
+
+   return kvmppc_read_update_linux_pte(ptep, writing);
+}
+
 #endif /* __ASM_KVM_BOOKE_H__ */
-- 
1.7.0.4



[PATCH 6/6 v2] kvm: powerpc: use caching attributes as per linux pte

2013-08-01 Thread Bharat Bhushan

KVM now uses the same WIM tlb attributes as the corresponding qemu pte.
For this we search the linux pte for the requested page and
get the caching/coherency attributes from that pte.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v1-v2
 - Use Linux pte for wimge rather than RAM/no-RAM mechanism

 arch/powerpc/include/asm/kvm_host.h |2 +-
 arch/powerpc/kvm/booke.c|2 +-
 arch/powerpc/kvm/e500.h |8 +---
 arch/powerpc/kvm/e500_mmu_host.c|   31 ++-
 4 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 3328353..583d405 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -535,6 +535,7 @@ struct kvm_vcpu_arch {
 #endif
gpa_t paddr_accessed;
gva_t vaddr_accessed;
+   pgd_t *pgdir;
 
u8 io_gpr; /* GPR used as IO source/target */
u8 mmio_is_bigendian;
@@ -592,7 +593,6 @@ struct kvm_vcpu_arch {
struct list_head run_list;
struct task_struct *run_task;
struct kvm_run *kvm_run;
-   pgd_t *pgdir;
 
spinlock_t vpa_update_lock;
struct kvmppc_vpa vpa;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 17722d8..eb2 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -697,7 +697,7 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
 #endif
 
kvmppc_fix_ee_before_entry();
-
+   vcpu->arch.pgdir = current->mm->pgd;
ret = __kvmppc_vcpu_run(kvm_run, vcpu);
 
/* No need for kvm_guest_exit. It's done in handle_exit.
diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index 4fd9650..fc4b2f6 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -31,11 +31,13 @@ enum vcpu_ftr {
 #define E500_TLB_NUM   2
 
 /* entry is mapped somewhere in host TLB */
-#define E500_TLB_VALID (1 << 0)
+#define E500_TLB_VALID (1 << 31)
 /* TLB1 entry is mapped by host TLB1, tracked by bitmaps */
-#define E500_TLB_BITMAP (1 << 1)
+#define E500_TLB_BITMAP (1 << 30)
 /* TLB1 entry is mapped by host TLB0 */
-#define E500_TLB_TLB0  (1 << 2)
+#define E500_TLB_TLB0  (1 << 29)
+/* Lower 5 bits have WIMGE value */
+#define E500_TLB_WIMGE_MASK (0x1f)
 
 struct tlbe_ref {
pfn_t pfn;  /* valid only for TLB0, except briefly */
diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..9b10b0b 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,15 +64,6 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int 
usermode)
return mas3;
 }
 
-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
-{
-#ifdef CONFIG_SMP
-   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
-#else
-   return mas2 & MAS2_ATTRIB_MASK;
-#endif
-}
-
 /*
  * writing shadow tlb entry to host TLB
  */
@@ -248,10 +239,12 @@ static inline int tlbe_is_writable(struct 
kvm_book3e_206_tlb_entry *tlbe)
 
 static inline void kvmppc_e500_ref_setup(struct tlbe_ref *ref,
 struct kvm_book3e_206_tlb_entry *gtlbe,
-pfn_t pfn)
+pfn_t pfn, int wimg)
 {
ref->pfn = pfn;
ref->flags |= E500_TLB_VALID;
+   /* Use guest supplied MAS2_G and MAS2_E */
+   ref->flags |= (gtlbe->mas2 & MAS2_ATTRIB_MASK) | wimg;
 
if (tlbe_is_writable(gtlbe))
kvm_set_pfn_dirty(pfn);
@@ -312,8 +305,7 @@ static void kvmppc_e500_setup_stlbe(
 
/* Force IPROT=0 for all guest mappings. */
stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
-   stlbe->mas2 = (gvaddr & MAS2_EPN) |
- e500_shadow_mas2_attrib(gtlbe->mas2, pr);
+   stlbe->mas2 = (gvaddr & MAS2_EPN) | (ref->flags & E500_TLB_WIMGE_MASK);
stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
@@ -332,6 +324,8 @@ static inline int kvmppc_e500_shadow_map(struct 
kvmppc_vcpu_e500 *vcpu_e500,
unsigned long hva;
int pfnmap = 0;
int tsize = BOOK3E_PAGESZ_4K;
+   pte_t pte;
+   int wimg = 0;
 
/*
 * Translate guest physical to true physical, acquiring
@@ -437,6 +431,8 @@ static inline int kvmppc_e500_shadow_map(struct 
kvmppc_vcpu_e500 *vcpu_e500,
 
if (likely(!pfnmap)) {
unsigned long tsize_pages = 1 << (tsize + 10 - PAGE_SHIFT);
+   pgd_t *pgdir;
+
pfn = gfn_to_pfn_memslot(slot, gfn);
if (is_error_noslot_pfn(pfn)) {
printk(KERN_ERR "Couldn't get real page for gfn %lx!\n",
@@ -447,9 +443,18 @@ static inline int kvmppc_e500_shadow_map(struct 
kvmppc_vcpu_e500 *vcpu_e500,
/* Align guest

[PATCH 1/4] powerpc: book3e: _PAGE_LENDIAN must be _PAGE_ENDIAN

2013-07-25 Thread Bharat Bhushan

For book3e, _PAGE_ENDIAN is not defined. In fact, what is defined
is _PAGE_LENDIAN, which is wrong and should be _PAGE_ENDIAN.
There are no compilation errors because
arch/powerpc/include/asm/pte-common.h defines _PAGE_ENDIAN to 0
when it is not defined anywhere else.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/include/asm/pte-book3e.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/pte-book3e.h 
b/arch/powerpc/include/asm/pte-book3e.h
index 0156702..576ad88 100644
--- a/arch/powerpc/include/asm/pte-book3e.h
+++ b/arch/powerpc/include/asm/pte-book3e.h
@@ -40,7 +40,7 @@
 #define _PAGE_U1   0x01
 #define _PAGE_U0   0x02
 #define _PAGE_ACCESSED 0x04
-#define _PAGE_LENDIAN  0x08
+#define _PAGE_ENDIAN   0x08
 #define _PAGE_GUARDED  0x10
 #define _PAGE_COHERENT 0x20 /* M: enforce memory coherence */
 #define _PAGE_NO_CACHE 0x40 /* I: cache inhibit */
-- 
1.7.0.4



[PATCH 3/4] kvm: powerpc: allow guest control G attribute in mas2

2013-07-25 Thread Bharat Bhushan

The G bit in MAS2 indicates whether the page is Guarded.
There is no reason to stop the guest from setting G, so allow it.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/kvm/e500.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index 277cb18..4fd9650 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct 
kvm_vcpu *vcpu)
 #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW)
 #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW)
 #define MAS2_ATTRIB_MASK \
- (MAS2_X0 | MAS2_X1 | MAS2_E)
+ (MAS2_X0 | MAS2_X1 | MAS2_E | MAS2_G)
 #define MAS3_ATTRIB_MASK \
  (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \
   | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK)
-- 
1.7.0.4



[PATCH 2/4] kvm: powerpc: allow guest control E attribute in mas2

2013-07-25 Thread Bharat Bhushan

The E bit in MAS2 indicates whether the page is accessed
in little-endian or big-endian byte order.
There is no reason to stop the guest from setting E, so allow it.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/kvm/e500.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index c2e5e98..277cb18 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct 
kvm_vcpu *vcpu)
 #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW)
 #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW)
 #define MAS2_ATTRIB_MASK \
- (MAS2_X0 | MAS2_X1)
+ (MAS2_X0 | MAS2_X1 | MAS2_E)
 #define MAS3_ATTRIB_MASK \
  (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \
   | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK)
-- 
1.7.0.4



[PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages

2013-07-25 Thread Bharat Bhushan

If the page is RAM, map it as cacheable and coherent (set the M bit);
otherwise treat the page as I/O and map it as cache inhibited
and guarded (set I + G).

This helps set up the proper MMU mapping for a directly assigned device.

NOTE: There can be devices that require a cacheable mapping, which is not yet
supported.
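
The decision described above can be written as a small standalone
function. This is only a sketch: the is_mmio and smp parameters stand in
for the kernel's own checks, the function name is invented, and the
numeric MAS2_* values are assumptions for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit values, for illustration only. */
#define MAS2_X0 0x40u
#define MAS2_X1 0x20u
#define MAS2_W  0x10u
#define MAS2_I  0x08u
#define MAS2_M  0x04u
#define MAS2_G  0x02u
#define MAS2_E  0x01u
#define MAS2_ATTRIB_MASK (MAS2_X0 | MAS2_X1 | MAS2_E | MAS2_G)

/* Hypothetical helper: pick shadow MAS2 attributes for RAM vs. I/O pages. */
static inline uint32_t shadow_mas2_attrib(uint32_t guest_mas2, bool is_mmio,
                                          bool smp)
{
    uint32_t attr = guest_mas2 & MAS2_ATTRIB_MASK;

    /* Not RAM: treat as I/O, cache inhibited and guarded. */
    if (is_mmio)
        return attr | MAS2_I | MAS2_G;

    /* RAM: cacheable (I stays clear), and coherent on SMP. */
    if (smp)
        attr |= MAS2_M;
    return attr;
}
```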

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/kvm/e500_mmu_host.c |   24 +++-
 1 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..5cbdc8f 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,27 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode)
return mas3;
 }
 
-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
+   u32 mas2_attr;
+
+   mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+   if (kvm_is_mmio_pfn(pfn)) {
+   /*
+* If page is not RAM then it is treated as I/O page.
+* Map it with cache inhibited and guarded (set I + G).
+*/
+   mas2_attr |= MAS2_I | MAS2_G;
+   return mas2_attr;
+   }
+
+   /* Map RAM pages as cacheable (Not setting I in MAS2) */
 #ifdef CONFIG_SMP
-   return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
-#else
-   return mas2 & MAS2_ATTRIB_MASK;
+   /* Also map as coherent (set M) in SMP */
+   mas2_attr |= MAS2_M;
 #endif
+   return mas2_attr;
 }
 
 /*
@@ -313,7 +327,7 @@ static void kvmppc_e500_setup_stlbe(
/* Force IPROT=0 for all guest mappings. */
stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
stlbe->mas2 = (gvaddr & MAS2_EPN) |
- e500_shadow_mas2_attrib(gtlbe->mas2, pr);
+ e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
-- 
1.7.0.4



[PATCH 0/2] powerpc: allow kvm to use kernel debug framework

2013-07-04 Thread Bharat Bhushan

From: Bharat Bhushan bharat.bhus...@freescale.com

This patchset moves the debug registers into a structure, which allows
KVM to use the same structure for debug emulation.

Note: An earlier patchset of six patches was sent:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2013-June/108132.html
That patchset is now divided into two parts:
1) powerpc-specific changes (these 2 patches contain those changes)
2) KVM-specific changes (to be sent as separate patches against the agraf
repository)
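
A minimal sketch of the idea, under simplifying assumptions: once the debug registers live in one struct, a thread and a vcpu can hold the same type and a whole context can be copied with one assignment. The field list is heavily reduced, and the names thread_struct_sketch, vcpu_arch_sketch and load_debug_from_vcpu() are hypothetical.

```c
#include <stdint.h>

/* Reduced stand-in for the debug register structure. */
struct debug_reg {
    uint32_t dbcr0;
    uint32_t dbcr1;
    unsigned long iac1, iac2;
    unsigned long dac1, dac2;
};

/* Hypothetical, reduced thread state: embeds the shared type. */
struct thread_struct_sketch {
    unsigned long ksp;
    struct debug_reg debug;
};

/* Hypothetical, reduced vcpu state: reuses the very same type. */
struct vcpu_arch_sketch {
    struct debug_reg dbg_reg;
};

/* Because both sides use struct debug_reg, copying a whole debug
 * context is a single structure assignment. */
static void load_debug_from_vcpu(struct thread_struct_sketch *t,
                                 const struct vcpu_arch_sketch *v)
{
    t->debug = v->dbg_reg;
}
```

Accessors then change from thread.dbcr0 to thread.debug.dbcr0, which is the mechanical part of the second patch.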

Bharat Bhushan (2):
  powerpc: remove unnecessary line continuations
  powerpc: move debug registers in a structure

 arch/powerpc/include/asm/processor.h |   38 +
 arch/powerpc/include/asm/reg_booke.h |8 +-
 arch/powerpc/kernel/asm-offsets.c|2 +-
 arch/powerpc/kernel/process.c|   42 +-
 arch/powerpc/kernel/ptrace.c |  154 +-
 arch/powerpc/kernel/ptrace32.c   |2 +-
 arch/powerpc/kernel/signal_32.c  |6 +-
 arch/powerpc/kernel/traps.c  |   35 
 8 files changed, 147 insertions(+), 140 deletions(-)



[PATCH 1/2] powerpc: remove unnecessary line continuations

2013-07-04 Thread Bharat Bhushan

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/kernel/process.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index c517dbe..19b8733 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -325,7 +325,7 @@ static void set_debug_reg_defaults(struct thread_struct *thread)
/*
 * Force User/Supervisor bits to b11 (user-only MSR[PR]=1)
 */
-   thread->dbcr1 = DBCR1_IAC1US | DBCR1_IAC2US |   \
+   thread->dbcr1 = DBCR1_IAC1US | DBCR1_IAC2US |
DBCR1_IAC3US | DBCR1_IAC4US;
/*
 * Force Data Address Compare User/Supervisor bits to be User-only
-- 
1.7.0.4



[PATCH 2/2] powerpc: move debug registers in a structure

2013-07-04 Thread Bharat Bhushan

This way the same struct data type can be shared with KVM, which
also helps in reusing other debug-related functions.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/include/asm/processor.h |   38 +
 arch/powerpc/include/asm/reg_booke.h |8 +-
 arch/powerpc/kernel/asm-offsets.c|2 +-
 arch/powerpc/kernel/process.c|   42 +-
 arch/powerpc/kernel/ptrace.c |  154 +-
 arch/powerpc/kernel/ptrace32.c   |2 +-
 arch/powerpc/kernel/signal_32.c  |6 +-
 arch/powerpc/kernel/traps.c  |   35 
 8 files changed, 147 insertions(+), 140 deletions(-)

diff --git a/arch/powerpc/include/asm/processor.h 
b/arch/powerpc/include/asm/processor.h
index 47a35b0..9e9aa26 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -147,22 +147,7 @@ typedef struct {
 #define TS_FPR(i) fpr[i][TS_FPROFFSET]
 #define TS_TRANS_FPR(i) transact_fpr[i][TS_FPROFFSET]
 
-struct thread_struct {
-   unsigned long   ksp;        /* Kernel stack pointer */
-   unsigned long   ksp_limit;  /* if ksp <= ksp_limit stack overflow */
-
-#ifdef CONFIG_PPC64
-   unsigned long   ksp_vsid;
-#endif
-   struct pt_regs  *regs;      /* Pointer to saved register state */
-   mm_segment_t    fs;         /* for get_fs() validation */
-#ifdef CONFIG_BOOKE
-   /* BookE base exception scratch space; align on cacheline */
-   unsigned long   normsave[8] ____cacheline_aligned;
-#endif
-#ifdef CONFIG_PPC32
-   void            *pgdir;     /* root of page-table tree */
-#endif
+struct debug_reg {
 #ifdef CONFIG_PPC_ADV_DEBUG_REGS
/*
 * The following help to manage the use of Debug Control Registers
@@ -199,6 +184,27 @@ struct thread_struct {
unsigned long   dvc2;
 #endif
 #endif
+};
+
+struct thread_struct {
+   unsigned long   ksp;        /* Kernel stack pointer */
+   unsigned long   ksp_limit;  /* if ksp <= ksp_limit stack overflow */
+
+#ifdef CONFIG_PPC64
+   unsigned long   ksp_vsid;
+#endif
+   struct pt_regs  *regs;      /* Pointer to saved register state */
+   mm_segment_t    fs;         /* for get_fs() validation */
+#ifdef CONFIG_BOOKE
+   /* BookE base exception scratch space; align on cacheline */
+   unsigned long   normsave[8] ____cacheline_aligned;
+#endif
+#ifdef CONFIG_PPC32
+   void            *pgdir;     /* root of page-table tree */
+#endif
+   /* Debug Registers */
+   struct debug_reg debug;
+
/* FP and VSX 0-31 register set */
double  fpr[32][TS_FPRWIDTH] __attribute__((aligned(16)));
struct {
diff --git a/arch/powerpc/include/asm/reg_booke.h 
b/arch/powerpc/include/asm/reg_booke.h
index b417de3..455dc89 100644
--- a/arch/powerpc/include/asm/reg_booke.h
+++ b/arch/powerpc/include/asm/reg_booke.h
@@ -381,7 +381,7 @@
 #define DBCR0_IA34T0x4000  /* Instr Addr 3-4 range Toggle */
 #define DBCR0_FT   0x0001  /* Freeze Timers on debug event */
 
-#define dbcr_iac_range(task)   ((task)->thread.dbcr0)
+#define dbcr_iac_range(task)   ((task)->thread.debug.dbcr0)
 #define DBCR_IAC12IDBCR0_IA12  /* Range Inclusive */
 #define DBCR_IAC12X(DBCR0_IA12 | DBCR0_IA12X)  /* Range Exclusive */
 #define DBCR_IAC12MODE (DBCR0_IA12 | DBCR0_IA12X)  /* IAC 1-2 Mode Bits */
@@ -395,7 +395,7 @@
 #define DBCR1_DAC1W0x2000  /* DAC1 Write Debug Event */
 #define DBCR1_DAC2W0x1000  /* DAC2 Write Debug Event */
 
-#define dbcr_dac(task) ((task)->thread.dbcr1)
+#define dbcr_dac(task) ((task)->thread.debug.dbcr1)
 #define DBCR_DAC1R DBCR1_DAC1R
 #define DBCR_DAC1W DBCR1_DAC1W
 #define DBCR_DAC2R DBCR1_DAC2R
@@ -441,7 +441,7 @@
 #define DBCR0_CRET 0x0020  /* Critical Return Debug Event */
 #define DBCR0_FT   0x0001  /* Freeze Timers on debug event */
 
-#define dbcr_dac(task) ((task)->thread.dbcr0)
+#define dbcr_dac(task) ((task)->thread.debug.dbcr0)
 #define DBCR_DAC1R DBCR0_DAC1R
 #define DBCR_DAC1W DBCR0_DAC1W
 #define DBCR_DAC2R DBCR0_DAC2R
@@ -475,7 +475,7 @@
 #define DBCR1_IAC34MX  0x00C0  /* Instr Addr 3-4 range eXclusive */
 #define DBCR1_IAC34AT  0x0001  /* Instr Addr 3-4 range Toggle */
 
-#define dbcr_iac_range(task)   ((task)->thread.dbcr1)
+#define dbcr_iac_range(task)   ((task)->thread.debug.dbcr1)
 #define DBCR_IAC12IDBCR1_IAC12M/* Range Inclusive */
 #define DBCR_IAC12XDBCR1_IAC12MX   /* Range Exclusive */
 #define DBCR_IAC12MODE DBCR1_IAC12MX   /* IAC 1-2 Mode Bits */
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index c7e8afc..d56727c 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -113,7 +113,7 @@ int main(void)
 #endif /* CONFIG_SPE */
 #endif /* CONFIG_PPC64 */
 #if defined(CONFIG_4xx) || defined(CONFIG_BOOKE

[PATCH 2/4 v6] KVM: PPC: exit to user space on ehpriv 1 instruction

2013-07-04 Thread Bharat Bhushan

The ehpriv 1 instruction is used by user space for setting software
breakpoints. This patch adds support for exiting to user space
with run->debug holding the relevant information.

As this is the first point at which run->debug is used, this patch
also defines the run->debug structure.
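
The encoding can be sketched as follows. The base opcode, shift and OC value are taken from the defines in this patch; EHPRIV_OC_MASK and the helpers ehpriv_encode() and is_debug_breakpoint() are hypothetical names added only for this example.

```c
#include <stdint.h>

#define KVMPPC_INST_EHPRIV 0x7c00021cu /* ehpriv with OC = 0 */
#define EHPRIV_OC_SHIFT    11
#define EHPRIV_OC_DEBUG    1u
#define EHPRIV_OC_MASK     0x7fffu     /* hypothetical name for the OC field width */

/* Build an ehpriv instruction word carrying the given OC value. */
static inline uint32_t ehpriv_encode(uint32_t oc)
{
    return KVMPPC_INST_EHPRIV | ((oc & EHPRIV_OC_MASK) << EHPRIV_OC_SHIFT);
}

/* Extract the OC field, as get_oc() in the patch does. */
static inline uint32_t get_oc(uint32_t inst)
{
    return (inst >> EHPRIV_OC_SHIFT) & EHPRIV_OC_MASK;
}

/* Hypothetical check: is this word "ehpriv 1", the software breakpoint? */
static inline int is_debug_breakpoint(uint32_t inst)
{
    uint32_t base = inst & ~(EHPRIV_OC_MASK << EHPRIV_OC_SHIFT);

    return base == KVMPPC_INST_EHPRIV && get_oc(inst) == EHPRIV_OC_DEBUG;
}
```

User space would plant the word produced by ehpriv_encode(EHPRIV_OC_DEBUG) at the breakpoint address, and the emulation path dispatches on get_oc().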

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v5->v6
 - using ehpriv 1 instead of ehpriv for software breakpoint

 arch/powerpc/include/asm/disassemble.h |4 
 arch/powerpc/include/asm/kvm_booke.h   |7 ++-
 arch/powerpc/include/uapi/asm/kvm.h|   21 +
 arch/powerpc/kvm/booke.c   |2 +-
 arch/powerpc/kvm/e500_emulate.c|   26 ++
 5 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/disassemble.h 
b/arch/powerpc/include/asm/disassemble.h
index 9b198d1..856f8de 100644
--- a/arch/powerpc/include/asm/disassemble.h
+++ b/arch/powerpc/include/asm/disassemble.h
@@ -77,4 +77,8 @@ static inline unsigned int get_d(u32 inst)
return inst & 0xffff;
 }
 
+static inline unsigned int get_oc(u32 inst)
+{
+   return (inst >> 11) & 0x7fff;
+}
 #endif /* __ASM_PPC_DISASSEMBLE_H__ */
diff --git a/arch/powerpc/include/asm/kvm_booke.h 
b/arch/powerpc/include/asm/kvm_booke.h
index d3c1eb3..dd8f615 100644
--- a/arch/powerpc/include/asm/kvm_booke.h
+++ b/arch/powerpc/include/asm/kvm_booke.h
@@ -26,7 +26,12 @@
 /* LPIDs we support with this build -- runtime limit may be lower */
 #define KVMPPC_NR_LPIDS64
 
-#define KVMPPC_INST_EHPRIV 0x7c00021c
+#define KVMPPC_INST_EHPRIV     0x7c00021c
+#define EHPRIV_OC_SHIFT        11
+/* ehpriv 1 : ehpriv with OC = 1 is used for debug emulation */
+#define EHPRIV_OC_DEBUG        1
+#define KVMPPC_INST_EHPRIV_DEBUG   (KVMPPC_INST_EHPRIV | \
+                    (EHPRIV_OC_DEBUG << EHPRIV_OC_SHIFT))
 
 static inline void kvmppc_set_gpr(struct kvm_vcpu *vcpu, int num, ulong val)
 {
diff --git a/arch/powerpc/include/uapi/asm/kvm.h 
b/arch/powerpc/include/uapi/asm/kvm.h
index 0fb1a6e..ded0607 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -269,7 +269,24 @@ struct kvm_fpu {
__u64 fpr[32];
 };
 
+/*
+ * Defines for h/w breakpoint, watchpoint (read, write or both) and
+ * software breakpoint.
+ * These are used as type in KVM_SET_GUEST_DEBUG ioctl and status
+ * for KVM_DEBUG_EXIT.
+ */
+#define KVMPPC_DEBUG_NONE          0x0
+#define KVMPPC_DEBUG_BREAKPOINT    (1UL << 1)
+#define KVMPPC_DEBUG_WATCH_WRITE   (1UL << 2)
+#define KVMPPC_DEBUG_WATCH_READ    (1UL << 3)
 struct kvm_debug_exit_arch {
+   __u64 address;
+   /*
+* exiting to userspace because of h/w breakpoint, watchpoint
+* (read, write or both) and software breakpoint.
+*/
+   __u32 status;
+   __u32 reserved;
 };
 
 /* for KVM_SET_GUEST_DEBUG */
@@ -281,10 +298,6 @@ struct kvm_guest_debug_arch {
 * Type denotes h/w breakpoint, read watchpoint, write
 * watchpoint or watchpoint (both read and write).
 */
-#define KVMPPC_DEBUG_NONE          0x0
-#define KVMPPC_DEBUG_BREAKPOINT    (1UL << 1)
-#define KVMPPC_DEBUG_WATCH_WRITE   (1UL << 2)
-#define KVMPPC_DEBUG_WATCH_READ    (1UL << 3)
__u32 type;
__u32 reserved;
} bp[16];
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 62d4ece..4c9f6ad 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -1460,7 +1460,7 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
val = get_reg_val(reg->id, vcpu->arch.tsr);
break;
case KVM_REG_PPC_DEBUG_INST:
-   val = get_reg_val(reg->id, KVMPPC_INST_EHPRIV);
+   val = get_reg_val(reg->id, KVMPPC_INST_EHPRIV_DEBUG);
break;
default:
r = kvmppc_get_one_reg(vcpu, reg->id, &val);
diff --git a/arch/powerpc/kvm/e500_emulate.c b/arch/powerpc/kvm/e500_emulate.c
index b10a012..6163a03 100644
--- a/arch/powerpc/kvm/e500_emulate.c
+++ b/arch/powerpc/kvm/e500_emulate.c
@@ -26,6 +26,7 @@
 #define XOP_TLBRE   946
 #define XOP_TLBWE   978
 #define XOP_TLBILX  18
+#define XOP_EHPRIV  270
 
 #ifdef CONFIG_KVM_E500MC
 static int dbell2prio(ulong param)
@@ -82,6 +83,26 @@ static int kvmppc_e500_emul_msgsnd(struct kvm_vcpu *vcpu, 
int rb)
 }
 #endif
 
+static int kvmppc_e500_emul_ehpriv(struct kvm_run *run, struct kvm_vcpu *vcpu,
+  unsigned int inst, int *advance)
+{
+   int emulated = EMULATE_DONE;
+
+   switch (get_oc(inst)) {
+   case EHPRIV_OC_DEBUG:
+   run->exit_reason = KVM_EXIT_DEBUG;
+   run->debug.arch.address = vcpu->arch.pc;
+   run->debug.arch.status = 0;
+   kvmppc_account_exit(vcpu, DEBUG_EXITS

[PATCH 1/4 v6] powerpc: export debug registers save function for KVM

2013-07-04 Thread Bharat Bhushan

KVM needs this function when switching from a vcpu to a user-space
thread. A subsequent patch will use this function.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v5->v6
 - switch_booke_debug_regs() not guarded by the compiler switch

 arch/powerpc/include/asm/switch_to.h |1 +
 arch/powerpc/kernel/process.c|3 ++-
 2 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/switch_to.h 
b/arch/powerpc/include/asm/switch_to.h
index 200d763..db68f1d 100644
--- a/arch/powerpc/include/asm/switch_to.h
+++ b/arch/powerpc/include/asm/switch_to.h
@@ -29,6 +29,7 @@ extern void giveup_vsx(struct task_struct *);
 extern void enable_kernel_spe(void);
 extern void giveup_spe(struct task_struct *);
 extern void load_up_spe(struct task_struct *);
+extern void switch_booke_debug_regs(struct thread_struct *new_thread);
 
 #ifndef CONFIG_SMP
 extern void discard_lazy_cpu_state(void);
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 01ff496..da586aa 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -362,12 +362,13 @@ static void prime_debug_regs(struct thread_struct *thread)
  * debug registers, set the debug registers from the values
  * stored in the new thread.
  */
-static void switch_booke_debug_regs(struct thread_struct *new_thread)
+void switch_booke_debug_regs(struct thread_struct *new_thread)
 {
if ((current->thread.debug.dbcr0 & DBCR0_IDM)
|| (new_thread->debug.dbcr0 & DBCR0_IDM))
prime_debug_regs(new_thread);
 }
+EXPORT_SYMBOL_GPL(switch_booke_debug_regs);
 #else  /* !CONFIG_PPC_ADV_DEBUG_REGS */
 #ifndef CONFIG_HAVE_HW_BREAKPOINT
 static void set_debug_reg_defaults(struct thread_struct *thread)
-- 
1.7.0.4



[PATCH 3/4 v6] KVM: PPC: Using struct debug_reg

2013-07-04 Thread Bharat Bhushan

Make KVM also use the struct debug_reg defined in asm/processor.h.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v5->v6
 - no changes

 arch/powerpc/include/asm/kvm_host.h |   13 +
 arch/powerpc/kvm/booke.c|   34 --
 2 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index af326cd..838a577 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -381,17 +381,6 @@ struct kvmppc_slb {
 #define KVMPPC_EPR_USER    1 /* exit to userspace to fill EPR */
 #define KVMPPC_EPR_KERNEL  2 /* in-kernel irqchip */
 
-struct kvmppc_booke_debug_reg {
-   u32 dbcr0;
-   u32 dbcr1;
-   u32 dbcr2;
-#ifdef CONFIG_KVM_E500MC
-   u32 dbcr4;
-#endif
-   u64 iac[KVMPPC_BOOKE_MAX_IAC];
-   u64 dac[KVMPPC_BOOKE_MAX_DAC];
-};
-
 #define KVMPPC_IRQ_DEFAULT 0
 #define KVMPPC_IRQ_MPIC    1
 #define KVMPPC_IRQ_XICS    2
@@ -535,7 +524,7 @@ struct kvm_vcpu_arch {
u32 eptcfg;
u32 epr;
u32 crit_save;
-   struct kvmppc_booke_debug_reg dbg_reg;
+   struct debug_reg dbg_reg;
 #endif
gpa_t paddr_accessed;
gva_t vaddr_accessed;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 4c9f6ad..87aa727 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -1424,7 +1424,6 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
int r = 0;
union kvmppc_one_reg val;
int size;
-   long int i;
 
size = one_reg_size(reg->id);
if (size > sizeof(val))
@@ -1432,16 +1431,24 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
 
switch (reg->id) {
case KVM_REG_PPC_IAC1:
+   val = get_reg_val(reg->id, vcpu->arch.dbg_reg.iac1);
+   break;
case KVM_REG_PPC_IAC2:
+   val = get_reg_val(reg->id, vcpu->arch.dbg_reg.iac2);
+   break;
+#if CONFIG_PPC_ADV_DEBUG_IACS > 2
case KVM_REG_PPC_IAC3:
+   val = get_reg_val(reg->id, vcpu->arch.dbg_reg.iac3);
+   break;
case KVM_REG_PPC_IAC4:
-   i = reg->id - KVM_REG_PPC_IAC1;
-   val = get_reg_val(reg->id, vcpu->arch.dbg_reg.iac[i]);
+   val = get_reg_val(reg->id, vcpu->arch.dbg_reg.iac4);
break;
+#endif
case KVM_REG_PPC_DAC1:
+   val = get_reg_val(reg->id, vcpu->arch.dbg_reg.dac1);
+   break;
case KVM_REG_PPC_DAC2:
-   i = reg->id - KVM_REG_PPC_DAC1;
-   val = get_reg_val(reg->id, vcpu->arch.dbg_reg.dac[i]);
+   val = get_reg_val(reg->id, vcpu->arch.dbg_reg.dac2);
break;
case KVM_REG_PPC_EPR: {
u32 epr = get_guest_epr(vcpu);
@@ -1481,7 +1488,6 @@ int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
int r = 0;
union kvmppc_one_reg val;
int size;
-   long int i;
 
size = one_reg_size(reg->id);
if (size > sizeof(val))
@@ -1492,16 +1498,24 @@ int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
 
switch (reg->id) {
case KVM_REG_PPC_IAC1:
+   vcpu->arch.dbg_reg.iac1 = set_reg_val(reg->id, val);
+   break;
case KVM_REG_PPC_IAC2:
+   vcpu->arch.dbg_reg.iac2 = set_reg_val(reg->id, val);
+   break;
+#if CONFIG_PPC_ADV_DEBUG_IACS > 2
case KVM_REG_PPC_IAC3:
+   vcpu->arch.dbg_reg.iac3 = set_reg_val(reg->id, val);
+   break;
case KVM_REG_PPC_IAC4:
-   i = reg->id - KVM_REG_PPC_IAC1;
-   vcpu->arch.dbg_reg.iac[i] = set_reg_val(reg->id, val);
+   vcpu->arch.dbg_reg.iac4 = set_reg_val(reg->id, val);
break;
+#endif
case KVM_REG_PPC_DAC1:
+   vcpu->arch.dbg_reg.dac1 = set_reg_val(reg->id, val);
+   break;
case KVM_REG_PPC_DAC2:
-   i = reg->id - KVM_REG_PPC_DAC1;
-   vcpu->arch.dbg_reg.dac[i] = set_reg_val(reg->id, val);
+   vcpu->arch.dbg_reg.dac2 = set_reg_val(reg->id, val);
break;
case KVM_REG_PPC_EPR: {
u32 new_epr = set_reg_val(reg->id, val);
-- 
1.7.0.4



[PATCH 4/4 v6] KVM: PPC: Add userspace debug stub support

2013-07-04 Thread Bharat Bhushan

This patch adds debug stub support on booke/bookehv.
The QEMU debug stub can now use hardware breakpoints, watchpoints and
software breakpoints to debug the guest.

This is how the debug register context is saved and restored when
switching between guest, userspace and kernel user-process:

When QEMU is running
 - thread->debug_reg == QEMU debug register context.
 - The kernel handles switching the debug registers on context switch.
 - No vcpu_load() is called.

QEMU makes ioctls (except RUN)
 - This calls vcpu_load().
 - This should not change the context.
 - Some ioctls can change the vcpu debug registers; the context is saved in
   vcpu->debug_regs.

QEMU makes the RUN ioctl
 - Save thread->debug_reg on the stack.
 - Set thread->debug_reg = vcpu->debug_reg.
 - Load thread->debug_reg.
 - Run the VCPU (so the thread points to the vcpu context).

A context switch happens while the VCPU is running
 - vcpu_load() is called but should not load any context.
 - The kernel loads the vcpu context, as thread->debug_regs points to the
   vcpu context.

On heavyweight_exit
 - Load the context saved on the stack into thread->debug_reg.
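
The RUN-ioctl and heavyweight-exit steps above can be sketched as a small user-space simulation. Everything here is a stand-in chosen for illustration: hw_regs models the hardware debug registers, run_guest() models guest entry, and the struct and function names are hypothetical rather than the kernel's.

```c
/* Reduced stand-in for the debug register context. */
struct debug_reg {
    unsigned int dbcr0;
    unsigned long iac1;
};

struct thread_sketch { struct debug_reg debug; };
struct vcpu_sketch   { struct debug_reg shadow_dbg_reg; };

/* Models the hardware debug registers. */
static struct debug_reg hw_regs;

static void load_hw(const struct debug_reg *r)
{
    hw_regs = *r;
}

/* Models guest execution: reports the dbcr0 value the hardware holds. */
static unsigned int run_guest(void)
{
    return hw_regs.dbcr0;
}

static unsigned int vcpu_run_sketch(struct thread_sketch *cur,
                                    struct vcpu_sketch *vcpu)
{
    unsigned int seen;
    struct debug_reg saved = cur->debug; /* save thread context on the stack */

    cur->debug = vcpu->shadow_dbg_reg;   /* thread now describes the vcpu context */
    load_hw(&cur->debug);                /* load it into the "hardware" */

    seen = run_guest();                  /* guest runs with the vcpu context */

    cur->debug = saved;                  /* heavyweight exit: restore from stack */
    load_hw(&cur->debug);
    return seen;
}
```

Because the thread structure describes the vcpu context while the guest runs, an ordinary context switch in between would reload the right values without any KVM-specific hook.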

Currently we do not support debug resource emulation for the guest.
On a debug exception, always exit to user space, irrespective of
whether user space is expecting the debug exception. If this is an
unexpected exception (a breakpoint/watchpoint event not set by
userspace), leave the action to user space. This is similar to the
previous behavior; the only difference is that a proper exit state
is now available to user space.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
v5->v6
 - no changes

 arch/powerpc/include/asm/kvm_host.h |3 +
 arch/powerpc/include/uapi/asm/kvm.h |1 +
 arch/powerpc/kvm/booke.c|  239 ---
 arch/powerpc/kvm/booke.h|5 +
 4 files changed, 230 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 838a577..aeb490d 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -524,7 +524,10 @@ struct kvm_vcpu_arch {
u32 eptcfg;
u32 epr;
u32 crit_save;
+   /* guest debug registers*/
struct debug_reg dbg_reg;
+   /* hardware visible debug registers when in guest state */
+   struct debug_reg shadow_dbg_reg;
 #endif
gpa_t paddr_accessed;
gva_t vaddr_accessed;
diff --git a/arch/powerpc/include/uapi/asm/kvm.h 
b/arch/powerpc/include/uapi/asm/kvm.h
index ded0607..f5077c2 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -27,6 +27,7 @@
 #define __KVM_HAVE_PPC_SMT
 #define __KVM_HAVE_IRQCHIP
 #define __KVM_HAVE_IRQ_LINE
+#define __KVM_HAVE_GUEST_DEBUG
 
 struct kvm_regs {
__u64 pc;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 87aa727..7b54802 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -133,6 +133,29 @@ static void kvmppc_vcpu_sync_fpu(struct kvm_vcpu *vcpu)
 #endif
 }
 
+static void kvmppc_vcpu_sync_debug(struct kvm_vcpu *vcpu)
+{
+   /* Synchronize guest's desire to get debug interrupts into shadow MSR */
+#ifndef CONFIG_KVM_BOOKE_HV
+   vcpu->arch.shadow_msr &= ~MSR_DE;
+   vcpu->arch.shadow_msr |= vcpu->arch.shared->msr & MSR_DE;
+#endif
+
+   /* Force enable debug interrupts when user space wants to debug */
+   if (vcpu->guest_debug) {
+#ifdef CONFIG_KVM_BOOKE_HV
+   /*
+* Since there is no shadow MSR, sync MSR_DE into the guest
+* visible MSR.
+*/
+   vcpu->arch.shared->msr |= MSR_DE;
+#else
+   vcpu->arch.shadow_msr |= MSR_DE;
+   vcpu->arch.shared->msr &= ~MSR_DE;
+#endif
+   }
+}
+
 /*
  * Helper function for full MSR writes.  No need to call this if only
  * EE/CE/ME/DE/RI are changing.
@@ -150,6 +173,7 @@ void kvmppc_set_msr(struct kvm_vcpu *vcpu, u32 new_msr)
kvmppc_mmu_msr_notify(vcpu, old_msr);
kvmppc_vcpu_sync_spe(vcpu);
kvmppc_vcpu_sync_fpu(vcpu);
+   kvmppc_vcpu_sync_debug(vcpu);
 }
 
 static void kvmppc_booke_queue_irqprio(struct kvm_vcpu *vcpu,
@@ -655,6 +679,7 @@ int kvmppc_core_check_requests(struct kvm_vcpu *vcpu)
 int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
 {
int ret, s;
+   struct thread_struct thread;
 #ifdef CONFIG_PPC_FPU
unsigned int fpscr;
int fpexc_mode;
@@ -698,12 +723,21 @@ int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct 
kvm_vcpu *vcpu)
 
kvmppc_load_guest_fp(vcpu);
 #endif
+   /* Switch to guest debug context */
+   thread.debug = vcpu->arch.shadow_dbg_reg;
+   switch_booke_debug_regs(&thread);
+   thread.debug = current->thread.debug;
+   current->thread.debug = vcpu->arch.shadow_dbg_reg;
 
ret = __kvmppc_vcpu_run(kvm_run, vcpu);
 
/* No need for kvm_guest_exit. It's done in handle_exit.
   We also get here with interrupts enabled. */
 
+   /* Switch back to user space debug
