date:20170112

Re: [PATCH] powerpc: Use octal numbers for file permissions

2017-01-12 Thread Cyril Bur

On Thu, 2017-01-12 at 14:54 +1100, Russell Currey wrote:
> Symbolic macros are unintuitive and hard to read, whereas octal constants
> are much easier to interpret.  Replace macros for the basic permission
> flags (user/group/other read/write/execute) with numeric constants
> instead, across the whole powerpc tree.
> 
> Introducing a significant number of changes across the tree for no runtime
> benefit isn't exactly desirable, but so long as these macros are still
> used in the tree people will keep sending patches that add them.  Not only
> are they hard to parse at a glance, there are multiple ways of coming to
> the same value (as you can see with 0444 and 0644 in this patch) which
> hurts readability.
> 
> Signed-off-by: Russell Currey 

Reviewed-by: Cyril Bur 

> ---
> I wondered what a "S_IRUGO" was and subsequently found the following:
>   https://lwn.net/Articles/696229/
> so I figured making numeric constants standard across the tree was a good
> idea instead of the mix we currently have.
> 
> For new patches that come in, checkpatch warns when using something like
> S_IRUGO and tells you to use something like 0444 instead.

I wonder if both these points shouldn't actually be in the commit
message?

> ---
>  arch/powerpc/kernel/eeh_sysfs.c|  2 +-
>  arch/powerpc/kernel/iommu.c|  2 +-
>  arch/powerpc/kernel/proc_powerpc.c |  2 +-
>  arch/powerpc/kernel/rtas-proc.c| 14 +++---
>  arch/powerpc/kernel/rtas_flash.c   |  2 +-
>  arch/powerpc/kernel/rtasd.c|  2 +-
>  arch/powerpc/kernel/traps.c|  4 ++--
>  arch/powerpc/kvm/book3s_hv.c   | 10 --
>  arch/powerpc/kvm/book3s_xics.c |  2 +-
>  arch/powerpc/platforms/83xx/mcu_mpc8349emitx.c |  2 +-
>  arch/powerpc/platforms/cell/spufs/inode.c  |  4 ++--
>  arch/powerpc/platforms/powernv/opal-dump.c |  4 ++--
>  arch/powerpc/platforms/powernv/opal-elog.c |  4 ++--
>  arch/powerpc/platforms/powernv/opal-sysparam.c |  6 +++---
>  arch/powerpc/platforms/pseries/cmm.c   | 16 
>  arch/powerpc/platforms/pseries/dlpar.c |  2 +-
>  arch/powerpc/platforms/pseries/hvCall_inst.c   |  2 +-
>  arch/powerpc/platforms/pseries/ibmebus.c   |  4 ++--
>  arch/powerpc/platforms/pseries/lparcfg.c   |  4 ++--
>  arch/powerpc/platforms/pseries/mobility.c  |  4 ++--
>  arch/powerpc/platforms/pseries/reconfig.c  |  2 +-
>  arch/powerpc/platforms/pseries/scanlog.c   |  2 +-
>  arch/powerpc/platforms/pseries/suspend.c   |  3 +--
>  arch/powerpc/platforms/pseries/vio.c   |  8 
>  arch/powerpc/sysdev/axonram.c  |  2 +-
>  arch/powerpc/sysdev/mv64x60_pci.c  |  2 +-
>  26 files changed, 54 insertions(+), 57 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/eeh_sysfs.c b/arch/powerpc/kernel/eeh_sysfs.c
> index 1ceecdda810b..f6d2d5b66907 100644
> --- a/arch/powerpc/kernel/eeh_sysfs.c
> +++ b/arch/powerpc/kernel/eeh_sysfs.c
> @@ -48,7 +48,7 @@ static ssize_t eeh_show_##_name(struct device *dev,  \
> \
>   return sprintf(buf, _format "\n", edev->_memb);   \
>  }\
> -static DEVICE_ATTR(_name, S_IRUGO, eeh_show_##_name, NULL);
> +static DEVICE_ATTR(_name, 0444, eeh_show_##_name, NULL);
>  
>  EEH_SHOW_ATTR(eeh_mode,mode,"0x%x");
>  EEH_SHOW_ATTR(eeh_config_addr, config_addr, "0x%x");
> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
> index 5f202a566ec5..7ef279a0888e 100644
> --- a/arch/powerpc/kernel/iommu.c
> +++ b/arch/powerpc/kernel/iommu.c
> @@ -127,7 +127,7 @@ static ssize_t fail_iommu_store(struct device *dev,
>   return count;
>  }
>  
> -static DEVICE_ATTR(fail_iommu, S_IRUGO|S_IWUSR, fail_iommu_show,
> +static DEVICE_ATTR(fail_iommu, 0644, fail_iommu_show,
>  fail_iommu_store);
>  
>  static int fail_iommu_bus_notify(struct notifier_block *nb,
> diff --git a/arch/powerpc/kernel/proc_powerpc.c 
> b/arch/powerpc/kernel/proc_powerpc.c
> index 56548bf6231f..9bfbd800d32f 100644
> --- a/arch/powerpc/kernel/proc_powerpc.c
> +++ b/arch/powerpc/kernel/proc_powerpc.c
> @@ -63,7 +63,7 @@ static int __init proc_ppc64_init(void)
>  {
>   struct proc_dir_entry *pde;
>  
> - pde = proc_create_data("powerpc/systemcfg", S_IFREG|S_IRUGO, NULL,
> + pde = proc_create_data("powerpc/systemcfg", S_IFREG | 0444, NULL,
>  _map_fops, vdso_data);
>   if (!pde)
>   return 1;
> diff --git a/arch/powerpc/kernel/rtas-proc.c b/arch/powerpc/kernel/rtas-proc.c
> index df56dfc4b681..bb5e8cbcc553 100644
> --- a/arch/powerpc/kernel/rtas-proc.c
> +++ b/arch/powerpc/kernel/rtas-proc.c
> @@ -260,19 +260,19 @@ static int __init proc_rtas_init(void)
>   if

Re: [PATCH v4 1/2] KVM: PPC: Add new capability to control MCE behaviour

2017-01-12 Thread Aravinda Prasad



On Thursday 12 January 2017 03:26 PM, Paul Mackerras wrote:
> On Mon, Jan 09, 2017 at 05:10:35PM +0530, Aravinda Prasad wrote:
>> This patch introduces a new KVM capability to control
>> how KVM behaves on machine check exception (MCE).
>> Without this capability, KVM redirects machine check
>> exceptions to guest's 0x200 vector, if the address in
>> error belongs to the guest. With this capability KVM
>> causes a guest exit with NMI exit reason.
>>
>> The new capability is required to avoid problems if
>> a new kernel/KVM is used with an old QEMU for guests
>> that don't issue "ibm,nmi-register". As old QEMU does
>> not understand the NMI exit type, it treats it as a
>> fatal error. However, the guest could have handled
>> the machine check error if the exception was delivered
>> to guest's 0x200 interrupt vector instead of NMI exit
>> in case of old QEMU.
> 
> You need to add a description of the new capability to
> Documentation/virtual/kvm/api.txt.

sure.

> 
> Paul.
> 

-- 
Regards,
Aravinda

[PATCH] powerpc: opal-msglog: Report size of memcons log

2017-01-12 Thread Joel Stanley

The OPAL memory console is reported to be size zero, as we do not
initialise the struct attr with any size information due to the size
being variable. This leads users to think that the console is empty.

Instead report the maximum size.

Signed-off-by: Joel Stanley 
---
 arch/powerpc/platforms/powernv/opal-msglog.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/opal-msglog.c 
b/arch/powerpc/platforms/powernv/opal-msglog.c
index 39d6ff9e5630..8486b2ceb510 100644
--- a/arch/powerpc/platforms/powernv/opal-msglog.c
+++ b/arch/powerpc/platforms/powernv/opal-msglog.c
@@ -123,6 +123,10 @@ void __init opal_msglog_init(void)
return;
}
 
+   /* Report maximum size */
+   opal_msglog_attr.size =  be64_to_cpu(mc->ibuf_size) +
+   be64_to_cpu(mc->obuf_size);
+
opal_memcons = mc;
 }
 
-- 
2.11.0

Re: [PATCH v5 2/5] powernv:stop: Uniformly rename power9 to arch300

2017-01-12 Thread Oliver O'Halloran

On Fri, Jan 13, 2017 at 2:44 PM, Gautham R Shenoy
 wrote:
> On Thu, Jan 12, 2017 at 03:17:33PM +0530, Balbir Singh wrote:
>> On Tue, Jan 10, 2017 at 02:37:01PM +0530, Gautham R. Shenoy wrote:
>> > From: "Gautham R. Shenoy" 
>> >
>> > Balbir pointed out that in idle_book3s.S and powernv/idle.c some
>> > functions and variables had power9 in their names while some others
>> > had arch300.
>> >
>>
>> I would prefer power9 to arch300
>>
>
>
> I don't have a strong preference for arch300 vs power9, will change it
> to power9 if that looks better.

Personally I think we should be as descriptive as possible and use
power_9_arch_300_the_bikeshed_is_red_dammit.

Oliver

Re: [PATCH v5 2/5] powernv:stop: Uniformly rename power9 to arch300

2017-01-12 Thread Gautham R Shenoy

On Thu, Jan 12, 2017 at 03:17:33PM +0530, Balbir Singh wrote:
> On Tue, Jan 10, 2017 at 02:37:01PM +0530, Gautham R. Shenoy wrote:
> > From: "Gautham R. Shenoy" 
> > 
> > Balbir pointed out that in idle_book3s.S and powernv/idle.c some
> > functions and variables had power9 in their names while some others
> > had arch300.
> >
> 
> I would prefer power9 to arch300
>


I don't have a strong preference for arch300 vs power9, will change it
to power9 if that looks better.

--
Thanks and Regards
gautham.

Re: [PATCH v2 7/7] uapi: export all headers under uapi directories

2017-01-12 Thread Jeff Epler

On Thu, Jan 12, 2017 at 05:32:09PM +0100, Nicolas Dichtel wrote:
> What I was trying to say is that I export those directories like other are.
> Removing those files is not related to that series.

Perhaps the correct solution is to only copy files matching "*.h" to
reduce the risk of copying files incidentally created by kbuild but
which shouldn't be installed as uapi headers.

jeff

Re: [PATCH kernel v3] KVM: PPC: Add in-kernel acceleration for VFIO

2017-01-12 Thread David Gibson

On Fri, Jan 13, 2017 at 01:23:46PM +1100, Alexey Kardashevskiy wrote:
> On 13/01/17 10:53, David Gibson wrote:
> > On Thu, Jan 12, 2017 at 07:09:01PM +1100, Alexey Kardashevskiy wrote:
> >> On 12/01/17 16:04, David Gibson wrote:
> >>> On Tue, Dec 20, 2016 at 05:52:29PM +1100, Alexey Kardashevskiy wrote:
>  This allows the host kernel to handle H_PUT_TCE, H_PUT_TCE_INDIRECT
>  and H_STUFF_TCE requests targeted an IOMMU TCE table used for VFIO
>  without passing them to user space which saves time on switching
>  to user space and back.
> 
>  This adds H_PUT_TCE/H_PUT_TCE_INDIRECT/H_STUFF_TCE handlers to KVM.
>  KVM tries to handle a TCE request in the real mode, if failed
>  it passes the request to the virtual mode to complete the operation.
>  If it a virtual mode handler fails, the request is passed to
>  the user space; this is not expected to happen though.
> 
>  To avoid dealing with page use counters (which is tricky in real mode),
>  this only accelerates SPAPR TCE IOMMU v2 clients which are required
>  to pre-register the userspace memory. The very first TCE request will
>  be handled in the VFIO SPAPR TCE driver anyway as the userspace view
>  of the TCE table (iommu_table::it_userspace) is not allocated till
>  the very first mapping happens and we cannot call vmalloc in real mode.
> 
>  This adds new attribute - KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE - to
>  the VFIO KVM device. It takes a VFIO group fd and SPAPR TCE table fd
>  and associates a physical IOMMU table with the SPAPR TCE table (which
>  is a guest view of the hardware IOMMU table). The iommu_table object
>  is cached and referenced so we do not have to look up for it in real 
>  mode.
> 
>  This does not implement the UNSET counterpart as there is no use for it -
>  once the acceleration is enabled, the existing userspace won't
>  disable it unless a VFIO container is destroyed; this adds necessary
>  cleanup to the KVM_DEV_VFIO_GROUP_DEL handler.
> 
>  As this creates a descriptor per IOMMU table-LIOBN couple (called
>  kvmppc_spapr_tce_iommu_table), it is possible to have several
>  descriptors with the same iommu_table (hardware IOMMU table) attached
>  to the same LIOBN, this is done to simplify the cleanup and can be
>  improved later.
> 
>  This advertises the new KVM_CAP_SPAPR_TCE_VFIO capability to the user
>  space.
> 
>  This finally makes use of vfio_external_user_iommu_id() which was
>  introduced quite some time ago and was considered for removal.
> 
>  Tests show that this patch increases transmission speed from 220MB/s
>  to 750..1020MB/s on 10Gb network (Chelsea CXGB3 10Gb ethernet card).
> 
>  Signed-off-by: Alexey Kardashevskiy 
>  ---
>  Changes:
>  v3:
>  * simplified not to use VFIO group notifiers
>  * reworked cleanup, should be cleaner/simpler now
> 
>  v2:
>  * reworked to use new VFIO notifiers
>  * now same iommu_table may appear in the list several times, to be fixed 
>  later
>  ---
> 
>  This obsoletes:
> 
>  [PATCH kernel v2 08/11] KVM: PPC: Pass kvm* to kvmppc_find_table()
>  [PATCH kernel v2 09/11] vfio iommu: Add helpers to (un)register blocking 
>  notifiers per group
>  [PATCH kernel v2 11/11] KVM: PPC: Add in-kernel acceleration for VFIO
> 
> 
>  So I have not reposted the whole thing, should have I?
> 
> 
>  btw "F: virt/kvm/vfio.*"  is missing MAINTAINERS.
> 
> 
>  ---
>   Documentation/virtual/kvm/devices/vfio.txt |  22 ++-
>   arch/powerpc/include/asm/kvm_host.h|   8 +
>   arch/powerpc/include/asm/kvm_ppc.h |   4 +
>   include/uapi/linux/kvm.h   |   8 +
>   arch/powerpc/kvm/book3s_64_vio.c   | 286 
>  +
>   arch/powerpc/kvm/book3s_64_vio_hv.c| 178 ++
>   arch/powerpc/kvm/powerpc.c |   2 +
>   virt/kvm/vfio.c|  88 +
>   8 files changed, 594 insertions(+), 2 deletions(-)
> 
>  diff --git a/Documentation/virtual/kvm/devices/vfio.txt 
>  b/Documentation/virtual/kvm/devices/vfio.txt
>  index ef51740c67ca..f95d867168ea 100644
>  --- a/Documentation/virtual/kvm/devices/vfio.txt
>  +++ b/Documentation/virtual/kvm/devices/vfio.txt
>  @@ -16,7 +16,25 @@ Groups:
>   
>   KVM_DEV_VFIO_GROUP attributes:
> KVM_DEV_VFIO_GROUP_ADD: Add a VFIO group to VFIO-KVM device tracking
>  +kvm_device_attr.addr points to an int32_t file descriptor
>  +for the VFIO group.
> KVM_DEV_VFIO_GROUP_DEL: Remove a VFIO group from VFIO-KVM device 
>  tracking
>  +kvm_device_attr.addr points to an int32_t file descriptor
>  +for the VFIO group.

Re: [PATCH kernel v3] KVM: PPC: Add in-kernel acceleration for VFIO

2017-01-12 Thread Alexey Kardashevskiy

On 13/01/17 10:53, David Gibson wrote:
> On Thu, Jan 12, 2017 at 07:09:01PM +1100, Alexey Kardashevskiy wrote:
>> On 12/01/17 16:04, David Gibson wrote:
>>> On Tue, Dec 20, 2016 at 05:52:29PM +1100, Alexey Kardashevskiy wrote:
 This allows the host kernel to handle H_PUT_TCE, H_PUT_TCE_INDIRECT
 and H_STUFF_TCE requests targeted an IOMMU TCE table used for VFIO
 without passing them to user space which saves time on switching
 to user space and back.

 This adds H_PUT_TCE/H_PUT_TCE_INDIRECT/H_STUFF_TCE handlers to KVM.
 KVM tries to handle a TCE request in the real mode, if failed
 it passes the request to the virtual mode to complete the operation.
 If it a virtual mode handler fails, the request is passed to
 the user space; this is not expected to happen though.

 To avoid dealing with page use counters (which is tricky in real mode),
 this only accelerates SPAPR TCE IOMMU v2 clients which are required
 to pre-register the userspace memory. The very first TCE request will
 be handled in the VFIO SPAPR TCE driver anyway as the userspace view
 of the TCE table (iommu_table::it_userspace) is not allocated till
 the very first mapping happens and we cannot call vmalloc in real mode.

 This adds new attribute - KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE - to
 the VFIO KVM device. It takes a VFIO group fd and SPAPR TCE table fd
 and associates a physical IOMMU table with the SPAPR TCE table (which
 is a guest view of the hardware IOMMU table). The iommu_table object
 is cached and referenced so we do not have to look up for it in real mode.

 This does not implement the UNSET counterpart as there is no use for it -
 once the acceleration is enabled, the existing userspace won't
 disable it unless a VFIO container is destroyed; this adds necessary
 cleanup to the KVM_DEV_VFIO_GROUP_DEL handler.

 As this creates a descriptor per IOMMU table-LIOBN couple (called
 kvmppc_spapr_tce_iommu_table), it is possible to have several
 descriptors with the same iommu_table (hardware IOMMU table) attached
 to the same LIOBN, this is done to simplify the cleanup and can be
 improved later.

 This advertises the new KVM_CAP_SPAPR_TCE_VFIO capability to the user
 space.

 This finally makes use of vfio_external_user_iommu_id() which was
 introduced quite some time ago and was considered for removal.

 Tests show that this patch increases transmission speed from 220MB/s
 to 750..1020MB/s on 10Gb network (Chelsea CXGB3 10Gb ethernet card).

 Signed-off-by: Alexey Kardashevskiy 
 ---
 Changes:
 v3:
 * simplified not to use VFIO group notifiers
 * reworked cleanup, should be cleaner/simpler now

 v2:
 * reworked to use new VFIO notifiers
 * now same iommu_table may appear in the list several times, to be fixed 
 later
 ---

 This obsoletes:

 [PATCH kernel v2 08/11] KVM: PPC: Pass kvm* to kvmppc_find_table()
 [PATCH kernel v2 09/11] vfio iommu: Add helpers to (un)register blocking 
 notifiers per group
 [PATCH kernel v2 11/11] KVM: PPC: Add in-kernel acceleration for VFIO

 So I have not reposted the whole thing, should have I?

 btw "F: virt/kvm/vfio.*"  is missing MAINTAINERS.

 ---
  Documentation/virtual/kvm/devices/vfio.txt |  22 ++-
  arch/powerpc/include/asm/kvm_host.h|   8 +
  arch/powerpc/include/asm/kvm_ppc.h |   4 +
  include/uapi/linux/kvm.h   |   8 +
  arch/powerpc/kvm/book3s_64_vio.c   | 286 
 +
  arch/powerpc/kvm/book3s_64_vio_hv.c| 178 ++
  arch/powerpc/kvm/powerpc.c |   2 +
  virt/kvm/vfio.c|  88 +
  8 files changed, 594 insertions(+), 2 deletions(-)

 diff --git a/Documentation/virtual/kvm/devices/vfio.txt 
 b/Documentation/virtual/kvm/devices/vfio.txt
 index ef51740c67ca..f95d867168ea 100644
 --- a/Documentation/virtual/kvm/devices/vfio.txt
 +++ b/Documentation/virtual/kvm/devices/vfio.txt
 @@ -16,7 +16,25 @@ Groups:

  KVM_DEV_VFIO_GROUP attributes:
KVM_DEV_VFIO_GROUP_ADD: Add a VFIO group to VFIO-KVM device tracking
 +  kvm_device_attr.addr points to an int32_t file descriptor
 +  for the VFIO group.
KVM_DEV_VFIO_GROUP_DEL: Remove a VFIO group from VFIO-KVM device 
 tracking
 +  kvm_device_attr.addr points to an int32_t file descriptor
 +  for the VFIO group.
 +  KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: attaches a guest visible TCE table
 +  allocated by sPAPR KVM.
 +  kvm_device_attr.addr points to a struct:

 -For each, kvm_device_attr.addr points to an int32_t file descriptor
 -for the VFIO group.
 +  struct kvm_vfio_spapr_tce {

Re: [PATCH kernel v3] KVM: PPC: Add in-kernel acceleration for VFIO

2017-01-12 Thread David Gibson

On Thu, Jan 12, 2017 at 07:09:01PM +1100, Alexey Kardashevskiy wrote:
> On 12/01/17 16:04, David Gibson wrote:
> > On Tue, Dec 20, 2016 at 05:52:29PM +1100, Alexey Kardashevskiy wrote:
> >> This allows the host kernel to handle H_PUT_TCE, H_PUT_TCE_INDIRECT
> >> and H_STUFF_TCE requests targeted an IOMMU TCE table used for VFIO
> >> without passing them to user space which saves time on switching
> >> to user space and back.
> >>
> >> This adds H_PUT_TCE/H_PUT_TCE_INDIRECT/H_STUFF_TCE handlers to KVM.
> >> KVM tries to handle a TCE request in the real mode, if failed
> >> it passes the request to the virtual mode to complete the operation.
> >> If it a virtual mode handler fails, the request is passed to
> >> the user space; this is not expected to happen though.
> >>
> >> To avoid dealing with page use counters (which is tricky in real mode),
> >> this only accelerates SPAPR TCE IOMMU v2 clients which are required
> >> to pre-register the userspace memory. The very first TCE request will
> >> be handled in the VFIO SPAPR TCE driver anyway as the userspace view
> >> of the TCE table (iommu_table::it_userspace) is not allocated till
> >> the very first mapping happens and we cannot call vmalloc in real mode.
> >>
> >> This adds new attribute - KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE - to
> >> the VFIO KVM device. It takes a VFIO group fd and SPAPR TCE table fd
> >> and associates a physical IOMMU table with the SPAPR TCE table (which
> >> is a guest view of the hardware IOMMU table). The iommu_table object
> >> is cached and referenced so we do not have to look up for it in real mode.
> >>
> >> This does not implement the UNSET counterpart as there is no use for it -
> >> once the acceleration is enabled, the existing userspace won't
> >> disable it unless a VFIO container is destroyed; this adds necessary
> >> cleanup to the KVM_DEV_VFIO_GROUP_DEL handler.
> >>
> >> As this creates a descriptor per IOMMU table-LIOBN couple (called
> >> kvmppc_spapr_tce_iommu_table), it is possible to have several
> >> descriptors with the same iommu_table (hardware IOMMU table) attached
> >> to the same LIOBN, this is done to simplify the cleanup and can be
> >> improved later.
> >>
> >> This advertises the new KVM_CAP_SPAPR_TCE_VFIO capability to the user
> >> space.
> >>
> >> This finally makes use of vfio_external_user_iommu_id() which was
> >> introduced quite some time ago and was considered for removal.
> >>
> >> Tests show that this patch increases transmission speed from 220MB/s
> >> to 750..1020MB/s on 10Gb network (Chelsea CXGB3 10Gb ethernet card).
> >>
> >> Signed-off-by: Alexey Kardashevskiy 
> >> ---
> >> Changes:
> >> v3:
> >> * simplified not to use VFIO group notifiers
> >> * reworked cleanup, should be cleaner/simpler now
> >>
> >> v2:
> >> * reworked to use new VFIO notifiers
> >> * now same iommu_table may appear in the list several times, to be fixed 
> >> later
> >> ---
> >>
> >> This obsoletes:
> >>
> >> [PATCH kernel v2 08/11] KVM: PPC: Pass kvm* to kvmppc_find_table()
> >> [PATCH kernel v2 09/11] vfio iommu: Add helpers to (un)register blocking 
> >> notifiers per group
> >> [PATCH kernel v2 11/11] KVM: PPC: Add in-kernel acceleration for VFIO
> >>
> >>
> >> So I have not reposted the whole thing, should have I?
> >>
> >>
> >> btw "F: virt/kvm/vfio.*"  is missing MAINTAINERS.
> >>
> >>
> >> ---
> >>  Documentation/virtual/kvm/devices/vfio.txt |  22 ++-
> >>  arch/powerpc/include/asm/kvm_host.h|   8 +
> >>  arch/powerpc/include/asm/kvm_ppc.h |   4 +
> >>  include/uapi/linux/kvm.h   |   8 +
> >>  arch/powerpc/kvm/book3s_64_vio.c   | 286 
> >> +
> >>  arch/powerpc/kvm/book3s_64_vio_hv.c| 178 ++
> >>  arch/powerpc/kvm/powerpc.c |   2 +
> >>  virt/kvm/vfio.c|  88 +
> >>  8 files changed, 594 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/Documentation/virtual/kvm/devices/vfio.txt 
> >> b/Documentation/virtual/kvm/devices/vfio.txt
> >> index ef51740c67ca..f95d867168ea 100644
> >> --- a/Documentation/virtual/kvm/devices/vfio.txt
> >> +++ b/Documentation/virtual/kvm/devices/vfio.txt
> >> @@ -16,7 +16,25 @@ Groups:
> >>  
> >>  KVM_DEV_VFIO_GROUP attributes:
> >>KVM_DEV_VFIO_GROUP_ADD: Add a VFIO group to VFIO-KVM device tracking
> >> +  kvm_device_attr.addr points to an int32_t file descriptor
> >> +  for the VFIO group.
> >>KVM_DEV_VFIO_GROUP_DEL: Remove a VFIO group from VFIO-KVM device 
> >> tracking
> >> +  kvm_device_attr.addr points to an int32_t file descriptor
> >> +  for the VFIO group.
> >> +  KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: attaches a guest visible TCE table
> >> +  allocated by sPAPR KVM.
> >> +  kvm_device_attr.addr points to a struct:
> >>  
> >> -For each, kvm_device_attr.addr points to an int32_t file descriptor
> >> -for the VFIO group.
> >> +  struct kvm_vfio_spapr_tce {
> >> +  __u32   argsz;
> >> +

Re: [PATCH v2] pci: hotplug: This patch removes unnecessary return statement using spatch tool

2017-01-12 Thread Rahul Krishnan

On Thu, Jan 12, 2017 at 2:25 AM, Bjorn Helgaas  wrote:

> On Sat, Dec 24, 2016 at 03:08:00PM +0530, Rahul Krishnan wrote:
> >
> > This patch removes unnecessary return statement using spatch tool
> >
> > Signed-off-by: Rahul Krishnan 
>
> Applied to pci/hotplug for v4.11 with Tyrel's Reviewed-by, thanks!
>
> Are there other similar instances elsewhere in drivers/pci?  If so,
> can you fix them all at once?
>

Yes, I will look into it immediately. Thank you


>
> > ---
> >  drivers/pci/hotplug/rpadlpar_core.c | 4 +---
> >  1 file changed, 1 insertion(+), 3 deletions(-)
> >
> > diff --git a/drivers/pci/hotplug/rpadlpar_core.c b/drivers/pci/hotplug/
> rpadlpar_core.c
> > index dc67f39..78ce2c7 100644
> > --- a/drivers/pci/hotplug/rpadlpar_core.c
> > +++ b/drivers/pci/hotplug/rpadlpar_core.c
> > @@ -455,7 +455,6 @@ static inline int is_dlpar_capable(void)
> >
> >  int __init rpadlpar_io_init(void)
> >  {
> > - int rc = 0;
> >
> >   if (!is_dlpar_capable()) {
> >   printk(KERN_WARNING "%s: partition not DLPAR capable\n",
> > @@ -463,8 +462,7 @@ int __init rpadlpar_io_init(void)
> >   return -EPERM;
> >   }
> >
> > - rc = dlpar_sysfs_init();
> > - return rc;
> > + return dlpar_sysfs_init();
> >  }
> >
> >  void rpadlpar_io_exit(void)
> > --
> > 2.7.4
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> > the body of a message to majord...@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
>



-- 
We all know Linux is great... it does infinite loops in 5 seconds.
- Linus Torvald



Regards,

Rahul Krishnan
Wordpress

Re: [PATCH v4 1/2] KVM: PPC: Add new capability to control MCE behaviour

2017-01-12 Thread Paul Mackerras

On Mon, Jan 09, 2017 at 05:10:35PM +0530, Aravinda Prasad wrote:
> This patch introduces a new KVM capability to control
> how KVM behaves on machine check exception (MCE).
> Without this capability, KVM redirects machine check
> exceptions to guest's 0x200 vector, if the address in
> error belongs to the guest. With this capability KVM
> causes a guest exit with NMI exit reason.
> 
> The new capability is required to avoid problems if
> a new kernel/KVM is used with an old QEMU for guests
> that don't issue "ibm,nmi-register". As old QEMU does
> not understand the NMI exit type, it treats it as a
> fatal error. However, the guest could have handled
> the machine check error if the exception was delivered
> to guest's 0x200 interrupt vector instead of NMI exit
> in case of old QEMU.

You need to add a description of the new capability to
Documentation/virtual/kvm/api.txt.

Paul.

[PATCH v2 03/26] treewide: Consolidate set_dma_ops() implementations

2017-01-12 Thread Bart Van Assche

Now that all set_dma_ops() implementations are identical (ignoring
BUG_ON() statements), remove the architecture specific definitions
and add a definition in .

Signed-off-by: Bart Van Assche 
Cc: Benjamin Herrenschmidt 
Cc: Chris Metcalf 
Cc: David Woodhouse 
Cc: linux-a...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Paul Mackerras 
Cc: Russell King 
---
 arch/arm/include/asm/dma-mapping.h | 6 --
 arch/powerpc/include/asm/dma-mapping.h | 5 -
 arch/tile/include/asm/dma-mapping.h| 5 -
 include/linux/dma-mapping.h| 5 +
 4 files changed, 5 insertions(+), 16 deletions(-)

diff --git a/arch/arm/include/asm/dma-mapping.h 
b/arch/arm/include/asm/dma-mapping.h
index 312f4d0564d6..c7432d647e53 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -31,12 +31,6 @@ static inline const struct dma_map_ops *get_dma_ops(struct 
device *dev)
return __generic_dma_ops(dev);
 }
 
-static inline void set_dma_ops(struct device *dev, const struct dma_map_ops 
*ops)
-{
-   BUG_ON(!dev);
-   dev->dma_ops = ops;
-}
-
 #define HAVE_ARCH_DMA_SUPPORTED 1
 extern int dma_supported(struct device *dev, u64 mask);
 
diff --git a/arch/powerpc/include/asm/dma-mapping.h 
b/arch/powerpc/include/asm/dma-mapping.h
index 59fbd4abcbf8..8275603ba4d5 100644
--- a/arch/powerpc/include/asm/dma-mapping.h
+++ b/arch/powerpc/include/asm/dma-mapping.h
@@ -91,11 +91,6 @@ static inline const struct dma_map_ops *get_dma_ops(struct 
device *dev)
return dev->dma_ops;
 }
 
-static inline void set_dma_ops(struct device *dev, const struct dma_map_ops 
*ops)
-{
-   dev->dma_ops = ops;
-}
-
 /*
  * get_dma_offset()
  *
diff --git a/arch/tile/include/asm/dma-mapping.h 
b/arch/tile/include/asm/dma-mapping.h
index c0620697eaad..2562995a6ac9 100644
--- a/arch/tile/include/asm/dma-mapping.h
+++ b/arch/tile/include/asm/dma-mapping.h
@@ -59,11 +59,6 @@ static inline phys_addr_t dma_to_phys(struct device *dev, 
dma_addr_t daddr)
 
 static inline void dma_mark_clean(void *addr, size_t size) {}
 
-static inline void set_dma_ops(struct device *dev, const struct dma_map_ops 
*ops)
-{
-   dev->dma_ops = ops;
-}
-
 static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t 
size)
 {
if (!dev->dma_mask)
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index f1da68b82c63..e97f23e8b2d9 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -164,6 +164,11 @@ int dma_mmap_from_coherent(struct device *dev, struct 
vm_area_struct *vma,
 
 #ifdef CONFIG_HAS_DMA
 #include 
+static inline void set_dma_ops(struct device *dev,
+  const struct dma_map_ops *dma_ops)
+{
+   dev->dma_ops = dma_ops;
+}
 #else
 /*
  * Define the dma api to allow compilation but not linking of
-- 
2.11.0

Re: bootx_init.c:88: undefined reference to `__stack_chk_fail_local'

2017-01-12 Thread Segher Boessenkool

On Thu, Jan 12, 2017 at 09:20:47AM -0600, Benjamin Herrenschmidt wrote:
> On Thu, 2017-01-12 at 15:42 +0100, Christophe LEROY wrote:
> > The 4.6.3 uses __stack_chk_guard, while the 4.4.4 and 4.8.3 use
> > -28680(r2)
> > 
> > Is it dependent on the way GCC is built ? Then do we have a way to
> > know, 
> > when we compile, which method GCC will use ?
> > 
> > See details below for each of the 3 GCC versions.
> 
> I think it depends if you built it along with glibc (so it can produce
> userspace binaries) or not.

Right.  Tony's compilers are built using a (modified version of) buildall,
and buildall goes out of its way to build without libc whatsoever, even
if the configuration (powerpc64-linux, for example) expects one.

Which leads to TARGET_LIBC_PROVIDES_SSP being undefined (it would normally
be true for glibc >= 2.4), and that is all.  Mystery solved.  Thanks!

Segher

Re: [PATCH v5 4/5] powernv: Pass PSSCR value and mask to power9_idle_stop

2017-01-12 Thread Balbir Singh

On Tue, Jan 10, 2017 at 02:37:03PM +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" 
> 
> The arch300_idle_stop method currently takes only the requested stop
  power9_idle_stop (see subject :) and second paragraph)
> level as a parameter and picks up the rest of the PSSCR bits from a
> hand-coded macro. This is not a very flexible design, especially when
> the firmware has the capability to communicate the psscr value and the
> mask associated with a particular stop state via device tree.
> 
> This patch modifies the power9_idle_stop API to take as parameters the
> PSSCR value and the PSSCR mask corresponding to the stop state that
> needs to be set. These PSSCR value and mask are respectively obtained
> by parsing the "ibm,cpu-idle-state-psscr" and
> "ibm,cpu-idle-state-psscr-mask" fields from the device tree.
> 
> In addition to this, the patch adds support for handling stop states
> for which ESL and EC bits in the PSSCR are zero. As per the
> architecture, a wakeup from these stop states resumes execution from
> the subsequent instruction as opposed to waking up at the System
> Vector.
> 
> The older firmware sets only the Requested Level (RL) field in the
> psscr and psscr-mask exposed in the device tree. For older firmware
> where psscr-mask=0xf, this patch will set the default sane values that
> the set for for remaining PSSCR fields (i.e PSLL, MTL, ESL, EC, and
> TR). For the new firmware, the patch will validate that the invariants
> required by the ISA for the psscr values are maintained by the
> firmware.
> 
> This skiboot patch that exports fully populated PSSCR values and the
> mask for all the stop states can be found here:
> https://lists.ozlabs.org/pipermail/skiboot/2016-September/004869.html
> 
> [Optimize the number of instructions before entering STOP with
> ESL=EC=0, validate the PSSCR values provided by the firimware
> maintains the invariants required as per the ISA suggested by Balbir
> Singh]
> 
> Signed-off-by: Gautham R. Shenoy 
> ---

Acked-by: Balbir Singh

Re: [PATCH 01/18] powerpc/64: Don't try to use radix MMU under a hypervisor

2017-01-12 Thread Balbir Singh

On Thu, Jan 12, 2017 at 08:07:09PM +1100, Paul Mackerras wrote:
> Currently, if the kernel is running on a POWER9 processor under a
> hypervisor, it will try to use the radix MMU even though it doesn't
> have the necessary code to use radix under a hypervisor (it doesn't
> negotiate use of radix, and it doesn't do the H_REGISTER_PROC_TBL

H_REGISTER_PROCESS_TABLE
for consistency

> hcall).  The result is that the guest kernel will crash when it tries
> to turn on the MMU.
> 
> This fixes it by looking for the /chosen/ibm,architecture-vec-5
> property, and if it exists, clears the radix MMU feature bit,
> before we decide whether to initialize for radix or HPT.  This
> property is created by the hypervisor as a result of the guest
> calling the ibm,client-architecture-support method to indicate
> its capabilities, so it will indicate whether the hypervisor
> agreed to us using radix.
> 
> Systems without a hypervisor may have this property also (for
> example, skiboot creates it), so we check the HV bit in the MSR
> to see whether we are running as a guest or not.  If we are in
> hypervisor mode, then we can do whatever we like including
> using the radix MMU.
> 
> The reason for using this property is that in future, when we
> have support for using radix under a hypervisor, we will need
> to check this property to see whether the hypervisor agreed to
> us using radix.
> 
> Cc: sta...@vger.kernel.org # v4.8+
> Signed-off-by: Paul Mackerras 
> ---
>  arch/powerpc/mm/init_64.c | 33 +
>  1 file changed, 33 insertions(+)
> 
> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> index 93abf8a..4d9481e 100644
> --- a/arch/powerpc/mm/init_64.c
> +++ b/arch/powerpc/mm/init_64.c
> @@ -42,6 +42,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  
>  #include 
>  #include 
> @@ -344,12 +346,43 @@ static int __init parse_disable_radix(char *p)
>  }
>  early_param("disable_radix", parse_disable_radix);
>  
> +/*
> + * If we're running under a hypervisor, we currently can't do radix
> + * since we don't have the code to do the H_REGISTER_PROC_TBL hcall.
   _PROCESS_TABLE
> + * We tell that we're running under a hypervisor by looking for the
> + * /chosen/ibm,architecture-vec-5 property.
> + */
> +static void early_check_vec5(void)
> +{

The function is called early_check, but it also disables MMU_FTR_TYPE_RADIX,
should the disabling be moved out to the caller, since the check has
nothing to do with disabling the feature?

> + unsigned long root, chosen;
> + int size;
> + const u8 *vec5;
> +
> + root = of_get_flat_dt_root();
> + chosen = of_get_flat_dt_subnode_by_name(root, "chosen");
> + if (chosen == -FDT_ERR_NOTFOUND)
> + return;
> + vec5 = of_get_flat_dt_prop(chosen, "ibm,architecture-vec-5", );
> + if (!vec5)
> + return;
> + cur_cpu_spec->mmu_features &= ~MMU_FTR_TYPE_RADIX;
> +}
> +
>  void __init mmu_early_init_devtree(void)
>  {
>   /* Disable radix mode based on kernel command line. */
>   if (disable_radix)
>   cur_cpu_spec->mmu_features &= ~MMU_FTR_TYPE_RADIX;
>  
> + /*
> +  * Check /chosen/ibm,architecture-vec-5 if running as a guest.
> +  * When running bare-metal, we can use radix if we like
> +  * even though the ibm,architecture-vec-5 property created by
> +  * skiboot doesn't have the necessary bits set.
> +  */
> + if (early_radix_enabled() && !(mfmsr() & MSR_HV))
> + early_check_vec5();

Balbir Singh.

Re: [PATCH v2 7/7] uapi: export all headers under uapi directories

2017-01-12 Thread Jan Engelhardt

On Thursday 2017-01-12 16:52, Nicolas Dichtel wrote:

>Le 09/01/2017 à 13:56, Christoph Hellwig a écrit :
>> On Fri, Jan 06, 2017 at 10:43:59AM +0100, Nicolas Dichtel wrote:
>>> Regularly, when a new header is created in include/uapi/, the developer
>>> forgets to add it in the corresponding Kbuild file. This error is usually
>>> detected after the release is out.
>>>
>>> In fact, all headers under uapi directories should be exported, thus it's
>>> useless to have an exhaustive list.
>>>
>>> After this patch, the following files, which were not exported, are now
>>> exported (with make headers_install_all):
>> 
>> ... snip ...
>> 
>>> linux/genwqe/.install
>>> linux/genwqe/..install.cmd
>>> linux/cifs/.install
>>> linux/cifs/..install.cmd
>> 
>> I'm pretty sure these should not be exported!
>> 
>Those files are created in every directory:
>$ find usr/include/ -name '\.\.install.cmd' | wc -l
>71

That still does not mean they should be exported.

Anything but headers (and directories as a skeleton structure) is maximally 
suspicious.

Re: [PATCH v2 7/7] uapi: export all headers under uapi directories

2017-01-12 Thread Nicolas Dichtel

Le 12/01/2017 à 17:28, Jan Engelhardt a écrit :
> On Thursday 2017-01-12 16:52, Nicolas Dichtel wrote:
> 
>> Le 09/01/2017 à 13:56, Christoph Hellwig a écrit :
>>> On Fri, Jan 06, 2017 at 10:43:59AM +0100, Nicolas Dichtel wrote:
 Regularly, when a new header is created in include/uapi/, the developer
 forgets to add it in the corresponding Kbuild file. This error is usually
 detected after the release is out.

 In fact, all headers under uapi directories should be exported, thus it's
 useless to have an exhaustive list.

 After this patch, the following files, which were not exported, are now
 exported (with make headers_install_all):
>>>
>>> ... snip ...
>>>
 linux/genwqe/.install
 linux/genwqe/..install.cmd
 linux/cifs/.install
 linux/cifs/..install.cmd
>>>
>>> I'm pretty sure these should not be exported!
>>>
>> Those files are created in every directory:
>> $ find usr/include/ -name '\.\.install.cmd' | wc -l
>> 71
> 
> That still does not mean they should be exported.
> 
> Anything but headers (and directories as a skeleton structure) is maximally 
> suspicious.
> 
What I was trying to say is that I export those directories like other are.
Removing those files is not related to that series.


Regards,
Nicolas

Re: [PATCH v2 7/7] uapi: export all headers under uapi directories

2017-01-12 Thread Nicolas Dichtel

Le 09/01/2017 à 13:56, Christoph Hellwig a écrit :
> On Fri, Jan 06, 2017 at 10:43:59AM +0100, Nicolas Dichtel wrote:
>> Regularly, when a new header is created in include/uapi/, the developer
>> forgets to add it in the corresponding Kbuild file. This error is usually
>> detected after the release is out.
>>
>> In fact, all headers under uapi directories should be exported, thus it's
>> useless to have an exhaustive list.
>>
>> After this patch, the following files, which were not exported, are now
>> exported (with make headers_install_all):
> 
> ... snip ...
> 
>> linux/genwqe/.install
>> linux/genwqe/..install.cmd
>> linux/cifs/.install
>> linux/cifs/..install.cmd
> 
> I'm pretty sure these should not be exported!
> 
Those files are created in every directory:
$ find usr/include/ -name '\.\.install.cmd' | wc -l
71
$ find usr/include/ -name '\.install' | wc -l
71

See also
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/scripts/Makefile.headersinst#n32


Thank you,
Nicolas

Re: bootx_init.c:88: undefined reference to `__stack_chk_fail_local'

2017-01-12 Thread Benjamin Herrenschmidt

On Thu, 2017-01-12 at 15:42 +0100, Christophe LEROY wrote:
> The 4.6.3 uses __stack_chk_guard, while the 4.4.4 and 4.8.3 use
> -28680(r2)
> 
> Is it dependent on the way GCC is built ? Then do we have a way to
> know, 
> when we compile, which method GCC will use ?
> 
> See details below for each of the 3 GCC versions.

I think it depends if you built it along with glibc (so it can produce
userspace binaries) or not.

Cheers,
Ben.

[PATCH -next] powerpc/pseries: Fix typo in parameter description

2017-01-12 Thread Wei Yongjun

From: Wei Yongjun 

Fix typo in parameter description.

Signed-off-by: Wei Yongjun 
---
 arch/powerpc/platforms/pseries/cmm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/cmm.c 
b/arch/powerpc/platforms/pseries/cmm.c
index 66e7227..7af18da 100644
--- a/arch/powerpc/platforms/pseries/cmm.c
+++ b/arch/powerpc/platforms/pseries/cmm.c
@@ -74,7 +74,7 @@ module_param_named(delay, delay, uint, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(delay, "Delay (in seconds) between polls to query hypervisor 
paging requests. "
 "[Default=" __stringify(CMM_DEFAULT_DELAY) "]");
 module_param_named(hotplug_delay, hotplug_delay, uint, S_IRUGO | S_IWUSR);
-MODULE_PARM_DESC(delay, "Delay (in seconds) after memory hotplug remove "
+MODULE_PARM_DESC(hotplug_delay, "Delay (in seconds) after memory hotplug 
remove "
 "before loaning resumes. "
 "[Default=" __stringify(CMM_HOTPLUG_DELAY) "]");
 module_param_named(oom_kb, oom_kb, uint, S_IRUGO | S_IWUSR);

Re: bootx_init.c:88: undefined reference to `__stack_chk_fail_local'

2017-01-12 Thread Christophe LEROY


Le 12/01/2017 à 08:52, Christophe LEROY a écrit :



Le 11/01/2017 à 23:54, Segher Boessenkool a écrit :

On Tue, Jan 10, 2017 at 07:26:15AM +0100, Christophe LEROY wrote:

Maybe ppc32 is not supposed to be built with CC_STACKPROTECTOR ?


Indeed, the latest versions of GCC don't use anymore the global variable
__stack_chk_guard as canary value, but a value stored at -0x7008(r2).
This is not compatible with the current implementation of  the kernel
with uses r2 as a pointeur to current task struct.
So until we fix it, I don't think CC_STACKPROTECTOR is usable on PPC
with modern versions of GCC.


I still wonder what changed.  Nothing relevant has changed for ten years
or whatever as far as I see; unless it is just the
-fstack-protector-strong
that makes it fail now.  Curious.



Yes, looks like it was changed from global to TLS in 2005 on powerpc.
Indeed when I implemented STACKPROTECTOR in Kernel on ppc I
copied/pasted it from ARM which is (still?) using the global
__stack_chk_guard, and at first it worked quite well on my powerpc.

x86 has the following option on GCC. Couldn't we have it on powerpc too ?

-mstack-protector-guard=guard
Generate  stack  protection  code  using  canary  at
guard.   Supported  locations are ‘ global ’ for global canary or ‘ tls
’ for per-thread canary in the TLS block (the  default). This  option
has  effect  only  when  ‘-fstack-protector’  or ‘-fstack-protector-all’
is specified.



Finally, it looks like it is not so easy.

I have three instances of GCC:
* 4.4.4, home built
* 4.6.3, from https://www.kernel.org/pub/tools/crosstool/
* 4.8.3, home built

The 4.6.3 uses __stack_chk_guard, while the 4.4.4 and 4.8.3 use -28680(r2)

Is it dependent on the way GCC is built ? Then do we have a way to know, 
when we compile, which method GCC will use ?


See details below for each of the 3 GCC versions.

Christophe

Using built-in specs.
Target: ppc-linux
Configured with: /root/cldk/gcc-4.4.4/configure --target=ppc-linux 
--with-headers=yes --with-cpu=860 --prefix=/opt/cldk 
--bindir=/opt/cldk/bin --sbindir=/opt/cldk/sbin 
--libexecdir=/opt/cldk/libexec --datadir=/opt/cldk/share 
--sysconfdir=/opt/cldk/etc --libdir=/opt/cldk/lib 
--includedir=/opt/cldk/usr/include --oldincludedir=/opt/cldk/usr/include 
--infodir=/opt/cldk/share/info --mandir=/opt/cldk/share/man 
--enable-languages=c,c++

Thread model: posix
gcc version 4.4.4 (GCC)

007c :
  7c:   7c 08 02 a6 mflrr0
  80:   94 21 ff a0 stwur1,-96(r1)
  84:   3c 80 00 00 lis r4,0
86: R_PPC_ADDR16_HA .rodata.str1.4+0x1bc
  88:   93 c1 00 58 stw r30,88(r1)
  8c:   93 e1 00 5c stw r31,92(r1)
  90:   90 01 00 64 stw r0,100(r1)
  94:   93 81 00 50 stw r28,80(r1)
  98:   93 a1 00 54 stw r29,84(r1)
  9c:   38 84 01 bc addir4,r4,444
9e: R_PPC_ADDR16_LO .rodata.str1.4+0x1bc
  a0:   38 a0 00 09 li  r5,9
  a4:   80 02 8f f8 lwz r0,-28680(r2)
  a8:   90 01 00 4c stw r0,76(r1)

[...]

  fc:   80 01 00 4c lwz r0,76(r1)
 100:   81 22 8f f8 lwz r9,-28680(r2)
 104:   7c 00 4a 79 xor.r0,r0,r9
 108:   39 20 00 00 li  r9,0
 10c:   7f a3 eb 78 mr  r3,r29
 110:   40 82 03 88 bne-498 

[...]

 498:   48 00 00 01 bl  498 
498: R_PPC_REL24__stack_chk_fail


Using built-in specs.
COLLECT_GCC=powerpc64-linux-gcc
COLLECT_LTO_WRAPPER=/opt/gcc-4.6.3-nolibc/powerpc64-linux/bin/../libexec/gcc/powerpc64-linux/4.6.3/lto-wrapper
Target: powerpc64-linux
Configured with: /home/tony/buildall/src/gcc/configure 
--target=powerpc64-linux --host=i686-linux-gnu --build=i686-linux-gnu 
--enable-targets=all 
--prefix=/opt/cross/gcc-4.6.3-nolibc/powerpc64-linux/ 
--enable-languages=c --with-newlib --without-headers 
--enable-sjlj-exceptions --with-system-libunwind --disable-nls 
--disable-threads --disable-shared --disable-libmudflap --disable-libssp 
--disable-libgomp --disable-decimal-float --enable-checking=release 
--with-mpfr=/home/tony/buildall/src/sys-i686 
--with-gmp=/home/tony/buildall/src/sys-i686 --disable-bootstrap 
--disable-libquadmath

Thread model: single
gcc version 4.6.3 (GCC)

00c0 :
  c0:   94 21 ff a0 stwur1,-96(r1)
  c4:   7c 08 02 a6 mflrr0
  c8:   3c 80 00 00 lis r4,0
ca: R_PPC_ADDR16_HA .rodata.str1.4+0x50
  cc:   38 a0 00 09 li  r5,9
  d0:   38 84 00 50 addir4,r4,80
d2: R_PPC_ADDR16_LO .rodata.str1.4+0x50
  d4:   bf 81 00 50 stmwr28,80(r1)
  d8:   3f e0 00 00 lis r31,0
da: R_PPC_ADDR16_HA __stack_chk_guard
  dc:   7c 7e 1b 78 mr  r30,r3
  e0:   90 01 00 64 stw r0,100(r1)
  e4:   3b ff 00 00 addir31,r31,0
e6: R_PPC_ADDR16_LO __stack_chk_guard
  e8:   80 1f 00 00 lwz

[RFT PATCH 08/37] PCI: dwc: Split struct pcie_port into host only and core structures

2017-01-12 Thread Kishon Vijay Abraham I

Keep only the host specific members in *struct pcie_port* and
move the common members (i.e common to both host and endpoint)
to *struct dw_pcie*. This is in preparation for adding endpoint
mode support to designware driver.

While at that also fix checkpatch warnings.

Cc: Jingoo Han 
Cc: Richard Zhu 
Cc: Lucas Stach 
Cc: Murali Karicheri 
Cc: Minghuan Lian 
Cc: Mingkai Hu 
Cc: Roy Zang 
Cc: Thomas Petazzoni 
Cc: Niklas Cassel 
Cc: Jesper Nilsson 
Cc: Joao Pinto 
Cc: Zhou Wang 
Cc: Gabriele Paoloni 
Cc: Stanimir Varbanov 
Cc: Pratyush Anand 
Signed-off-by: Kishon Vijay Abraham I 
---
I've pushed the series to
git://git.ti.com/linux-phy/linux-phy.git pci_ep_v1

I have access to only dra7 based boards, so I was able to test only
that. Testing in other plaforms would be highly appreciated.

 drivers/pci/dwc/pci-dra7xx.c   |   76 +++-
 drivers/pci/dwc/pci-exynos.c   |   78 +++-
 drivers/pci/dwc/pci-imx6.c |  128 ++--
 drivers/pci/dwc/pci-keystone-dw.c  |   83 +++--
 drivers/pci/dwc/pci-keystone.c |   54 +
 drivers/pci/dwc/pci-keystone.h |4 +-
 drivers/pci/dwc/pci-layerscape.c   |   91 +-
 drivers/pci/dwc/pcie-armada8k.c|   85 +++--
 drivers/pci/dwc/pcie-artpec6.c |   48 
 drivers/pci/dwc/pcie-designware-plat.c |   27 +++--
 drivers/pci/dwc/pcie-designware.c  |  203 +---
 drivers/pci/dwc/pcie-designware.h  |   69 ++-
 drivers/pci/dwc/pcie-hisi.c|   55 +
 drivers/pci/dwc/pcie-qcom.c|   70 +++
 drivers/pci/dwc/pcie-spear13xx.c   |   74 +++-
 15 files changed, 665 insertions(+), 480 deletions(-)

diff --git a/drivers/pci/dwc/pci-dra7xx.c b/drivers/pci/dwc/pci-dra7xx.c
index 38b0c9a..3c525b0 100644
--- a/drivers/pci/dwc/pci-dra7xx.c
+++ b/drivers/pci/dwc/pci-dra7xx.c
@@ -67,7 +67,7 @@
 #define EXP_CAP_ID_OFFSET  0x70
 
 struct dra7xx_pcie {
-   struct pcie_portpp;
+   struct dw_pcie  *pci;
void __iomem*base;  /* DT ti_conf */
int phy_count;  /* DT phy-names count */
struct phy  **phy;
@@ -75,7 +75,7 @@ struct dra7xx_pcie {
struct irq_domain   *irq_domain;
 };
 
-#define to_dra7xx_pcie(x)  container_of((x), struct dra7xx_pcie, pp)
+#define to_dra7xx_pcie(x)  dev_get_drvdata((x)->dev)
 
 static inline u32 dra7xx_pcie_readl(struct dra7xx_pcie *pcie, u32 offset)
 {
@@ -93,9 +93,9 @@ static u64 dra7xx_pcie_cpu_addr_fixup(u64 pci_addr)
return pci_addr & DRA7XX_CPU_TO_BUS_ADDR;
 }
 
-static int dra7xx_pcie_link_up(struct pcie_port *pp)
+static int dra7xx_pcie_link_up(struct dw_pcie *pci)
 {
-   struct dra7xx_pcie *dra7xx = to_dra7xx_pcie(pp);
+   struct dra7xx_pcie *dra7xx = to_dra7xx_pcie(pci);
u32 reg = dra7xx_pcie_readl(dra7xx, PCIECTRL_DRA7XX_CONF_PHY_CS);
 
return !!(reg & LINK_UP);
@@ -103,32 +103,32 @@ static int dra7xx_pcie_link_up(struct pcie_port *pp)
 
 static int dra7xx_pcie_establish_link(struct dra7xx_pcie *dra7xx)
 {
-   struct pcie_port *pp = >pp;
-   struct device *dev = pp->dev;
+   struct dw_pcie *pci = dra7xx->pci;
+   struct device *dev = pci->dev;
u32 reg;
u32 exp_cap_off = EXP_CAP_ID_OFFSET;
 
-   if (dw_pcie_link_up(pp)) {
+   if (dw_pcie_link_up(pci)) {
dev_err(dev, "link is already up\n");
return 0;
}
 
if (dra7xx->link_gen == 1) {
-   dw_pcie_read(pp->dbi_base + exp_cap_off + PCI_EXP_LNKCAP,
+   dw_pcie_read(pci->dbi_base + exp_cap_off + PCI_EXP_LNKCAP,
 4, );
if ((reg & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB) {
reg &= ~((u32)PCI_EXP_LNKCAP_SLS);
reg |= PCI_EXP_LNKCAP_SLS_2_5GB;
-   dw_pcie_write(pp->dbi_base + exp_cap_off +
+   dw_pcie_write(pci->dbi_base + exp_cap_off +
  PCI_EXP_LNKCAP, 4, reg);
}
 
-   dw_pcie_read(pp->dbi_base + exp_cap_off + PCI_EXP_LNKCTL2,
+   dw_pcie_read(pci->dbi_base + exp_cap_off + PCI_EXP_LNKCTL2,
 2, );
if ((reg & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB) {
reg &= ~((u32)PCI_EXP_LNKCAP_SLS);
reg |= PCI_EXP_LNKCAP_SLS_2_5GB;
-

[PATCH 14/37] PCI: endpoint: Add EP core layer to enable EP controller and EP functions

2017-01-12 Thread Kishon Vijay Abraham I

Introduce a new EP core layer in order to support endpoint functions
in linux kernel. This comprises of EPC library
(Endpoint Controller Library) and EPF library (Endpoint
Function Library). EPC library implements functions that is specific
to an endpoint controller and EPF library implements functions
that is specific to an endpoint function.

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/Makefile|2 +
 drivers/pci/Kconfig |1 +
 drivers/pci/endpoint/Kconfig|   21 ++
 drivers/pci/endpoint/Makefile   |6 +
 drivers/pci/endpoint/pci-epc-core.c |  548 +++
 drivers/pci/endpoint/pci-epc-mem.c  |  143 +
 drivers/pci/endpoint/pci-epf-core.c |  347 ++
 include/linux/mod_devicetable.h |   10 +
 include/linux/pci-epc.h |  141 +
 include/linux/pci-epf.h |  160 ++
 10 files changed, 1379 insertions(+)
 create mode 100644 drivers/pci/endpoint/Kconfig
 create mode 100644 drivers/pci/endpoint/Makefile
 create mode 100644 drivers/pci/endpoint/pci-epc-core.c
 create mode 100644 drivers/pci/endpoint/pci-epc-mem.c
 create mode 100644 drivers/pci/endpoint/pci-epf-core.c
 create mode 100644 include/linux/pci-epc.h
 create mode 100644 include/linux/pci-epf.h

diff --git a/drivers/Makefile b/drivers/Makefile
index f521cb0..a300bb1 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -14,7 +14,9 @@ obj-$(CONFIG_GENERIC_PHY) += phy/
 obj-$(CONFIG_PINCTRL)  += pinctrl/
 obj-$(CONFIG_GPIOLIB)  += gpio/
 obj-y  += pwm/
+
 obj-$(CONFIG_PCI)  += pci/
+obj-$(CONFIG_PCI_ENDPOINT) += pci/endpoint/
 # PCI dwc controller drivers
 obj-y  += pci/dwc/
 
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index df14142..9747c1e 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -134,3 +134,4 @@ config PCI_HYPERV
 source "drivers/pci/hotplug/Kconfig"
 source "drivers/pci/dwc/Kconfig"
 source "drivers/pci/host/Kconfig"
+source "drivers/pci/endpoint/Kconfig"
diff --git a/drivers/pci/endpoint/Kconfig b/drivers/pci/endpoint/Kconfig
new file mode 100644
index 000..7eb1c79
--- /dev/null
+++ b/drivers/pci/endpoint/Kconfig
@@ -0,0 +1,21 @@
+#
+# PCI Endpoint Support
+#
+
+menu "PCI Endpoint"
+
+config PCI_ENDPOINT
+   bool "PCI Endpoint Support"
+   select CONFIGFS_FS
+   help
+  Enable this configuration option to support configurable PCI
+  endpoint. This should be enabled if the platform has a PCI
+  controller that can operate in endpoint mode.
+
+  Enabling this option will build the endpoint library, which
+  includes endpoint controller library and endpoint function
+  library.
+
+  If in doubt, say "N" to disable Endpoint support.
+
+endmenu
diff --git a/drivers/pci/endpoint/Makefile b/drivers/pci/endpoint/Makefile
new file mode 100644
index 000..eeef1b7
--- /dev/null
+++ b/drivers/pci/endpoint/Makefile
@@ -0,0 +1,6 @@
+#
+# Makefile for PCI Endpoint Support
+#
+
+obj-$(CONFIG_PCI_ENDPOINT) := pci-epc-core.o pci-epf-core.o\
+  pci-epc-mem.o
diff --git a/drivers/pci/endpoint/pci-epc-core.c 
b/drivers/pci/endpoint/pci-epc-core.c
new file mode 100644
index 000..2c33e8a
--- /dev/null
+++ b/drivers/pci/endpoint/pci-epc-core.c
@@ -0,0 +1,548 @@
+/**
+ * PCI Endpoint *Controller* (EPC) library
+ *
+ * Copyright (C) 2017 Texas Instruments
+ * Author: Kishon Vijay Abraham I 
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 of
+ * the License as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+static struct class *pci_epc_class;
+
+static void devm_pci_epc_release(struct device *dev, void *res)
+{
+   struct pci_epc *epc = *(struct pci_epc **)res;
+
+   pci_epc_destroy(epc);
+}
+
+static int devm_pci_epc_match(struct device *dev, void *res, void *match_data)
+{
+   struct pci_epc **epc = res;
+
+   return *epc == match_data;
+}
+
+/**
+ * pci_epc_get() - get the pci endpoint controller
+ * @epc_name: device name of the endpoint controller
+ *
+ * Invoke to get struct pci_epc * corresponding to the device name of the
+ * endpoint controller
+ */
+struct pci_epc *pci_epc_get(char *epc_name)
+{
+   int ret = -EINVAL;
+   struct pci_epc *epc;
+

[PATCH 24/37] PCI: dwc: designware: Add EP mode support

2017-01-12 Thread Kishon Vijay Abraham I

Add endpoint mode support to designware driver. This uses the
EP Core layer introduced recently to add endpoint mode support.
*Any* function driver can now use this designware device
in order to achieve the EP functionality.

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/Kconfig  |5 +
 drivers/pci/dwc/Makefile |1 +
 drivers/pci/dwc/pcie-designware-ep.c |  342 ++
 drivers/pci/dwc/pcie-designware.c|   51 +
 drivers/pci/dwc/pcie-designware.h|   70 +++
 5 files changed, 469 insertions(+)
 create mode 100644 drivers/pci/dwc/pcie-designware-ep.c

diff --git a/drivers/pci/dwc/Kconfig b/drivers/pci/dwc/Kconfig
index bee8b52..4cb1ba0 100644
--- a/drivers/pci/dwc/Kconfig
+++ b/drivers/pci/dwc/Kconfig
@@ -9,6 +9,11 @@ config PCIE_DW_HOST
depends on PCI_MSI_IRQ_DOMAIN
 select PCIE_DW
 
+config PCIE_DW_EP
+   bool
+   depends on PCI_ENDPOINT
+   select PCIE_DW
+
 config PCI_DRA7XX
bool "TI DRA7xx PCIe controller"
depends on PCI
diff --git a/drivers/pci/dwc/Makefile b/drivers/pci/dwc/Makefile
index a2df13c..b38425d 100644
--- a/drivers/pci/dwc/Makefile
+++ b/drivers/pci/dwc/Makefile
@@ -1,5 +1,6 @@
 obj-$(CONFIG_PCIE_DW) += pcie-designware.o
 obj-$(CONFIG_PCIE_DW_HOST) += pcie-designware-host.o
+obj-$(CONFIG_PCIE_DW_EP) += pcie-designware-ep.o
 obj-$(CONFIG_PCIE_DW_PLAT) += pcie-designware-plat.o
 obj-$(CONFIG_PCI_DRA7XX) += pci-dra7xx.o
 obj-$(CONFIG_PCI_EXYNOS) += pci-exynos.o
diff --git a/drivers/pci/dwc/pcie-designware-ep.c 
b/drivers/pci/dwc/pcie-designware-ep.c
new file mode 100644
index 000..e465c5e
--- /dev/null
+++ b/drivers/pci/dwc/pcie-designware-ep.c
@@ -0,0 +1,342 @@
+/**
+ * Synopsys Designware PCIe Endpoint controller driver
+ *
+ * Copyright (C) 2017 Texas Instruments
+ * Author: Kishon Vijay Abraham I 
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 of
+ * the License as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+
+#include "pcie-designware.h"
+#include 
+#include 
+
+void dw_pcie_ep_linkup(struct dw_pcie_ep *ep)
+{
+   struct pci_epc *epc = ep->epc;
+   struct pci_epf *epf;
+
+   list_for_each_entry(epf, >pci_epf, list)
+   pci_epf_linkup(epf);
+}
+
+static void dw_pcie_ep_reset_bar(struct dw_pcie *pci, enum pci_barno bar)
+{
+   u32 reg;
+
+   reg = PCI_BASE_ADDRESS_0 + (4 * bar);
+   dw_pcie_write_dbi(pci, pci->dbi_base2, reg, 0x4, 0x0);
+   dw_pcie_write_dbi(pci, pci->dbi_base, reg, 0x4, 0x0);
+}
+
+static int dw_pcie_ep_write_header(struct pci_epc *epc,
+  struct pci_epf_header *hdr)
+{
+   struct dw_pcie_ep *ep = epc_get_drvdata(epc);
+   struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
+   void __iomem *base = pci->dbi_base;
+
+   dw_pcie_write_dbi(pci, base, PCI_VENDOR_ID, 0x2, hdr->vendorid);
+   dw_pcie_write_dbi(pci, base, PCI_DEVICE_ID, 0x2, hdr->deviceid);
+   dw_pcie_write_dbi(pci, base, PCI_REVISION_ID, 0x1, hdr->revid);
+   dw_pcie_write_dbi(pci, base, PCI_CLASS_PROG, 0x1, hdr->progif_code);
+   dw_pcie_write_dbi(pci, base, PCI_CLASS_DEVICE, 0x2,
+ hdr->subclass_code | hdr->baseclass_code << 8);
+   dw_pcie_write_dbi(pci, base, PCI_CACHE_LINE_SIZE, 0x1,
+ hdr->cache_line_size);
+   dw_pcie_write_dbi(pci, base, PCI_SUBSYSTEM_VENDOR_ID, 0x2,
+ hdr->subsys_vendor_id);
+   dw_pcie_write_dbi(pci, base, PCI_SUBSYSTEM_ID, 0x2, hdr->subsys_id);
+   dw_pcie_write_dbi(pci, base, PCI_INTERRUPT_PIN, 0x1,
+ hdr->interrupt_pin);
+
+   return 0;
+}
+
+static int dw_pcie_ep_inbound_atu(struct dw_pcie_ep *ep, enum pci_barno bar,
+ dma_addr_t cpu_addr,
+ enum dw_pcie_as_type as_type)
+{
+   int ret;
+   u32 free_win;
+   struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
+
+   free_win = find_first_zero_bit(>ib_window_map,
+  sizeof(ep->ib_window_map));
+   if (free_win >= ep->num_ib_windows) {
+   dev_err(pci->dev, "no free inbound window\n");
+   return -EINVAL;
+   }
+
+   ret = dw_pcie_prog_inbound_atu(pci, free_win, bar, cpu_addr,
+  as_type);
+   if (ret < 0) {
+   dev_err(pci->dev, "Failed to program IB

[PATCH 10/37] PCI: dwc: designware: Fix style errors in pcie-designware.c

2017-01-12 Thread Kishon Vijay Abraham I

No functional change. Fix all checkpatch warnings and check errors
in pcie-designware.c

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/pcie-designware.c |   42 ++---
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/drivers/pci/dwc/pcie-designware.c 
b/drivers/pci/dwc/pcie-designware.c
index 89cdb6b..ff04074 100644
--- a/drivers/pci/dwc/pcie-designware.c
+++ b/drivers/pci/dwc/pcie-designware.c
@@ -40,13 +40,13 @@ int dw_pcie_read(void __iomem *addr, int size, u32 *val)
return PCIBIOS_BAD_REGISTER_NUMBER;
}
 
-   if (size == 4)
+   if (size == 4) {
*val = readl(addr);
-   else if (size == 2)
+   } else if (size == 2) {
*val = readw(addr);
-   else if (size == 1)
+   } else if (size == 1) {
*val = readb(addr);
-   else {
+   } else {
*val = 0;
return PCIBIOS_BAD_REGISTER_NUMBER;
}
@@ -203,16 +203,15 @@ irqreturn_t dw_handle_msi_irq(struct pcie_port *pp)
 
for (i = 0; i < MAX_MSI_CTRLS; i++) {
dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_STATUS + i * 12, 4,
-   (u32 *));
+   (u32 *));
if (val) {
ret = IRQ_HANDLED;
pos = 0;
while ((pos = find_next_bit(, 32, pos)) != 32) {
irq = irq_find_mapping(pp->irq_domain,
-   i * 32 + pos);
-   dw_pcie_wr_own_conf(pp,
-   PCIE_MSI_INTR0_STATUS + i * 12,
-   4, 1 << pos);
+  i * 32 + pos);
+   dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_STATUS +
+   i * 12, 4, 1 << pos);
generic_handle_irq(irq);
pos++;
}
@@ -278,8 +277,9 @@ static void dw_pcie_msi_set_irq(struct pcie_port *pp, int 
irq)
 static int assign_irq(int no_irqs, struct msi_desc *desc, int *pos)
 {
int irq, pos0, i;
-   struct pcie_port *pp = (struct pcie_port *) 
msi_desc_to_pci_sysdata(desc);
+   struct pcie_port *pp;
 
+   pp  = (struct pcie_port *)msi_desc_to_pci_sysdata(desc);
pos0 = bitmap_find_free_region(pp->msi_irq_in_use, MAX_MSI_IRQS,
   order_base_2(no_irqs));
if (pos0 < 0)
@@ -341,7 +341,7 @@ static void dw_msi_setup_msg(struct pcie_port *pp, unsigned 
int irq, u32 pos)
 }
 
 static int dw_msi_setup_irq(struct msi_controller *chip, struct pci_dev *pdev,
-   struct msi_desc *desc)
+   struct msi_desc *desc)
 {
int irq, pos;
struct pcie_port *pp = pdev->bus->sysdata;
@@ -389,7 +389,7 @@ static void dw_msi_teardown_irq(struct msi_controller 
*chip, unsigned int irq)
 {
struct irq_data *data = irq_get_irq_data(irq);
struct msi_desc *msi = irq_data_get_msi_desc(data);
-   struct pcie_port *pp = (struct pcie_port *) 
msi_desc_to_pci_sysdata(msi);
+   struct pcie_port *pp = (struct pcie_port *)msi_desc_to_pci_sysdata(msi);
 
clear_irq_range(pp, irq, 1, data->hwirq);
 }
@@ -431,7 +431,7 @@ int dw_pcie_link_up(struct dw_pcie *pci)
 }
 
 static int dw_pcie_msi_map(struct irq_domain *domain, unsigned int irq,
-   irq_hw_number_t hwirq)
+  irq_hw_number_t hwirq)
 {
irq_set_chip_and_handler(irq, _msi_irq_chip, handle_simple_irq);
irq_set_chip_data(irq, domain->host_data);
@@ -468,8 +468,8 @@ int dw_pcie_host_init(struct pcie_port *pp)
 
cfg_res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "config");
if (cfg_res) {
-   pp->cfg0_size = resource_size(cfg_res)/2;
-   pp->cfg1_size = resource_size(cfg_res)/2;
+   pp->cfg0_size = resource_size(cfg_res) / 2;
+   pp->cfg1_size = resource_size(cfg_res) / 2;
pp->cfg0_base = cfg_res->start;
pp->cfg1_base = cfg_res->start + pp->cfg0_size;
} else if (!pp->va_cfg0_base) {
@@ -508,8 +508,8 @@ int dw_pcie_host_init(struct pcie_port *pp)
break;
case 0:
pp->cfg = win->res;
-   pp->cfg0_size = resource_size(pp->cfg)/2;
-   pp->cfg1_size = resource_size(pp->cfg)/2;
+   pp->cfg0_size = resource_size(pp->cfg) / 2;
+   pp->cfg1_size = resource_size(pp->cfg) / 2;
pp->cfg0_base = pp->cfg->start;
pp->cfg1_base = pp->cfg->start + pp->cfg0_size;
break;
@@ -615,7 +615,7 @@ int

[PATCH 26/37] PCI: dwc: dra7xx: Facilitate wrapper and msi interrupts to be enabled independently

2017-01-12 Thread Kishon Vijay Abraham I

No functional change. Split dra7xx_pcie_enable_interrupts into
dra7xx_pcie_enable_wrapper_interrupts and dra7xx_pcie_enable_msi_interrupts
so that wrapper interrupts and msi interrupts can be enabled independently.
This is in preparation for adding EP mode support to dra7xx driver since
EP mode doesn't have to enable msi_interrupts.

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/pci-dra7xx.c |   24 ++--
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/dwc/pci-dra7xx.c b/drivers/pci/dwc/pci-dra7xx.c
index 8a1fccd..eb3a9c6 100644
--- a/drivers/pci/dwc/pci-dra7xx.c
+++ b/drivers/pci/dwc/pci-dra7xx.c
@@ -140,18 +140,30 @@ static int dra7xx_pcie_establish_link(struct dra7xx_pcie 
*dra7xx)
return dw_pcie_wait_for_link(pci);
 }
 
-static void dra7xx_pcie_enable_interrupts(struct dra7xx_pcie *dra7xx)
+static void dra7xx_pcie_enable_msi_interrupts(struct dra7xx_pcie *dra7xx)
 {
-   dra7xx_pcie_writel(dra7xx, PCIECTRL_DRA7XX_CONF_IRQSTATUS_MAIN,
-  ~INTERRUPTS);
-   dra7xx_pcie_writel(dra7xx,
-  PCIECTRL_DRA7XX_CONF_IRQENABLE_SET_MAIN, INTERRUPTS);
dra7xx_pcie_writel(dra7xx, PCIECTRL_DRA7XX_CONF_IRQSTATUS_MSI,
   ~LEG_EP_INTERRUPTS & ~MSI);
-   dra7xx_pcie_writel(dra7xx, PCIECTRL_DRA7XX_CONF_IRQENABLE_SET_MSI,
+
+   dra7xx_pcie_writel(dra7xx,
+  PCIECTRL_DRA7XX_CONF_IRQENABLE_SET_MSI,
   MSI | LEG_EP_INTERRUPTS);
 }
 
+static void dra7xx_pcie_enable_wrapper_interrupts(struct dra7xx_pcie *dra7xx)
+{
+   dra7xx_pcie_writel(dra7xx, PCIECTRL_DRA7XX_CONF_IRQSTATUS_MAIN,
+  ~INTERRUPTS);
+   dra7xx_pcie_writel(dra7xx, PCIECTRL_DRA7XX_CONF_IRQENABLE_SET_MAIN,
+  INTERRUPTS);
+}
+
+static void dra7xx_pcie_enable_interrupts(struct dra7xx_pcie *dra7xx)
+{
+   dra7xx_pcie_enable_wrapper_interrupts(dra7xx);
+   dra7xx_pcie_enable_msi_interrupts(dra7xx);
+}
+
 static void dra7xx_pcie_host_init(struct pcie_port *pp)
 {
struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
-- 
1.7.9.5

[PATCH 01/37] PCI: dwc: dra7xx: Group all host related setup in add_pcie_port

2017-01-12 Thread Kishon Vijay Abraham I

commit 150645b94348 ("PCI: dra7xx: Move struct pcie_port
setup to probe function") moved host related setup to the probe
function. However instead of cluttering the probe function with
host related setup, group all host related setup in add_pcie_port
function. This way when endpoint support is added, all the
endpoint related setup can be added in a separate function.

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/pci-dra7xx.c |   13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/dwc/pci-dra7xx.c b/drivers/pci/dwc/pci-dra7xx.c
index ec5617a..fb37e09 100644
--- a/drivers/pci/dwc/pci-dra7xx.c
+++ b/drivers/pci/dwc/pci-dra7xx.c
@@ -288,9 +288,13 @@ static int __init dra7xx_add_pcie_port(struct dra7xx_pcie 
*dra7xx,
   struct platform_device *pdev)
 {
int ret;
-   struct pcie_port *pp = >pp;
-   struct device *dev = pp->dev;
+   struct pcie_port *pp;
struct resource *res;
+   struct device *dev = >dev;
+
+   pp = >pp;
+   pp->dev = dev;
+   pp->ops = _pcie_host_ops;
 
pp->irq = platform_get_irq(pdev, 1);
if (pp->irq < 0) {
@@ -374,7 +378,6 @@ static int __init dra7xx_pcie_probe(struct platform_device 
*pdev)
void __iomem *base;
struct resource *res;
struct dra7xx_pcie *dra7xx;
-   struct pcie_port *pp;
struct device *dev = >dev;
struct device_node *np = dev->of_node;
char name[10];
@@ -384,10 +387,6 @@ static int __init dra7xx_pcie_probe(struct platform_device 
*pdev)
if (!dra7xx)
return -ENOMEM;
 
-   pp = >pp;
-   pp->dev = dev;
-   pp->ops = _pcie_host_ops;
-
irq = platform_get_irq(pdev, 0);
if (irq < 0) {
dev_err(dev, "missing IRQ resource\n");
-- 
1.7.9.5

[PATCH 20/37] Documentation: PCI: Add binding documentation for pci-test endpoint function

2017-01-12 Thread Kishon Vijay Abraham I

Add binding documentation for pci-test endpoint function that helps in
adding and configuring pci-test endpoint function.

Signed-off-by: Kishon Vijay Abraham I 
---
 Documentation/PCI/00-INDEX |2 ++
 .../PCI/endpoint/function/binding/pci-test.txt |   17 +
 2 files changed, 19 insertions(+)
 create mode 100644 Documentation/PCI/endpoint/function/binding/pci-test.txt

diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
index 4e5a283..53717b7 100644
--- a/Documentation/PCI/00-INDEX
+++ b/Documentation/PCI/00-INDEX
@@ -18,3 +18,5 @@ endpoint/pci-endpoint-cfs.txt
- guide to use configfs to configure the pci endpoint function.
 endpoint/pci-test-function.txt
- specification of *pci test* function device.
+endpoint/function/binding/
+   - binding documentation for pci endpoint function
diff --git a/Documentation/PCI/endpoint/function/binding/pci-test.txt 
b/Documentation/PCI/endpoint/function/binding/pci-test.txt
new file mode 100644
index 000..7358240
--- /dev/null
+++ b/Documentation/PCI/endpoint/function/binding/pci-test.txt
@@ -0,0 +1,17 @@
+PCI TEST ENDPOINT FUNCTION
+
+name: Should be "pci_epf_test" to bind to the pci_epf_test driver.
+
+Configurable Fields:
+vendorid: should be 0x104c
+deviceid: should be 0x
+revid   : dont't care
+progif_code : don't care
+subclass_code   : don't care
+baseclass_code  : should be 0xff
+cache_line_size : don't care
+subsys_vendor_id : don't care
+subsys_id   : don't care
+interrupt_pin   : Should be 1 - INTA, 2 - INTB, 3 - INTC, 4 -INTD
+msi_interrupts  : Should be 1 to 32 depending on the number of msi interrupts
+  to test
-- 
1.7.9.5

[PATCH 06/37] PCI: dwc: Rename cfg_read/cfg_write to read/write

2017-01-12 Thread Kishon Vijay Abraham I

No functional change. dw_pcie_cfg_read/dw_pcie_cfg_write doesn't do
anything specific to access configuration space. It can be just renamed
to dw_pcie_read/dw_pcie_write and used to read/write data to dbi space.
This is in preparation for added endpoint support to linux kernel.

Cc: Jingoo Han 
Cc: Murali Karicheri 
Cc: Joao Pinto 
Cc: Stanimir Varbanov 
Cc: Pratyush Anand 
Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/pci-dra7xx.c  |   16 
 drivers/pci/dwc/pci-exynos.c  |4 ++--
 drivers/pci/dwc/pci-keystone-dw.c |4 ++--
 drivers/pci/dwc/pcie-designware.c |   12 ++--
 drivers/pci/dwc/pcie-designware.h |4 ++--
 drivers/pci/dwc/pcie-qcom.c   |2 +-
 drivers/pci/dwc/pcie-spear13xx.c  |   24 
 7 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/drivers/pci/dwc/pci-dra7xx.c b/drivers/pci/dwc/pci-dra7xx.c
index aeeab74..38b0c9a 100644
--- a/drivers/pci/dwc/pci-dra7xx.c
+++ b/drivers/pci/dwc/pci-dra7xx.c
@@ -114,22 +114,22 @@ static int dra7xx_pcie_establish_link(struct dra7xx_pcie 
*dra7xx)
}
 
if (dra7xx->link_gen == 1) {
-   dw_pcie_cfg_read(pp->dbi_base + exp_cap_off + PCI_EXP_LNKCAP,
-4, );
+   dw_pcie_read(pp->dbi_base + exp_cap_off + PCI_EXP_LNKCAP,
+4, );
if ((reg & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB) {
reg &= ~((u32)PCI_EXP_LNKCAP_SLS);
reg |= PCI_EXP_LNKCAP_SLS_2_5GB;
-   dw_pcie_cfg_write(pp->dbi_base + exp_cap_off +
- PCI_EXP_LNKCAP, 4, reg);
+   dw_pcie_write(pp->dbi_base + exp_cap_off +
+ PCI_EXP_LNKCAP, 4, reg);
}
 
-   dw_pcie_cfg_read(pp->dbi_base + exp_cap_off + PCI_EXP_LNKCTL2,
-2, );
+   dw_pcie_read(pp->dbi_base + exp_cap_off + PCI_EXP_LNKCTL2,
+2, );
if ((reg & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB) {
reg &= ~((u32)PCI_EXP_LNKCAP_SLS);
reg |= PCI_EXP_LNKCAP_SLS_2_5GB;
-   dw_pcie_cfg_write(pp->dbi_base + exp_cap_off +
- PCI_EXP_LNKCTL2, 2, reg);
+   dw_pcie_write(pp->dbi_base + exp_cap_off +
+ PCI_EXP_LNKCTL2, 2, reg);
}
}
 
diff --git a/drivers/pci/dwc/pci-exynos.c b/drivers/pci/dwc/pci-exynos.c
index c179e7a..e3fbff4 100644
--- a/drivers/pci/dwc/pci-exynos.c
+++ b/drivers/pci/dwc/pci-exynos.c
@@ -429,7 +429,7 @@ static int exynos_pcie_rd_own_conf(struct pcie_port *pp, 
int where, int size,
int ret;
 
exynos_pcie_sideband_dbi_r_mode(exynos_pcie, true);
-   ret = dw_pcie_cfg_read(pp->dbi_base + where, size, val);
+   ret = dw_pcie_read(pp->dbi_base + where, size, val);
exynos_pcie_sideband_dbi_r_mode(exynos_pcie, false);
return ret;
 }
@@ -441,7 +441,7 @@ static int exynos_pcie_wr_own_conf(struct pcie_port *pp, 
int where, int size,
int ret;
 
exynos_pcie_sideband_dbi_w_mode(exynos_pcie, true);
-   ret = dw_pcie_cfg_write(pp->dbi_base + where, size, val);
+   ret = dw_pcie_write(pp->dbi_base + where, size, val);
exynos_pcie_sideband_dbi_w_mode(exynos_pcie, false);
return ret;
 }
diff --git a/drivers/pci/dwc/pci-keystone-dw.c 
b/drivers/pci/dwc/pci-keystone-dw.c
index 9397c46..4875334 100644
--- a/drivers/pci/dwc/pci-keystone-dw.c
+++ b/drivers/pci/dwc/pci-keystone-dw.c
@@ -444,7 +444,7 @@ int ks_dw_pcie_rd_other_conf(struct pcie_port *pp, struct 
pci_bus *bus,
 
addr = ks_pcie_cfg_setup(ks_pcie, bus_num, devfn);
 
-   return dw_pcie_cfg_read(addr + where, size, val);
+   return dw_pcie_read(addr + where, size, val);
 }
 
 int ks_dw_pcie_wr_other_conf(struct pcie_port *pp, struct pci_bus *bus,
@@ -456,7 +456,7 @@ int ks_dw_pcie_wr_other_conf(struct pcie_port *pp, struct 
pci_bus *bus,
 
addr = ks_pcie_cfg_setup(ks_pcie, bus_num, devfn);
 
-   return dw_pcie_cfg_write(addr + where, size, val);
+   return dw_pcie_write(addr + where, size, val);
 }
 
 /**
diff --git a/drivers/pci/dwc/pcie-designware.c 
b/drivers/pci/dwc/pcie-designware.c
index 0b928dc..d0ea310 100644
--- a/drivers/pci/dwc/pcie-designware.c
+++ b/drivers/pci/dwc/pcie-designware.c
@@ -33,7 +33,7 @@
 
 static struct pci_ops dw_pcie_ops;
 
-int dw_pcie_cfg_read(void __iomem *addr, int size, u32 *val)
+int dw_pcie_read(void __iomem *addr, int size, u32 *val)
 {
if ((uintptr_t)addr & (size - 1)) {
*val = 0;
@@ -54,7 +54,7 @@ int dw_pcie_cfg_read(void __iomem *addr, int size,

[PATCH 11/37] PCI: dwc: Split pcie-designware.c into host and core files

2017-01-12 Thread Kishon Vijay Abraham I

Split pcie-designware.c into pcie-designware-host.c that contains
the host specific parts of the driver and pcie-designware.c that
contains the parts used by both host driver and endpoint driver.

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/Makefile   |2 +-
 drivers/pci/dwc/pcie-designware-host.c |  619 
 drivers/pci/dwc/pcie-designware.c  |  613 +--
 drivers/pci/dwc/pcie-designware.h  |8 +
 4 files changed, 634 insertions(+), 608 deletions(-)
 create mode 100644 drivers/pci/dwc/pcie-designware-host.c

diff --git a/drivers/pci/dwc/Makefile b/drivers/pci/dwc/Makefile
index 7d27c14..3b57e55 100644
--- a/drivers/pci/dwc/Makefile
+++ b/drivers/pci/dwc/Makefile
@@ -1,4 +1,4 @@
-obj-$(CONFIG_PCIE_DW) += pcie-designware.o
+obj-$(CONFIG_PCIE_DW) += pcie-designware.o pcie-designware-host.o
 obj-$(CONFIG_PCIE_DW_PLAT) += pcie-designware-plat.o
 obj-$(CONFIG_PCI_DRA7XX) += pci-dra7xx.o
 obj-$(CONFIG_PCI_EXYNOS) += pci-exynos.o
diff --git a/drivers/pci/dwc/pcie-designware-host.c 
b/drivers/pci/dwc/pcie-designware-host.c
new file mode 100644
index 000..e7eb653
--- /dev/null
+++ b/drivers/pci/dwc/pcie-designware-host.c
@@ -0,0 +1,619 @@
+/*
+ * Synopsys Designware PCIe host controller driver
+ *
+ * Copyright (C) 2013 Samsung Electronics Co., Ltd.
+ * http://www.samsung.com
+ *
+ * Author: Jingoo Han 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "pcie-designware.h"
+
+static struct pci_ops dw_pcie_ops;
+
+static int dw_pcie_rd_own_conf(struct pcie_port *pp, int where, int size,
+  u32 *val)
+{
+   struct dw_pcie *pci;
+
+   if (pp->ops->rd_own_conf)
+   return pp->ops->rd_own_conf(pp, where, size, val);
+
+   pci = to_dw_pcie_from_pp(pp);
+   return dw_pcie_read(pci->dbi_base + where, size, val);
+}
+
+static int dw_pcie_wr_own_conf(struct pcie_port *pp, int where, int size,
+  u32 val)
+{
+   struct dw_pcie *pci;
+
+   if (pp->ops->wr_own_conf)
+   return pp->ops->wr_own_conf(pp, where, size, val);
+
+   pci = to_dw_pcie_from_pp(pp);
+   return dw_pcie_write(pci->dbi_base + where, size, val);
+}
+
+static struct irq_chip dw_msi_irq_chip = {
+   .name = "PCI-MSI",
+   .irq_enable = pci_msi_unmask_irq,
+   .irq_disable = pci_msi_mask_irq,
+   .irq_mask = pci_msi_mask_irq,
+   .irq_unmask = pci_msi_unmask_irq,
+};
+
+/* MSI int handler */
+irqreturn_t dw_handle_msi_irq(struct pcie_port *pp)
+{
+   unsigned long val;
+   int i, pos, irq;
+   irqreturn_t ret = IRQ_NONE;
+
+   for (i = 0; i < MAX_MSI_CTRLS; i++) {
+   dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_STATUS + i * 12, 4,
+   (u32 *));
+   if (val) {
+   ret = IRQ_HANDLED;
+   pos = 0;
+   while ((pos = find_next_bit(, 32, pos)) != 32) {
+   irq = irq_find_mapping(pp->irq_domain,
+  i * 32 + pos);
+   dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_STATUS +
+   i * 12, 4, 1 << pos);
+   generic_handle_irq(irq);
+   pos++;
+   }
+   }
+   }
+
+   return ret;
+}
+
+void dw_pcie_msi_init(struct pcie_port *pp)
+{
+   u64 msi_target;
+
+   pp->msi_data = __get_free_pages(GFP_KERNEL, 0);
+   msi_target = virt_to_phys((void *)pp->msi_data);
+
+   /* program the msi_data */
+   dw_pcie_wr_own_conf(pp, PCIE_MSI_ADDR_LO, 4,
+   (u32)(msi_target & 0x));
+   dw_pcie_wr_own_conf(pp, PCIE_MSI_ADDR_HI, 4,
+   (u32)(msi_target >> 32 & 0x));
+}
+
+static void dw_pcie_msi_clear_irq(struct pcie_port *pp, int irq)
+{
+   unsigned int res, bit, val;
+
+   res = (irq / 32) * 12;
+   bit = irq % 32;
+   dw_pcie_rd_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, );
+   val &= ~(1 << bit);
+   dw_pcie_wr_own_conf(pp, PCIE_MSI_INTR0_ENABLE + res, 4, val);
+}
+
+static void clear_irq_range(struct pcie_port *pp, unsigned int irq_base,
+   unsigned int nvec, unsigned int pos)
+{
+   unsigned int i;
+
+   for (i = 0; i < nvec; i++) {
+   irq_set_msi_desc_off(irq_base, i, NULL);
+   /* Disable corresponding interrupt on MSI controller */
+   if (pp->ops->msi_clear_irq)
+   pp->ops->msi_clear_irq(pp, pos + i);
+   else
+

[PATCH 22/37] PCI: dwc: Modify dbi accessors to access data of 4/2/1 bytes

2017-01-12 Thread Kishon Vijay Abraham I

Previously dbi accessors can be used to access data of size 4
bytes. But there might be situations (like accessing
MSI_MESSAGE_CONTROL in order to set/get the number of required
MSI interrupts in EP mode) where dbi accessors must
be used to access data of size 2. This is in preparation for
adding endpoint mode support to designware driver.

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/pci-dra7xx.c   |8 ++--
 drivers/pci/dwc/pci-exynos.c   |   16 +++
 drivers/pci/dwc/pci-imx6.c |   58 +++
 drivers/pci/dwc/pci-keystone-dw.c  |   13 +++---
 drivers/pci/dwc/pcie-armada8k.c|   38 +++
 drivers/pci/dwc/pcie-artpec6.c |6 +--
 drivers/pci/dwc/pcie-designware-host.c |   16 +++
 drivers/pci/dwc/pcie-designware.c  |   79 +++-
 drivers/pci/dwc/pcie-designware.h  |   14 +++---
 drivers/pci/dwc/pcie-hisi.c|   14 +++---
 10 files changed, 140 insertions(+), 122 deletions(-)

diff --git a/drivers/pci/dwc/pci-dra7xx.c b/drivers/pci/dwc/pci-dra7xx.c
index 76d0b40..8a1fccd 100644
--- a/drivers/pci/dwc/pci-dra7xx.c
+++ b/drivers/pci/dwc/pci-dra7xx.c
@@ -498,9 +498,9 @@ static int dra7xx_pcie_suspend(struct device *dev)
u32 val;
 
/* clear MSE */
-   val = dw_pcie_readl_dbi(pci, base, PCI_COMMAND);
+   val = dw_pcie_read_dbi(pci, base, PCI_COMMAND, 0x4);
val &= ~PCI_COMMAND_MEMORY;
-   dw_pcie_writel_dbi(pci, base, PCI_COMMAND, val);
+   dw_pcie_write_dbi(pci, base, PCI_COMMAND, 0x4, val);
 
return 0;
 }
@@ -513,9 +513,9 @@ static int dra7xx_pcie_resume(struct device *dev)
u32 val;
 
/* set MSE */
-   val = dw_pcie_readl_dbi(pci, base, PCI_COMMAND);
+   val = dw_pcie_read_dbi(pci, base, PCI_COMMAND, 0x4);
val |= PCI_COMMAND_MEMORY;
-   dw_pcie_writel_dbi(pci, base, PCI_COMMAND, val);
+   dw_pcie_write_dbi(pci, base, PCI_COMMAND, 0x4, val);
 
return 0;
 }
diff --git a/drivers/pci/dwc/pci-exynos.c b/drivers/pci/dwc/pci-exynos.c
index a109cf0..f6beb05 100644
--- a/drivers/pci/dwc/pci-exynos.c
+++ b/drivers/pci/dwc/pci-exynos.c
@@ -405,25 +405,25 @@ static void exynos_pcie_enable_interrupts(struct 
exynos_pcie *exynos_pcie)
exynos_pcie_msi_init(exynos_pcie);
 }
 
-static u32 exynos_pcie_readl_dbi(struct dw_pcie *pci, void __iomem *base,
-u32 reg)
+static u32 exynos_pcie_read_dbi(struct dw_pcie *pci, void __iomem *base,
+   u32 reg, int size)
 {
struct exynos_pcie *exynos_pcie = to_exynos_pcie(pci);
u32 val;
 
exynos_pcie_sideband_dbi_r_mode(exynos_pcie, true);
-   val = readl(base + reg);
+   dw_pcie_read(base + reg, size, );
exynos_pcie_sideband_dbi_r_mode(exynos_pcie, false);
return val;
 }
 
-static void exynos_pcie_writel_dbi(struct dw_pcie *pci, void __iomem *base,
-  u32 reg, u32 val)
+static void exynos_pcie_write_dbi(struct dw_pcie *pci, void __iomem *base,
+ u32 reg, int size, u32 val)
 {
struct exynos_pcie *exynos_pcie = to_exynos_pcie(pci);
 
exynos_pcie_sideband_dbi_w_mode(exynos_pcie, true);
-   writel(val, base + reg);
+   dw_pcie_write(base + reg, size, val);
exynos_pcie_sideband_dbi_w_mode(exynos_pcie, false);
 }
 
@@ -530,8 +530,8 @@ static int __init exynos_add_pcie_port(struct exynos_pcie 
*exynos_pcie,
 }
 
 static const struct dw_pcie_ops dw_pcie_ops = {
-   .readl_dbi = exynos_pcie_readl_dbi,
-   .writel_dbi = exynos_pcie_writel_dbi,
+   .read_dbi = exynos_pcie_read_dbi,
+   .write_dbi = exynos_pcie_write_dbi,
.link_up = exynos_pcie_link_up,
 };
 
diff --git a/drivers/pci/dwc/pci-imx6.c b/drivers/pci/dwc/pci-imx6.c
index ecc8690..08ebe62 100644
--- a/drivers/pci/dwc/pci-imx6.c
+++ b/drivers/pci/dwc/pci-imx6.c
@@ -104,7 +104,7 @@ static int pcie_phy_poll_ack(struct imx6_pcie *imx6_pcie, 
int exp_val)
u32 wait_counter = 0;
 
do {
-   val = dw_pcie_readl_dbi(pci, base, PCIE_PHY_STAT);
+   val = dw_pcie_read_dbi(pci, base, PCIE_PHY_STAT, 0x4);
val = (val >> PCIE_PHY_STAT_ACK_LOC) & 0x1;
wait_counter++;
 
@@ -125,17 +125,17 @@ static int pcie_phy_wait_ack(struct imx6_pcie *imx6_pcie, 
int addr)
int ret;
 
val = addr << PCIE_PHY_CTRL_DATA_LOC;
-   dw_pcie_writel_dbi(pci, base, PCIE_PHY_CTRL, val);
+   dw_pcie_write_dbi(pci, base, PCIE_PHY_CTRL, 0x4, val);
 
val |= (0x1 << PCIE_PHY_CTRL_CAP_ADR_LOC);
-   dw_pcie_writel_dbi(pci, base, PCIE_PHY_CTRL, val);
+   dw_pcie_write_dbi(pci, base, PCIE_PHY_CTRL, 0x4, val);
 
ret = pcie_phy_poll_ack(imx6_pcie, 1);
if (ret)
return ret;
 
val = addr << PCIE_PHY_CTRL_DATA_LOC;
-   dw_pcie_writel_dbi(pci, base,

[PATCH 19/37] PCI: endpoint: functions: Add an EP function to test PCI

2017-01-12 Thread Kishon Vijay Abraham I

This adds a new endpoint function driver (to program the virtual
test device) making use of the EP-core library.

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/endpoint/Kconfig  |2 +
 drivers/pci/endpoint/Makefile |3 +-
 drivers/pci/endpoint/functions/Kconfig|   12 +
 drivers/pci/endpoint/functions/Makefile   |5 +
 drivers/pci/endpoint/functions/pci-epf-test.c |  513 +
 5 files changed, 534 insertions(+), 1 deletion(-)
 create mode 100644 drivers/pci/endpoint/functions/Kconfig
 create mode 100644 drivers/pci/endpoint/functions/Makefile
 create mode 100644 drivers/pci/endpoint/functions/pci-epf-test.c

diff --git a/drivers/pci/endpoint/Kconfig b/drivers/pci/endpoint/Kconfig
index 930e87a..4195481 100644
--- a/drivers/pci/endpoint/Kconfig
+++ b/drivers/pci/endpoint/Kconfig
@@ -20,4 +20,6 @@ config PCI_ENDPOINT
 
   If in doubt, say "N" to disable Endpoint support.
 
+source "drivers/pci/endpoint/functions/Kconfig"
+
 endmenu
diff --git a/drivers/pci/endpoint/Makefile b/drivers/pci/endpoint/Makefile
index a599c18..cebe3d0 100644
--- a/drivers/pci/endpoint/Makefile
+++ b/drivers/pci/endpoint/Makefile
@@ -3,4 +3,5 @@
 #
 
 obj-$(CONFIG_PCI_ENDPOINT) := pci-epc-core.o pci-epf-core.o\
-  pci-epc-mem.o pci-ep-cfs.o
+  pci-epc-mem.o pci-ep-cfs.o   \
+  functions/
diff --git a/drivers/pci/endpoint/functions/Kconfig 
b/drivers/pci/endpoint/functions/Kconfig
new file mode 100644
index 000..175edad
--- /dev/null
+++ b/drivers/pci/endpoint/functions/Kconfig
@@ -0,0 +1,12 @@
+#
+# PCI Endpoint Functions
+#
+
+config PCI_EPF_TEST
+   tristate "PCI Endpoint Test driver"
+   depends on PCI_ENDPOINT
+   help
+  Enable this configuration option to enable the test driver
+  for PCI Endpoint.
+
+  If in doubt, say "N" to disable Endpoint test driver.
diff --git a/drivers/pci/endpoint/functions/Makefile 
b/drivers/pci/endpoint/functions/Makefile
new file mode 100644
index 000..53c120e
--- /dev/null
+++ b/drivers/pci/endpoint/functions/Makefile
@@ -0,0 +1,5 @@
+#
+# Makefile for PCI Endpoint Functions
+#
+
+obj-$(CONFIG_PCI_EPF_TEST) := pci-epf-test.o
diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c 
b/drivers/pci/endpoint/functions/pci-epf-test.c
new file mode 100644
index 000..bbac323
--- /dev/null
+++ b/drivers/pci/endpoint/functions/pci-epf-test.c
@@ -0,0 +1,513 @@
+/**
+ * Test driver to test endpoint functionality
+ *
+ * Copyright (C) 2017 Texas Instruments
+ * Author: Kishon Vijay Abraham I 
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 of
+ * the License as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#define COMMAND_RAISE_LEGACY_IRQ   BIT(0)
+#define COMMAND_RAISE_MSI_IRQ  BIT(1)
+#define MSI_NUMBER_SHIFT   2
+#define MSI_NUMBER_MASK(0x3f << MSI_NUMBER_SHIFT)
+#define COMMAND_READ   BIT(8)
+#define COMMAND_WRITE  BIT(9)
+#define COMMAND_COPY   BIT(10)
+
+#define STATUS_READ_SUCCESSBIT(0)
+#define STATUS_READ_FAIL   BIT(1)
+#define STATUS_WRITE_SUCCESS   BIT(2)
+#define STATUS_WRITE_FAIL  BIT(3)
+#define STATUS_COPY_SUCCESSBIT(4)
+#define STATUS_COPY_FAIL   BIT(5)
+#define STATUS_IRQ_RAISED  BIT(6)
+#define STATUS_SRC_ADDR_INVALIDBIT(7)
+#define STATUS_DST_ADDR_INVALIDBIT(8)
+
+#define TIMER_RESOLUTION   1
+
+static struct workqueue_struct *kpcitest_workqueue;
+
+struct pci_epf_test {
+   void*reg[6];
+   struct pci_epf  *epf;
+   struct delayed_work cmd_handler;
+};
+
+struct pci_epf_test_reg {
+   u32 magic;
+   u32 command;
+   u32 status;
+   u64 src_addr;
+   u64 dst_addr;
+   u32 size;
+   u32 checksum;
+} __packed;
+
+static struct pci_epf_header test_header = {
+   .vendorid   = PCI_ANY_ID,
+   .deviceid   = PCI_ANY_ID,
+   .baseclass_code = PCI_CLASS_OTHERS,
+   .interrupt_pin  = PCI_INTERRUPT_INTA,
+};
+
+static int bar_size[] = { 512, 1024,

[PATCH 16/37] PCI: endpoint: Introduce configfs entry for configuring EP functions

2017-01-12 Thread Kishon Vijay Abraham I

Introduce a new configfs entry to configure the EP function (like
configuring the standard configuration header entries) and to
bind the EP function with EP controller.

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/endpoint/Kconfig  |4 +-
 drivers/pci/endpoint/Makefile |2 +-
 drivers/pci/endpoint/pci-ep-cfs.c |  427 +
 3 files changed, 431 insertions(+), 2 deletions(-)
 create mode 100644 drivers/pci/endpoint/pci-ep-cfs.c

diff --git a/drivers/pci/endpoint/Kconfig b/drivers/pci/endpoint/Kconfig
index 7eb1c79..930e87a 100644
--- a/drivers/pci/endpoint/Kconfig
+++ b/drivers/pci/endpoint/Kconfig
@@ -14,7 +14,9 @@ config PCI_ENDPOINT
 
   Enabling this option will build the endpoint library, which
   includes endpoint controller library and endpoint function
-  library.
+  library. This will also enable the configfs entry required to
+  configure the endpoint function and used to bind the
+  function with a endpoint controller.
 
   If in doubt, say "N" to disable Endpoint support.
 
diff --git a/drivers/pci/endpoint/Makefile b/drivers/pci/endpoint/Makefile
index eeef1b7..a599c18 100644
--- a/drivers/pci/endpoint/Makefile
+++ b/drivers/pci/endpoint/Makefile
@@ -3,4 +3,4 @@
 #
 
 obj-$(CONFIG_PCI_ENDPOINT) := pci-epc-core.o pci-epf-core.o\
-  pci-epc-mem.o
+  pci-epc-mem.o pci-ep-cfs.o
diff --git a/drivers/pci/endpoint/pci-ep-cfs.c 
b/drivers/pci/endpoint/pci-ep-cfs.c
new file mode 100644
index 000..ed0f8c2
--- /dev/null
+++ b/drivers/pci/endpoint/pci-ep-cfs.c
@@ -0,0 +1,427 @@
+/**
+ * configfs to configure the PCI endpoint
+ *
+ * Copyright (C) 2017 Texas Instruments
+ * Author: Kishon Vijay Abraham I 
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 of
+ * the License as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+struct pci_epf_info {
+   struct config_group group;
+   struct list_head list;
+   struct pci_epf *epf;
+};
+
+struct pci_ep_info {
+   struct config_group group;
+   struct config_group pci_epf_group;
+   /* mutex to protect pci_epf list */
+   struct mutex lock;
+   struct list_head pci_epf;
+   const char *epc_name;
+   struct pci_epc *epc;
+};
+
+static inline struct pci_epf_info *to_pci_epf_info(struct config_item *item)
+{
+   return container_of(to_config_group(item), struct pci_epf_info, group);
+}
+
+static inline struct pci_ep_info *to_pci_ep_info(struct config_item *item)
+{
+   return container_of(to_config_group(item), struct pci_ep_info, group);
+}
+
+#define PCI_EPF_HEADER_R(_name)
   \
+static ssize_t pci_epf_##_name##_show(struct config_item *item,char 
*page)\
+{ \
+   struct pci_epf *epf = to_pci_epf_info(item)->epf;  \
+   if (!epf->header) {\
+   WARN_ON_ONCE("epf device not bound to function driver\n"); \
+   return 0;  \
+   }  \
+   return sprintf(page, "0x%04x\n", epf->header->_name);  \
+}
+
+#define PCI_EPF_HEADER_W_u32(_name)   \
+static ssize_t pci_epf_##_name##_store(struct config_item *item,  \
+  const char *page, size_t len)   \
+{ \
+   u32 val;   \
+   int ret;   \
+   struct pci_epf *epf = to_pci_epf_info(item)->epf;  \
+   if (!epf->header) {\
+   WARN_ON_ONCE("epf device not bound to function driver\n"); \
+   return 0;  \
+   }  \
+   ret = kstrtou32(page, 0, );\
+

[PATCH 15/37] Documentation: PCI: Guide to use PCI Endpoint Core Layer

2017-01-12 Thread Kishon Vijay Abraham I

Add Documentation to help users use endpoint library to enable endpoint
mode in the PCI controller and add new PCI endpoint functions.

Signed-off-by: Kishon Vijay Abraham I 
---
 Documentation/PCI/00-INDEX  |2 +
 Documentation/PCI/endpoint/pci-endpoint.txt |  190 +++
 2 files changed, 192 insertions(+)
 create mode 100644 Documentation/PCI/endpoint/pci-endpoint.txt

diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
index 147231f..ba950b2 100644
--- a/Documentation/PCI/00-INDEX
+++ b/Documentation/PCI/00-INDEX
@@ -12,3 +12,5 @@ pci.txt
- info on the PCI subsystem for device driver authors
 pcieaer-howto.txt
- the PCI Express Advanced Error Reporting Driver Guide HOWTO
+endpoint/pci-endpoint.txt
+   - guide to add endpoint controller driver and endpoint function driver.
diff --git a/Documentation/PCI/endpoint/pci-endpoint.txt 
b/Documentation/PCI/endpoint/pci-endpoint.txt
new file mode 100644
index 000..68a7839
--- /dev/null
+++ b/Documentation/PCI/endpoint/pci-endpoint.txt
@@ -0,0 +1,190 @@
+   PCI ENDPOINT FRAMEWORK
+   Kishon Vijay Abraham I 
+
+This document is a guide to use the PCI Endpoint Framework in order to create
+endpoint controller driver, endpoint function driver and using configfs
+interface to bind the function driver to the controller driver.
+
+1. Introduction
+
+*Linux* has a comprehensive PCI subsystem to support PCI controllers that
+operates in Root Complex mode. The subsystem has capability to scan PCI bus,
+assign memory resources and irq resources, load PCI driver (based on
+vendorid, deviceid), support other services like hot-plug, power management,
+advanced error reporting and virtual channels.
+
+However PCI controller IPs integrated in certain SoC is capable of operating
+either in Root Complex mode or Endpoint mode. PCI Endpoint Framework will
+add endpoint mode support in *Linux*. This will help to run Linux in an
+EP system which can have a wide variety of use cases from testing or
+validation, co-processor accelerator etc..
+
+2. PCI Endpoint Core
+
+The PCI Endpoint Core layer comprises of 3 components: the Endpoint Controller
+library, the Endpoint Function library and the configfs layer to bind the
+endpoint function with the endpoint controller.
+
+2.1 PCI Endpoint Controller(EPC) Library
+
+The EPC library provides APIs to be used by the controller that can operate
+in endpoint mode. It also provides APIs to be used by function driver/library
+in order to implement a particular endpoint function.
+
+2.1.1 APIs for the PCI controller Driver
+
+This section lists the APIs that the PCI Endpoint core provides to be used
+by the PCI controller driver.
+
+*) devm_pci_epc_create()/pci_epc_create()
+
+   The PCI controller driver should implement the following ops:
+* write_header: ops to populate configuration space header
+* set_bar: ops to configure the BAR
+* clear_bar: ops to reset the BAR
+* alloc_addr_space: ops to allocate *in* PCI controller address space
+* free_addr_space: ops to free the allocated address space
+* raise_irq: ops to raise a legacy or MSI interrupt
+* start: ops to start the PCI link
+* stop: ops to stop the PCI link
+
+   The PCI controller driver can then create a new EPC device by invoking
+   devm_pci_epc_create/pci_epc_create.
+
+*) devm_pci_epc_destroy()/pci_epc_destroy()
+
+   The PCI controller driver can destroy the EPC device created by either
+   devm_pci_epc_create or pci_epc_create using devm_pci_epc_destroy() or
+   /pci_epc_destroy()
+
+2.1.2 APIs for the PCI Endpoint Function Driver
+
+This section lists the APIs that the PCI Endpoint core provides to be used
+by the PCI endpoint function driver.
+
+*) pci_epc_write_header()
+
+   The PCI endpoint function driver should use pci_epc_write_header() to
+   write the standard configuration header to the endpoint controller.
+
+*) pci_epc_set_bar()
+
+   The PCI endpoint function driver should use pci_epc_set_bar() to configure
+   the Base Address Register in order for the host to assign PCI addr space.
+   Register space of the function driver is usually configured
+   using this API.
+
+*) pci_epc_clear_bar()
+
+   The PCI endpoint function driver should use pci_epc_clear_bar() to reset
+   the BAR.
+
+*) pci_epc_raise_irq()
+
+   The PCI endpoint function driver should use pci_epc_raise_irq() to raise
+   Legacy Interrupt or MSI Interrupt.
+
+*) pci_epc_start()
+
+   The PCI endpoint function driver should invoke pci_epc_start() once it
+   has configured the endpoint function and wants to start the PCI link.
+
+*) pci_epc_stop()
+
+   The PCI endpoint function driver should invoke pci_epc_stop() to stop
+   the PCI LINK.
+
+2.1.3 Other APIs
+
+There are other APIs provided by the EPC library. These are used for binding
+the epf device with epc device. pci-ep-cfs.c

[PATCH 04/37] PCI: dwc: designware: Move the register defines to designware header file

2017-01-12 Thread Kishon Vijay Abraham I

No functional change. Move the register defines and other macros from
pcie-designware.c to pcie-designware.h. This is in preparation to
split the pcie-designware.c file into designware core file and host
specific file.

While at that also fix a checkpatch warning.

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/pcie-designware.c |   70 
 drivers/pci/dwc/pcie-designware.h |   71 +
 2 files changed, 71 insertions(+), 70 deletions(-)

diff --git a/drivers/pci/dwc/pcie-designware.c 
b/drivers/pci/dwc/pcie-designware.c
index d68bc7b..0b928dc 100644
--- a/drivers/pci/dwc/pcie-designware.c
+++ b/drivers/pci/dwc/pcie-designware.c
@@ -25,76 +25,6 @@
 
 #include "pcie-designware.h"
 
-/* Parameters for the waiting for link up routine */
-#define LINK_WAIT_MAX_RETRIES  10
-#define LINK_WAIT_USLEEP_MIN   9
-#define LINK_WAIT_USLEEP_MAX   10
-
-/* Parameters for the waiting for iATU enabled routine */
-#define LINK_WAIT_MAX_IATU_RETRIES 5
-#define LINK_WAIT_IATU_MIN 9000
-#define LINK_WAIT_IATU_MAX 1
-
-/* Synopsys-specific PCIe configuration registers */
-#define PCIE_PORT_LINK_CONTROL 0x710
-#define PORT_LINK_MODE_MASK(0x3f << 16)
-#define PORT_LINK_MODE_1_LANES (0x1 << 16)
-#define PORT_LINK_MODE_2_LANES (0x3 << 16)
-#define PORT_LINK_MODE_4_LANES (0x7 << 16)
-#define PORT_LINK_MODE_8_LANES (0xf << 16)
-
-#define PCIE_LINK_WIDTH_SPEED_CONTROL  0x80C
-#define PORT_LOGIC_SPEED_CHANGE(0x1 << 17)
-#define PORT_LOGIC_LINK_WIDTH_MASK (0x1f << 8)
-#define PORT_LOGIC_LINK_WIDTH_1_LANES  (0x1 << 8)
-#define PORT_LOGIC_LINK_WIDTH_2_LANES  (0x2 << 8)
-#define PORT_LOGIC_LINK_WIDTH_4_LANES  (0x4 << 8)
-#define PORT_LOGIC_LINK_WIDTH_8_LANES  (0x8 << 8)
-
-#define PCIE_MSI_ADDR_LO   0x820
-#define PCIE_MSI_ADDR_HI   0x824
-#define PCIE_MSI_INTR0_ENABLE  0x828
-#define PCIE_MSI_INTR0_MASK0x82C
-#define PCIE_MSI_INTR0_STATUS  0x830
-
-#define PCIE_ATU_VIEWPORT  0x900
-#define PCIE_ATU_REGION_INBOUND(0x1 << 31)
-#define PCIE_ATU_REGION_OUTBOUND   (0x0 << 31)
-#define PCIE_ATU_REGION_INDEX2 (0x2 << 0)
-#define PCIE_ATU_REGION_INDEX1 (0x1 << 0)
-#define PCIE_ATU_REGION_INDEX0 (0x0 << 0)
-#define PCIE_ATU_CR1   0x904
-#define PCIE_ATU_TYPE_MEM  (0x0 << 0)
-#define PCIE_ATU_TYPE_IO   (0x2 << 0)
-#define PCIE_ATU_TYPE_CFG0 (0x4 << 0)
-#define PCIE_ATU_TYPE_CFG1 (0x5 << 0)
-#define PCIE_ATU_CR2   0x908
-#define PCIE_ATU_ENABLE(0x1 << 31)
-#define PCIE_ATU_BAR_MODE_ENABLE   (0x1 << 30)
-#define PCIE_ATU_LOWER_BASE0x90C
-#define PCIE_ATU_UPPER_BASE0x910
-#define PCIE_ATU_LIMIT 0x914
-#define PCIE_ATU_LOWER_TARGET  0x918
-#define PCIE_ATU_BUS(x)(((x) & 0xff) << 24)
-#define PCIE_ATU_DEV(x)(((x) & 0x1f) << 19)
-#define PCIE_ATU_FUNC(x)   (((x) & 0x7) << 16)
-#define PCIE_ATU_UPPER_TARGET  0x91C
-
-/*
- * iATU Unroll-specific register definitions
- * From 4.80 core version the address translation will be made by unroll
- */
-#define PCIE_ATU_UNR_REGION_CTRL1  0x00
-#define PCIE_ATU_UNR_REGION_CTRL2  0x04
-#define PCIE_ATU_UNR_LOWER_BASE0x08
-#define PCIE_ATU_UNR_UPPER_BASE0x0C
-#define PCIE_ATU_UNR_LIMIT 0x10
-#define PCIE_ATU_UNR_LOWER_TARGET  0x14
-#define PCIE_ATU_UNR_UPPER_TARGET  0x18
-
-/* Register address builder */
-#define PCIE_GET_ATU_OUTB_UNR_REG_OFFSET(region)  ((0x3 << 20) | (region << 9))
-
 /* PCIe Port Logic registers */
 #define PLR_OFFSET 0x700
 #define PCIE_PHY_DEBUG_R1  (PLR_OFFSET + 0x2c)
diff --git a/drivers/pci/dwc/pcie-designware.h 
b/drivers/pci/dwc/pcie-designware.h
index 32f4602..a6cf9262 100644
--- a/drivers/pci/dwc/pcie-designware.h
+++ b/drivers/pci/dwc/pcie-designware.h
@@ -14,6 +14,77 @@
 #ifndef _PCIE_DESIGNWARE_H
 #define _PCIE_DESIGNWARE_H
 
+/* Parameters for the waiting for link up routine */
+#define LINK_WAIT_MAX_RETRIES  10
+#define LINK_WAIT_USLEEP_MIN   9
+#define LINK_WAIT_USLEEP_MAX   10
+
+/* Parameters for the waiting for iATU enabled routine */
+#define LINK_WAIT_MAX_IATU_RETRIES 5
+#define LINK_WAIT_IATU_MIN 9000
+#define LINK_WAIT_IATU_MAX 1
+
+/* Synopsys-specific PCIe configuration registers */
+#define PCIE_PORT_LINK_CONTROL 0x710
+#define PORT_LINK_MODE_MASK(0x3f << 16)
+#define PORT_LINK_MODE_1_LANES (0x1 << 16)
+#define PORT_LINK_MODE_2_LANES (0x3 << 16)
+#define PORT_LINK_MODE_4_LANES (0x7 << 16)
+#define PORT_LINK_MODE_8_LANES

[PATCH 12/37] PCI: dwc: Create a new config symbol to enable pci dwc host

2017-01-12 Thread Kishon Vijay Abraham I

Now that pci designware host has a separate file, create a new
config symbol to select the host only driver. This is in preparation
to enable endpoint support to designware driver.

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/Kconfig   |   26 +++---
 drivers/pci/dwc/Makefile  |3 ++-
 drivers/pci/dwc/pcie-designware.h |   29 +
 3 files changed, 42 insertions(+), 16 deletions(-)

diff --git a/drivers/pci/dwc/Kconfig b/drivers/pci/dwc/Kconfig
index 8b08519..d0bdfb5 100644
--- a/drivers/pci/dwc/Kconfig
+++ b/drivers/pci/dwc/Kconfig
@@ -3,13 +3,17 @@ menu "DesignWare PCI Core Support"
 
 config PCIE_DW
bool
+
+config PCIE_DW_HOST
+bool
depends on PCI_MSI_IRQ_DOMAIN
+select PCIE_DW
 
 config PCI_DRA7XX
bool "TI DRA7xx PCIe controller"
depends on OF && HAS_IOMEM && TI_PIPE3
depends on PCI_MSI_IRQ_DOMAIN
-   select PCIE_DW
+   select PCIE_DW_HOST
help
 Enables support for the PCIe controller in the DRA7xx SoC.  There
 are two instances of PCIe controller in DRA7xx.  This controller can
@@ -18,7 +22,7 @@ config PCI_DRA7XX
 config PCIE_DW_PLAT
bool "Platform bus based DesignWare PCIe Controller"
depends on PCI_MSI_IRQ_DOMAIN
-   select PCIE_DW
+   select PCIE_DW_HOST
---help---
 This selects the DesignWare PCIe controller support. Select this if
 you have a PCIe controller on Platform bus.
@@ -32,21 +36,21 @@ config PCI_EXYNOS
depends on SOC_EXYNOS5440 || COMPILE_TEST
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
-   select PCIE_DW
+   select PCIE_DW_HOST
 
 config PCI_IMX6
bool "Freescale i.MX6 PCIe controller"
depends on SOC_IMX6Q || COMPILE_TEST
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
-   select PCIE_DW
+   select PCIE_DW_HOST
 
 config PCIE_SPEAR13XX
bool "STMicroelectronics SPEAr PCIe controller"
depends on ARCH_SPEAR13XX || COMPILE_TEST
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
-   select PCIE_DW
+   select PCIE_DW_HOST
help
  Say Y here if you want PCIe support on SPEAr13XX SoCs.
 
@@ -55,7 +59,7 @@ config PCI_KEYSTONE
depends on ARCH_KEYSTONE || COMPILE_TEST
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
-   select PCIE_DW
+   select PCIE_DW_HOST
help
  Say Y here if you want to enable PCI controller support on Keystone
  SoCs. The PCI controller on Keystone is based on Designware hardware
@@ -67,7 +71,7 @@ config PCI_LAYERSCAPE
depends on OF && (ARM || ARCH_LAYERSCAPE || COMPILE_TEST)
depends on PCI_MSI_IRQ_DOMAIN
select MFD_SYSCON
-   select PCIE_DW
+   select PCIE_DW_HOST
help
  Say Y here if you want PCIe controller support on Layerscape SoCs.
 
@@ -76,7 +80,7 @@ config PCI_HISI
bool "HiSilicon Hip05 and Hip06 SoCs PCIe controllers"
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
-   select PCIE_DW
+   select PCIE_DW_HOST
help
  Say Y here if you want PCIe controller support on HiSilicon
  Hip05 and Hip06 SoCs
@@ -86,7 +90,7 @@ config PCIE_QCOM
depends on (ARCH_QCOM || COMPILE_TEST) && OF
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
-   select PCIE_DW
+   select PCIE_DW_HOST
help
  Say Y here to enable PCIe controller support on Qualcomm SoCs. The
  PCIe controller uses the Designware core plus Qualcomm-specific
@@ -97,7 +101,7 @@ config PCIE_ARMADA_8K
depends on ARCH_MVEBU || COMPILE_TEST
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
-   select PCIE_DW
+   select PCIE_DW_HOST
help
  Say Y here if you want to enable PCIe controller support on
  Armada-8K SoCs. The PCIe controller on Armada-8K is based on
@@ -109,7 +113,7 @@ config PCIE_ARTPEC6
depends on MACH_ARTPEC6 || COMPILE_TEST
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
-   select PCIE_DW
+   select PCIE_DW_HOST
help
  Say Y here to enable PCIe controller support on Axis ARTPEC-6
  SoCs.  This PCIe controller uses the DesignWare core.
diff --git a/drivers/pci/dwc/Makefile b/drivers/pci/dwc/Makefile
index 3b57e55..a2df13c 100644
--- a/drivers/pci/dwc/Makefile
+++ b/drivers/pci/dwc/Makefile
@@ -1,4 +1,5 @@
-obj-$(CONFIG_PCIE_DW) += pcie-designware.o pcie-designware-host.o
+obj-$(CONFIG_PCIE_DW) += pcie-designware.o
+obj-$(CONFIG_PCIE_DW_HOST) += pcie-designware-host.o
 obj-$(CONFIG_PCIE_DW_PLAT) += pcie-designware-plat.o
 obj-$(CONFIG_PCI_DRA7XX) += pci-dra7xx.o
 obj-$(CONFIG_PCI_EXYNOS) += pci-exynos.o
diff --git a/drivers/pci/dwc/pcie-designware.h 
b/drivers/pci/dwc/pcie-designware.h
index 808d17b..8f3dcb2 100644
---

[PATCH 28/37] dt-bindings: PCI: dra7xx: Add dt bindings for pci dra7xx EP mode

2017-01-12 Thread Kishon Vijay Abraham I

Add device tree binding documentation for pci dra7xx EP mode.

Signed-off-by: Kishon Vijay Abraham I 
---
 Documentation/devicetree/bindings/pci/ti-pci.txt |   37 ++
 1 file changed, 30 insertions(+), 7 deletions(-)

diff --git a/Documentation/devicetree/bindings/pci/ti-pci.txt 
b/Documentation/devicetree/bindings/pci/ti-pci.txt
index 60e2516..62f5f59 100644
--- a/Documentation/devicetree/bindings/pci/ti-pci.txt
+++ b/Documentation/devicetree/bindings/pci/ti-pci.txt
@@ -1,17 +1,22 @@
 TI PCI Controllers
 
 PCIe Designware Controller
- - compatible: Should be "ti,dra7-pcie""
- - reg : Two register ranges as listed in the reg-names property
- - reg-names : The first entry must be "ti-conf" for the TI specific registers
-  The second entry must be "rc-dbics" for the designware pcie
-  registers
-  The third entry must be "config" for the PCIe configuration space
+ - compatible: Should be "ti,dra7-pcie" for RC
+  Should be "ti,dra7-pcie-ep" for EP
  - phys : list of PHY specifiers (used by generic PHY framework)
  - phy-names : must be "pcie-phy0", "pcie-phy1", "pcie-phyN".. based on the
   number of PHYs as specified in *phys* property.
  - ti,hwmods : Name of the hwmod associated to the pcie, "pcie",
   where  is the instance number of the pcie from the HW spec.
+ - num-lanes as specified in ../designware-pcie.txt
+
+HOST MODE
+=
+ - reg : Two register ranges as listed in the reg-names property
+ - reg-names : The first entry must be "ti-conf" for the TI specific registers
+  The second entry must be "rc-dbics" for the designware pcie
+  registers
+  The third entry must be "config" for the PCIe configuration space
  - interrupts : Two interrupt entries must be specified. The first one is for
main interrupt line and the second for MSI interrupt line.
  - #address-cells,
@@ -19,13 +24,31 @@ PCIe Designware Controller
#interrupt-cells,
device_type,
ranges,
-   num-lanes,
interrupt-map-mask,
interrupt-map : as specified in ../designware-pcie.txt
 
+DEVICE MODE
+===
+ - reg : Four register ranges as listed in the reg-names property
+ - reg-names : "ti-conf" for the TI specific registers
+  "ep_dbics" for the standard configuration registers as
+   they are locally accessed within the DIF CS space
+  "ep_dbics2" for the standard configuration registers as
+   they are locally accessed within the DIF CS2 space
+  "addr_space" used to map remote RC address space
+ - interrupts : one interrupt entries must be specified for main interrupt.
+ - num-ib-windows : number of inbound address translation windows
+ - num-ob-windows : number of outbound address translation windows
+
 Optional Property:
  - gpios : Should be added if a gpio line is required to drive PERST# line
 
+NOTE: Two dt nodes should be added for each PCI controller; one for host
+mode and another for device mode. So in order for PCI to
+work in host mode, EP mode dt node should be disabled and in order to PCI to
+work in EP mode, host mode dt node should be disabled. And host mode and EP
+mode are mutually exclusive.
+
 Example:
 axi {
compatible = "simple-bus";
-- 
1.7.9.5

[PATCH 30/37] dt-bindings: PCI: dra7xx: Add dt bindings to enable legacy mode

2017-01-12 Thread Kishon Vijay Abraham I

Update device tree binding documentation of TI's dra7xx PCI
controller to include property for enabling legacy mode.

Signed-off-by: Kishon Vijay Abraham I 
---
 Documentation/devicetree/bindings/pci/ti-pci.txt |4 
 1 file changed, 4 insertions(+)

diff --git a/Documentation/devicetree/bindings/pci/ti-pci.txt 
b/Documentation/devicetree/bindings/pci/ti-pci.txt
index 62f5f59..ed85e8e 100644
--- a/Documentation/devicetree/bindings/pci/ti-pci.txt
+++ b/Documentation/devicetree/bindings/pci/ti-pci.txt
@@ -39,6 +39,10 @@ DEVICE MODE
  - interrupts : one interrupt entries must be specified for main interrupt.
  - num-ib-windows : number of inbound address translation windows
  - num-ob-windows : number of outbound address translation windows
+ - syscon-legacy-mode: phandle to the syscon dt node. The 1st argument should
+  contain the register offset within syscon and the 2nd
+  argument should contain the bit field for setting the
+  legacy mode
 
 Optional Property:
  - gpios : Should be added if a gpio line is required to drive PERST# line
-- 
1.7.9.5

[PATCH 36/37] ARM: DRA7: clockdomain: Change the CLKTRCTRL of CM_PCIE_CLKSTCTRL to SW_WKUP

2017-01-12 Thread Kishon Vijay Abraham I

The PCIe programming sequence in TRM suggests CLKSTCTRL of PCIe should
be set to SW_WKUP. There are no issues when CLKSTCTRL is set to HW_AUTO
in RC mode. However in EP mode, the host system is not able to access the
MEMSPACE and setting the CLKSTCTRL to SW_WKUP fixes it.

Signed-off-by: Kishon Vijay Abraham I 
---
 arch/arm/mach-omap2/clockdomains7xx_data.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mach-omap2/clockdomains7xx_data.c 
b/arch/arm/mach-omap2/clockdomains7xx_data.c
index 6c67965..67ebff8 100644
--- a/arch/arm/mach-omap2/clockdomains7xx_data.c
+++ b/arch/arm/mach-omap2/clockdomains7xx_data.c
@@ -524,7 +524,7 @@
.dep_bit  = DRA7XX_PCIE_STATDEP_SHIFT,
.wkdep_srcs   = pcie_wkup_sleep_deps,
.sleepdep_srcs= pcie_wkup_sleep_deps,
-   .flags= CLKDM_CAN_HWSUP_SWSUP,
+   .flags= CLKDM_CAN_SWSUP,
 };
 
 static struct clockdomain atl_7xx_clkdm = {
-- 
1.7.9.5

[PATCH 35/37] MAINTAINERS: add PCI EP maintainer

2017-01-12 Thread Kishon Vijay Abraham I

Add maintainer for the newly introduced PCI EP framework.

Signed-off-by: Kishon Vijay Abraham I 
---
 MAINTAINERS |9 +
 1 file changed, 9 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8672f18..021f676 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9407,6 +9407,15 @@ F:   include/linux/pci*
 F: arch/x86/pci/
 F: arch/x86/kernel/quirks.c
 
+PCI EP SUBSYSTEM
+M: Kishon Vijay Abraham I 
+L: linux-...@vger.kernel.org
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git
+S: Supported
+F: drivers/pci/endpoint/
+F: drivers/misc/pci_endpoint_test.c
+F: tools/pci/
+
 PCI DRIVER FOR ALTERA PCIE IP
 M: Ley Foon Tan 
 L: r...@lists.rocketboards.org (moderated for non-subscribers)
-- 
1.7.9.5

[PATCH 21/37] PCI: dwc: Modify dbi accessors to take dbi_base as argument

2017-01-12 Thread Kishon Vijay Abraham I

dwc has 2 dbi address space labelled dbics and dbics2. The existing
helper to access dbi address space can access only dbics. However
dbics2 has to be accessed for programming the BAR registers in the
case of EP mode. This is in preparation for adding EP mode support
to dwc driver.

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/pci-dra7xx.c   |   10 +++--
 drivers/pci/dwc/pci-exynos.c   |   10 +++--
 drivers/pci/dwc/pci-imx6.c |   67 
 drivers/pci/dwc/pci-keystone-dw.c  |   15 ---
 drivers/pci/dwc/pcie-armada8k.c|   39 +---
 drivers/pci/dwc/pcie-artpec6.c |7 +--
 drivers/pci/dwc/pcie-designware-host.c |   17 +++
 drivers/pci/dwc/pcie-designware.c  |   76 ++--
 drivers/pci/dwc/pcie-designware.h  |   10 +++--
 drivers/pci/dwc/pcie-hisi.c|   17 ---
 10 files changed, 153 insertions(+), 115 deletions(-)

diff --git a/drivers/pci/dwc/pci-dra7xx.c b/drivers/pci/dwc/pci-dra7xx.c
index 3c525b0..76d0b40 100644
--- a/drivers/pci/dwc/pci-dra7xx.c
+++ b/drivers/pci/dwc/pci-dra7xx.c
@@ -494,12 +494,13 @@ static int dra7xx_pcie_suspend(struct device *dev)
 {
struct dra7xx_pcie *dra7xx = dev_get_drvdata(dev);
struct dw_pcie *pci = dra7xx->pci;
+   void __iomem *base = pci->dbi_base;
u32 val;
 
/* clear MSE */
-   val = dw_pcie_readl_dbi(pci, PCI_COMMAND);
+   val = dw_pcie_readl_dbi(pci, base, PCI_COMMAND);
val &= ~PCI_COMMAND_MEMORY;
-   dw_pcie_writel_dbi(pci, PCI_COMMAND, val);
+   dw_pcie_writel_dbi(pci, base, PCI_COMMAND, val);
 
return 0;
 }
@@ -508,12 +509,13 @@ static int dra7xx_pcie_resume(struct device *dev)
 {
struct dra7xx_pcie *dra7xx = dev_get_drvdata(dev);
struct dw_pcie *pci = dra7xx->pci;
+   void __iomem *base = pci->dbi_base;
u32 val;
 
/* set MSE */
-   val = dw_pcie_readl_dbi(pci, PCI_COMMAND);
+   val = dw_pcie_readl_dbi(pci, base, PCI_COMMAND);
val |= PCI_COMMAND_MEMORY;
-   dw_pcie_writel_dbi(pci, PCI_COMMAND, val);
+   dw_pcie_writel_dbi(pci, base, PCI_COMMAND, val);
 
return 0;
 }
diff --git a/drivers/pci/dwc/pci-exynos.c b/drivers/pci/dwc/pci-exynos.c
index 0295ec9..a109cf0 100644
--- a/drivers/pci/dwc/pci-exynos.c
+++ b/drivers/pci/dwc/pci-exynos.c
@@ -405,23 +405,25 @@ static void exynos_pcie_enable_interrupts(struct 
exynos_pcie *exynos_pcie)
exynos_pcie_msi_init(exynos_pcie);
 }
 
-static u32 exynos_pcie_readl_dbi(struct dw_pcie *pci, u32 reg)
+static u32 exynos_pcie_readl_dbi(struct dw_pcie *pci, void __iomem *base,
+u32 reg)
 {
struct exynos_pcie *exynos_pcie = to_exynos_pcie(pci);
u32 val;
 
exynos_pcie_sideband_dbi_r_mode(exynos_pcie, true);
-   val = readl(pci->dbi_base + reg);
+   val = readl(base + reg);
exynos_pcie_sideband_dbi_r_mode(exynos_pcie, false);
return val;
 }
 
-static void exynos_pcie_writel_dbi(struct dw_pcie *pci, u32 reg, u32 val)
+static void exynos_pcie_writel_dbi(struct dw_pcie *pci, void __iomem *base,
+  u32 reg, u32 val)
 {
struct exynos_pcie *exynos_pcie = to_exynos_pcie(pci);
 
exynos_pcie_sideband_dbi_w_mode(exynos_pcie, true);
-   writel(val, pci->dbi_base + reg);
+   writel(val, base + reg);
exynos_pcie_sideband_dbi_w_mode(exynos_pcie, false);
 }
 
diff --git a/drivers/pci/dwc/pci-imx6.c b/drivers/pci/dwc/pci-imx6.c
index 70fa380..ecc8690 100644
--- a/drivers/pci/dwc/pci-imx6.c
+++ b/drivers/pci/dwc/pci-imx6.c
@@ -98,12 +98,13 @@ struct imx6_pcie {
 static int pcie_phy_poll_ack(struct imx6_pcie *imx6_pcie, int exp_val)
 {
struct dw_pcie *pci = imx6_pcie->pci;
+   void __iomem *base = pci->dbi_base;
u32 val;
u32 max_iterations = 10;
u32 wait_counter = 0;
 
do {
-   val = dw_pcie_readl_dbi(pci, PCIE_PHY_STAT);
+   val = dw_pcie_readl_dbi(pci, base, PCIE_PHY_STAT);
val = (val >> PCIE_PHY_STAT_ACK_LOC) & 0x1;
wait_counter++;
 
@@ -119,21 +120,22 @@ static int pcie_phy_poll_ack(struct imx6_pcie *imx6_pcie, 
int exp_val)
 static int pcie_phy_wait_ack(struct imx6_pcie *imx6_pcie, int addr)
 {
struct dw_pcie *pci = imx6_pcie->pci;
+   void __iomem *base = pci->dbi_base;
u32 val;
int ret;
 
val = addr << PCIE_PHY_CTRL_DATA_LOC;
-   dw_pcie_writel_dbi(pci, PCIE_PHY_CTRL, val);
+   dw_pcie_writel_dbi(pci, base, PCIE_PHY_CTRL, val);
 
val |= (0x1 << PCIE_PHY_CTRL_CAP_ADR_LOC);
-   dw_pcie_writel_dbi(pci, PCIE_PHY_CTRL, val);
+   dw_pcie_writel_dbi(pci, base, PCIE_PHY_CTRL, val);
 
ret = pcie_phy_poll_ack(imx6_pcie, 1);
if (ret)
return ret;
 
val = addr << PCIE_PHY_CTRL_DATA_LOC;
-

[PATCH 07/37] PCI: dwc: designware: Get device pointer at the start of dw_pcie_host_init

2017-01-12 Thread Kishon Vijay Abraham I

No functional change. Get device pointer at the beginning of
dw_pcie_host_init instead of getting it all over dw_pcie_host_init.
This is in preparation for splitting struct pcie_port into host and
core structures (Once split pcie_port will not have device pointer).

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/pcie-designware.c |   33 +
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/drivers/pci/dwc/pcie-designware.c 
b/drivers/pci/dwc/pcie-designware.c
index d0ea310..330596b 100644
--- a/drivers/pci/dwc/pcie-designware.c
+++ b/drivers/pci/dwc/pcie-designware.c
@@ -449,8 +449,9 @@ static u8 dw_pcie_iatu_unroll_enabled(struct pcie_port *pp)
 
 int dw_pcie_host_init(struct pcie_port *pp)
 {
-   struct device_node *np = pp->dev->of_node;
-   struct platform_device *pdev = to_platform_device(pp->dev);
+   struct device *dev = pp->dev;
+   struct device_node *np = dev->of_node;
+   struct platform_device *pdev = to_platform_device(dev);
struct pci_bus *bus, *child;
struct resource *cfg_res;
int i, ret;
@@ -464,14 +465,14 @@ int dw_pcie_host_init(struct pcie_port *pp)
pp->cfg0_base = cfg_res->start;
pp->cfg1_base = cfg_res->start + pp->cfg0_size;
} else if (!pp->va_cfg0_base) {
-   dev_err(pp->dev, "missing *config* reg space\n");
+   dev_err(dev, "missing *config* reg space\n");
}
 
ret = of_pci_get_host_bridge_resources(np, 0, 0xff, , >io_base);
if (ret)
return ret;
 
-   ret = devm_request_pci_bus_resources(>dev, );
+   ret = devm_request_pci_bus_resources(dev, );
if (ret)
goto error;
 
@@ -481,7 +482,7 @@ int dw_pcie_host_init(struct pcie_port *pp)
case IORESOURCE_IO:
ret = pci_remap_iospace(win->res, pp->io_base);
if (ret) {
-   dev_warn(pp->dev, "error %d: failed to map 
resource %pR\n",
+   dev_warn(dev, "error %d: failed to map resource 
%pR\n",
 ret, win->res);
resource_list_destroy_entry(win);
} else {
@@ -511,10 +512,10 @@ int dw_pcie_host_init(struct pcie_port *pp)
}
 
if (!pp->dbi_base) {
-   pp->dbi_base = devm_ioremap(pp->dev, pp->cfg->start,
+   pp->dbi_base = devm_ioremap(dev, pp->cfg->start,
resource_size(pp->cfg));
if (!pp->dbi_base) {
-   dev_err(pp->dev, "error with ioremap\n");
+   dev_err(dev, "error with ioremap\n");
ret = -ENOMEM;
goto error;
}
@@ -523,20 +524,20 @@ int dw_pcie_host_init(struct pcie_port *pp)
pp->mem_base = pp->mem->start;
 
if (!pp->va_cfg0_base) {
-   pp->va_cfg0_base = devm_ioremap(pp->dev, pp->cfg0_base,
+   pp->va_cfg0_base = devm_ioremap(dev, pp->cfg0_base,
pp->cfg0_size);
if (!pp->va_cfg0_base) {
-   dev_err(pp->dev, "error with ioremap in function\n");
+   dev_err(dev, "error with ioremap in function\n");
ret = -ENOMEM;
goto error;
}
}
 
if (!pp->va_cfg1_base) {
-   pp->va_cfg1_base = devm_ioremap(pp->dev, pp->cfg1_base,
+   pp->va_cfg1_base = devm_ioremap(dev, pp->cfg1_base,
pp->cfg1_size);
if (!pp->va_cfg1_base) {
-   dev_err(pp->dev, "error with ioremap\n");
+   dev_err(dev, "error with ioremap\n");
ret = -ENOMEM;
goto error;
}
@@ -552,11 +553,11 @@ int dw_pcie_host_init(struct pcie_port *pp)
 
if (IS_ENABLED(CONFIG_PCI_MSI)) {
if (!pp->ops->msi_host_init) {
-   pp->irq_domain = irq_domain_add_linear(pp->dev->of_node,
+   pp->irq_domain = irq_domain_add_linear(dev->of_node,
MAX_MSI_IRQS, _domain_ops,
_pcie_msi_chip);
if (!pp->irq_domain) {
-   dev_err(pp->dev, "irq domain init failed\n");
+   dev_err(dev, "irq domain init failed\n");
ret = -ENXIO;
goto error;
}
@@ -575,12 +576,12 @@ int dw_pcie_host_init(struct pcie_port *pp)
 
pp->root_bus_nr = pp->busn->start;
if (IS_ENABLED(CONFIG_PCI_MSI)) {
-   bus = pci_scan_root_bus_msi(pp->dev,

[PATCH 17/37] Documentation: PCI: Guide to use pci endpoint configfs

2017-01-12 Thread Kishon Vijay Abraham I

Add Documentation to help users use pci endpoint to configure
pci endpoint function and to bind the endpoint function
with endpoint controller.

Signed-off-by: Kishon Vijay Abraham I 
---
 Documentation/PCI/00-INDEX  |2 +
 Documentation/PCI/endpoint/pci-endpoint-cfs.txt |   84 +++
 2 files changed, 86 insertions(+)
 create mode 100644 Documentation/PCI/endpoint/pci-endpoint-cfs.txt

diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
index ba950b2..f84a23c 100644
--- a/Documentation/PCI/00-INDEX
+++ b/Documentation/PCI/00-INDEX
@@ -14,3 +14,5 @@ pcieaer-howto.txt
- the PCI Express Advanced Error Reporting Driver Guide HOWTO
 endpoint/pci-endpoint.txt
- guide to add endpoint controller driver and endpoint function driver.
+endpoint/pci-endpoint-cfs.txt
+   - guide to use configfs to configure the pci endpoint function.
diff --git a/Documentation/PCI/endpoint/pci-endpoint-cfs.txt 
b/Documentation/PCI/endpoint/pci-endpoint-cfs.txt
new file mode 100644
index 000..b1f1613
--- /dev/null
+++ b/Documentation/PCI/endpoint/pci-endpoint-cfs.txt
@@ -0,0 +1,84 @@
+   CONFIGURING PCI ENDPOINT USING CONFIGFS
+Kishon Vijay Abraham I 
+
+The PCI Endpoint Core exposes configfs entry (pci_ep) in order to configure the
+PCI endpoint function and in order to bind the endpoint function
+with the endpoint controller. (For introducing other mechanisms to
+configure the PCI Endpoint Function refer [1]).
+
+*) Mounting configfs
+
+The PCI Endpoint Core layer creates pci_ep directory in the mounted configfs
+directory. configfs can be mounted using the following command.
+
+   mount -t configfs none /sys/kernel/config
+
+*) Directory Structure
+
+The pci_ep configfs directory structure has been created to reflect the
+natural tree like structure of PCI devices. So every directory created
+inside pci_ep represents a EPC device and every directory created inside
+epf directory represents EPF device.
+
+/sys/kernel/config/pci_ep/
+| / --> [2]
+   | epc
+   | epf/
+| / --> [3]
+   | vendorid
+   | deviceid
+   | revid
+   | progif_code
+   | subclass_code
+   | baseclass_code
+   | cache_line_size
+   | subsys_vendor_id
+   | subsys_id
+   | interrupt_pin
+   | function
+
+*) Creating configfs entry for EPC
+
+Any directory created inside *pci_ep* represents an EPC device. In the above
+directory structure [2] represents an EPC device. It consists of
+
+   *) epc: Use it to associate the configfs entry to an actual EPC device.
+   The list of valid entries for this field can be obtained from
+   ls /sys/class/pci_epc/
+
+   *) epf: Directory that contains all the endpoint functions. The name
+   of the created directory determines the driver this particular
+   epf device will be bound to. The name can be obtained either
+   from the function binding documentation [4] or
+   ls /sys/bus/pci-epf/drivers
+
+   If more than one endpoint function device has to be bound to
+   the same driver, then the directory should be created using
+   the following notation
+   mkdir .
+
+*) Creating configfs entry for EPF
+
+Any directory created inside *epf* directory represents an EPF device. In the
+above directory structure, [3] represents an EPF device. It consists of the
+following entries that can be used to configure the standard configuration
+header of the endpoint function. (These entries are created by the
+framework when any new directory is created inside epf directory.)
+
+| vendorid
+| deviceid
+| revid
+| progif_code
+| subclass_code
+| baseclass_code
+| cache_line_size
+| subsys_vendor_id
+| subsys_id
+| interrupt_pin
+
+The following entry identifies the function driver that is bound to the
+function device
+   | function
+
+[1] -> Documentation/PCI/endpoint/pci-endpoint.txt
+[4] -> Documentation/PCI/endpoint/function/binding/
-- 
1.7.9.5

[PATCH 18/37] Documentation: PCI: Add specification for the pci test function device

2017-01-12 Thread Kishon Vijay Abraham I

Add specification for the *pci test* virtual function device. The endpoint
function driver and the host pci driver should be created based on this
specification.

Signed-off-by: Kishon Vijay Abraham I 
---
 Documentation/PCI/00-INDEX   |2 +
 Documentation/PCI/endpoint/pci-test-function.txt |   66 ++
 2 files changed, 68 insertions(+)
 create mode 100644 Documentation/PCI/endpoint/pci-test-function.txt

diff --git a/Documentation/PCI/00-INDEX b/Documentation/PCI/00-INDEX
index f84a23c..4e5a283 100644
--- a/Documentation/PCI/00-INDEX
+++ b/Documentation/PCI/00-INDEX
@@ -16,3 +16,5 @@ endpoint/pci-endpoint.txt
- guide to add endpoint controller driver and endpoint function driver.
 endpoint/pci-endpoint-cfs.txt
- guide to use configfs to configure the pci endpoint function.
+endpoint/pci-test-function.txt
+   - specification of *pci test* function device.
diff --git a/Documentation/PCI/endpoint/pci-test-function.txt 
b/Documentation/PCI/endpoint/pci-test-function.txt
new file mode 100644
index 000..1324376
--- /dev/null
+++ b/Documentation/PCI/endpoint/pci-test-function.txt
@@ -0,0 +1,66 @@
+   PCI TEST
+   Kishon Vijay Abraham I 
+
+Traditionally PCI RC has always been validated by using standard
+PCI cards like ethernet PCI cards or USB PCI cards or SATA PCI cards.
+However with the addition of EP-core in linux kernel, it is possible
+to configure a PCI controller that can operate in EP mode to work as
+a test device.
+
+The PCI endpoint test device is a virtual device (defined in software)
+used to test the endpoint functionality and serve as a sample driver
+for other PCI endpoint devices (to use the EP framework).
+
+The PCI endpoint test device has the following registers:
+
+   1) PCI_ENDPOINT_TEST_MAGIC
+   2) PCI_ENDPOINT_TEST_COMMAND
+   3) PCI_ENDPOINT_TEST_STATUS
+   4) PCI_ENDPOINT_TEST_SRC_ADDR
+   5) PCI_ENDPOINT_TEST_DST_ADDR
+   6) PCI_ENDPOINT_TEST_SIZE
+   7) PCI_ENDPOINT_TEST_CHECKSUM
+
+*) PCI_ENDPOINT_TEST_MAGIC
+
+This register will be used to test BAR0. A known pattern will be written
+and read back from MAGIC register to verify BAR0.
+
+*) PCI_ENDPOINT_TEST_COMMAND:
+
+This register will be used by the host driver to indicate the function
+that the endpoint device must perform.
+
+Bitfield Description:
+  Bit 0: raise legacy irq
+  Bit 1: raise MSI irq
+  Bit 2 - 7: MSI interrupt number
+  Bit 8: read command (read data from RC buffer)
+  Bit 9: write command (write data to RC buffer)
+  Bit 10   : copy command (copy data from one RC buffer to another
+ RC buffer)
+
+*) PCI_ENDPOINT_TEST_STATUS
+
+This register reflects the status of the PCI endpoint device.
+
+Bitfield Description:
+  Bit 0: read success
+  Bit 1: read fail
+  Bit 2: write success
+  Bit 3: write fail
+  Bit 4: copy success
+  Bit 5: copy fail
+  Bit 6: irq raised
+  Bit 7: source address is invalid
+  Bit 8: destination address is invalid
+
+*) PCI_ENDPOINT_TEST_SRC_ADDR
+
+This register contains the source address (RC buffer address) for the
+COPY/READ command.
+
+*) PCI_ENDPOINT_TEST_DST_ADDR
+
+This register contains the destination address (RC buffer address) for
+the COPY/WRITE command.
-- 
1.7.9.5

[PATCH 29/37] PCI: dwc: dra7xx: Workaround for errata id i870

2017-01-12 Thread Kishon Vijay Abraham I

According to errata i870, access to the PCIe slave port
that are not 32-bit aligned will result in incorrect mapping
to TLP Address and Byte enable fields.

Accessing non 32-bit aligned data causes incorrect data in the target
buffer if memcpy is used. Implement the workaround for this
errata here.

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/pci-dra7xx.c |   50 ++
 1 file changed, 50 insertions(+)

diff --git a/drivers/pci/dwc/pci-dra7xx.c b/drivers/pci/dwc/pci-dra7xx.c
index 333aa56..7666e3e 100644
--- a/drivers/pci/dwc/pci-dra7xx.c
+++ b/drivers/pci/dwc/pci-dra7xx.c
@@ -26,6 +26,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "pcie-designware.h"
 
@@ -531,6 +533,48 @@ static int dra7xx_pcie_enable_phy(struct dra7xx_pcie 
*dra7xx)
{},
 };
 
+/*
+ * dra7xx_pcie_ep_legacy_mode: workaround for AM572x/AM571x Errata i870
+ * @dra7xx: the dra7xx device where the workaround should be applied
+ *
+ * Access to the PCIe slave port that are not 32-bit aligned will result
+ * in incorrect mapping to TLP Address and Byte enable fields. Therefore,
+ * byte and half-word accesses are not possible to byte offset 0x1, 0x2, or
+ * 0x3.
+ *
+ * To avoid this issue set PCIE_SS1_AXI2OCP_LEGACY_MODE_ENABLE to 1.
+ */
+static int dra7xx_pcie_ep_legacy_mode(struct device *dev)
+{
+   int ret;
+   struct device_node *np = dev->of_node;
+   struct regmap *regmap;
+   unsigned int reg;
+   unsigned int field;
+
+   regmap = syscon_regmap_lookup_by_phandle(np, "syscon-legacy-mode");
+   if (IS_ERR(regmap)) {
+   dev_dbg(dev, "can't get syscon-legacy-mode\n");
+   return -EINVAL;
+   }
+
+   if (of_property_read_u32_index(np, "syscon-legacy-mode", 1, )) {
+   dev_err(dev, "couldn't get legacy mode register offset\n");
+   return -EINVAL;
+   }
+
+   if (of_property_read_u32_index(np, "syscon-legacy-mode", 2, )) {
+   dev_err(dev, "can't get bit field for setting legacy mode\n");
+   return -EINVAL;
+   }
+
+   ret = regmap_update_bits(regmap, reg, field, field);
+   if (ret)
+   dev_err(dev, "failed to set legacy mode\n");
+
+   return ret;
+}
+
 static int __init dra7xx_pcie_probe(struct platform_device *pdev)
 {
u32 reg;
@@ -643,6 +687,7 @@ static int __init dra7xx_pcie_probe(struct platform_device 
*pdev)
case DW_PCIE_RC_TYPE:
dra7xx_pcie_writel(dra7xx, PCIECTRL_TI_CONF_DEVICE_TYPE,
   DEVICE_TYPE_RC);
+
ret = dra7xx_add_pcie_port(dra7xx, pdev);
if (ret < 0)
goto err_gpio;
@@ -650,6 +695,11 @@ static int __init dra7xx_pcie_probe(struct platform_device 
*pdev)
case DW_PCIE_EP_TYPE:
dra7xx_pcie_writel(dra7xx, PCIECTRL_TI_CONF_DEVICE_TYPE,
   DEVICE_TYPE_EP);
+
+   ret = dra7xx_pcie_ep_legacy_mode(dev);
+   if (ret)
+   goto err_gpio;
+
ret = dra7xx_add_pcie_ep(dra7xx, pdev);
if (ret < 0)
goto err_gpio;
-- 
1.7.9.5

[PATCH 09/37] PCI: dwc: designware: Parse num-lanes property in dw_pcie_setup_rc

2017-01-12 Thread Kishon Vijay Abraham I

*num-lanes* dt property is parsed in dw_pcie_host_init. However
*num-lanes* property is applicable to both root complex mode and
endpoint mode. As a first step, move the parsing of this property
outside dw_pcie_host_init. This is in preparation for splitting
pcie-designware.c to pcie-designware.c and pcie-designware-host.c

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/pcie-designware.c |   18 +++---
 drivers/pci/dwc/pcie-designware.h |1 -
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/pci/dwc/pcie-designware.c 
b/drivers/pci/dwc/pcie-designware.c
index 00a0fdc..89cdb6b 100644
--- a/drivers/pci/dwc/pcie-designware.c
+++ b/drivers/pci/dwc/pcie-designware.c
@@ -551,10 +551,6 @@ int dw_pcie_host_init(struct pcie_port *pp)
}
}
 
-   ret = of_property_read_u32(np, "num-lanes", >lanes);
-   if (ret)
-   pci->lanes = 0;
-
ret = of_property_read_u32(np, "num-viewport", >num_viewport);
if (ret)
pci->num_viewport = 2;
@@ -751,18 +747,26 @@ static int dw_pcie_wr_conf(struct pci_bus *bus, u32 devfn,
 
 void dw_pcie_setup_rc(struct pcie_port *pp)
 {
+   int ret;
+   u32 lanes;
u32 val;
struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
+   struct device *dev = pci->dev;
+   struct device_node *np = dev->of_node;
 
/* get iATU unroll support */
pci->iatu_unroll_enabled = dw_pcie_iatu_unroll_enabled(pci);
dev_dbg(pci->dev, "iATU unroll: %s\n",
pci->iatu_unroll_enabled ? "enabled" : "disabled");
 
+   ret = of_property_read_u32(np, "num-lanes", );
+   if (ret)
+   lanes = 0;
+
/* set the number of lanes */
val = dw_pcie_readl_dbi(pci, PCIE_PORT_LINK_CONTROL);
val &= ~PORT_LINK_MODE_MASK;
-   switch (pci->lanes) {
+   switch (lanes) {
case 1:
val |= PORT_LINK_MODE_1_LANES;
break;
@@ -776,7 +780,7 @@ void dw_pcie_setup_rc(struct pcie_port *pp)
val |= PORT_LINK_MODE_8_LANES;
break;
default:
-   dev_err(pci->dev, "num-lanes %u: invalid value\n", pci->lanes);
+   dev_err(pci->dev, "num-lanes %u: invalid value\n", lanes);
return;
}
dw_pcie_writel_dbi(pci, PCIE_PORT_LINK_CONTROL, val);
@@ -784,7 +788,7 @@ void dw_pcie_setup_rc(struct pcie_port *pp)
/* set link width speed control register */
val = dw_pcie_readl_dbi(pci, PCIE_LINK_WIDTH_SPEED_CONTROL);
val &= ~PORT_LOGIC_LINK_WIDTH_MASK;
-   switch (pci->lanes) {
+   switch (lanes) {
case 1:
val |= PORT_LOGIC_LINK_WIDTH_1_LANES;
break;
diff --git a/drivers/pci/dwc/pcie-designware.h 
b/drivers/pci/dwc/pcie-designware.h
index d4b3d43..491fbe3 100644
--- a/drivers/pci/dwc/pcie-designware.h
+++ b/drivers/pci/dwc/pcie-designware.h
@@ -148,7 +148,6 @@ struct dw_pcie_ops {
 struct dw_pcie {
struct device   *dev;
void __iomem*dbi_base;
-   u32 lanes;
u32 num_viewport;
u8  iatu_unroll_enabled;
struct pcie_portpp;
-- 
1.7.9.5

[PATCH 25/37] dt-bindings: PCI: Add dt bindings for pci designware EP mode

2017-01-12 Thread Kishon Vijay Abraham I

Add device tree binding documentation for pci designware EP mode.

Signed-off-by: Kishon Vijay Abraham I 
---
 .../devicetree/bindings/pci/designware-pcie.txt|   26 ++--
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/Documentation/devicetree/bindings/pci/designware-pcie.txt 
b/Documentation/devicetree/bindings/pci/designware-pcie.txt
index 1392c70..b2480dd 100644
--- a/Documentation/devicetree/bindings/pci/designware-pcie.txt
+++ b/Documentation/devicetree/bindings/pci/designware-pcie.txt
@@ -6,30 +6,40 @@ Required properties:
 - reg-names: Must be "config" for the PCIe configuration space.
 (The old way of getting the configuration address space from "ranges"
 is deprecated and should be avoided.)
+- num-lanes: number of lanes to use
+RC mode:
 - #address-cells: set to <3>
 - #size-cells: set to <2>
 - device_type: set to "pci"
 - ranges: ranges for the PCI memory and I/O regions
 - #interrupt-cells: set to <1>
-- interrupt-map-mask and interrupt-map: standard PCI properties
-   to define the mapping of the PCIe interface to interrupt
+- interrupt-map-mask and interrupt-map: standard PCI
+   properties to define the mapping of the PCIe interface to interrupt
numbers.
-- num-lanes: number of lanes to use
+EP mode:
+- num-ib-windows: number of inbound address translation
+windows
+- num-ob-windows: number of outbound address translation
+windows
 
 Optional properties:
-- num-viewport: number of view ports configured in hardware.  If a platform
-  does not specify it, the driver assumes 2.
 - num-lanes: number of lanes to use (this property should be specified unless
   the link is brought already up in BIOS)
 - reset-gpio: gpio pin number of power good signal
-- bus-range: PCI bus numbers covered (it is recommended for new devicetrees to
-  specify this property, to keep backwards compatibility a range of 0x00-0xff
-  is assumed if not present)
 - clocks: Must contain an entry for each entry in clock-names.
See ../clocks/clock-bindings.txt for details.
 - clock-names: Must include the following entries:
- "pcie"
- "pcie_bus"
+RC mode:
+- num-viewport: number of view ports configured in
+  hardware. If a platform does not specify it, the driver assumes 2.
+- bus-range: PCI bus numbers covered (it is recommended
+  for new devicetrees to specify this property, to keep backwards
+  compatibility a range of 0x00-0xff is assumed if not present)
+EP mode:
+- max-functions: maximum number of functions that can be
+  configured
 
 Example configuration:
 
-- 
1.7.9.5

[PATCH 02/37] PCI: dwc: designware: Add new ops for cpu addr fixup

2017-01-12 Thread Kishon Vijay Abraham I

Some platforms (like dra7xx) require only the least 28 bits of the
corresponding 32 bit CPU address to be programmed in the address
translation unit. This modified address is stored in io_base/mem_base/
cfg0_base/cfg1_base in dra7xx_pcie_host_init. While this is okay for
host mode where the address range is fixed, device mode requires
different addresses to be programmed based on the host buffer address.
Add a new ops to get the least 28 bits of the corresponding 32 bit
CPU address and invoke it before programming the address translation
unit.

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/pcie-designware.c |3 +++
 drivers/pci/dwc/pcie-designware.h |1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/pci/dwc/pcie-designware.c 
b/drivers/pci/dwc/pcie-designware.c
index bed1999..d68bc7b 100644
--- a/drivers/pci/dwc/pcie-designware.c
+++ b/drivers/pci/dwc/pcie-designware.c
@@ -195,6 +195,9 @@ static void dw_pcie_prog_outbound_atu(struct pcie_port *pp, 
int index,
 {
u32 retries, val;
 
+   if (pp->ops->cpu_addr_fixup)
+   cpu_addr = pp->ops->cpu_addr_fixup(cpu_addr);
+
if (pp->iatu_unroll_enabled) {
dw_pcie_writel_unroll(pp, index, PCIE_ATU_UNR_LOWER_BASE,
lower_32_bits(cpu_addr));
diff --git a/drivers/pci/dwc/pcie-designware.h 
b/drivers/pci/dwc/pcie-designware.h
index a567ea2..32f4602 100644
--- a/drivers/pci/dwc/pcie-designware.h
+++ b/drivers/pci/dwc/pcie-designware.h
@@ -54,6 +54,7 @@ struct pcie_port {
 };
 
 struct pcie_host_ops {
+   u64 (*cpu_addr_fixup)(u64 cpu_addr);
u32 (*readl_rc)(struct pcie_port *pp, u32 reg);
void (*writel_rc)(struct pcie_port *pp, u32 reg, u32 val);
int (*rd_own_conf)(struct pcie_port *pp, int where, int size, u32 *val);
-- 
1.7.9.5

[PATCH 31/37] misc: Add host side pci driver for pci test function device

2017-01-12 Thread Kishon Vijay Abraham I

Add PCI endpoint test driver that can verify base address
register, legacy interrupt/MSI interrupt and read/write/copy
buffers between host and device. The corresponding pci-epf-test
function driver should be used on the EP side.

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/misc/Kconfig |7 +
 drivers/misc/Makefile|1 +
 drivers/misc/pci_endpoint_test.c |  533 ++
 include/uapi/linux/Kbuild|1 +
 include/uapi/linux/pcitest.h |   19 ++
 5 files changed, 561 insertions(+)
 create mode 100644 drivers/misc/pci_endpoint_test.c
 create mode 100644 include/uapi/linux/pcitest.h

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 64971ba..14a95a6 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -766,6 +766,13 @@ config PANEL_BOOT_MESSAGE
  An empty message will only clear the display at driver init time. Any 
other
  printf()-formatted message is valid with newline and escape codes.
 
+config PCI_ENDPOINT_TEST
+   depends on PCI || COMPILE_TEST
+   tristate "PCI Endpoint Test driver"
+   ---help---
+   Enable this configuration option to enable the host side test driver
+   for PCI Endpoint.
+
 source "drivers/misc/c2port/Kconfig"
 source "drivers/misc/eeprom/Kconfig"
 source "drivers/misc/cb710/Kconfig"
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 3198336..64a532ac2 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -53,6 +53,7 @@ obj-$(CONFIG_ECHO)+= echo/
 obj-$(CONFIG_VEXPRESS_SYSCFG)  += vexpress-syscfg.o
 obj-$(CONFIG_CXL_BASE) += cxl/
 obj-$(CONFIG_PANEL) += panel.o
+obj-$(CONFIG_PCI_ENDPOINT_TEST)+= pci_endpoint_test.o
 
 lkdtm-$(CONFIG_LKDTM)  += lkdtm_core.o
 lkdtm-$(CONFIG_LKDTM)  += lkdtm_bugs.o
diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
new file mode 100644
index 000..920b14c
--- /dev/null
+++ b/drivers/misc/pci_endpoint_test.c
@@ -0,0 +1,533 @@
+/**
+ * Host side test driver to test endpoint functionality
+ *
+ * Copyright (C) 2017 Texas Instruments
+ * Author: Kishon Vijay Abraham I 
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 of
+ * the License as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include 
+
+#define DRV_MODULE_NAME"pci-endpoint-test"
+
+#define PCI_ENDPOINT_TEST_MAGIC0x0
+
+#define PCI_ENDPOINT_TEST_COMMAND  0x4
+#define COMMAND_RAISE_LEGACY_IRQ   BIT(0)
+#define COMMAND_RAISE_MSI_IRQ  BIT(1)
+#define MSI_NUMBER_SHIFT   2
+/* 6 bits for MSI number */
+#define COMMAND_READBIT(8)
+#define COMMAND_WRITE   BIT(9)
+#define COMMAND_COPYBIT(10)
+
+#define PCI_ENDPOINT_TEST_STATUS   0x8
+#define STATUS_READ_SUCCESS BIT(0)
+#define STATUS_READ_FAILBIT(1)
+#define STATUS_WRITE_SUCCESSBIT(2)
+#define STATUS_WRITE_FAIL   BIT(3)
+#define STATUS_COPY_SUCCESS BIT(4)
+#define STATUS_COPY_FAILBIT(5)
+#define STATUS_IRQ_RAISED   BIT(6)
+#define STATUS_SRC_ADDR_INVALID BIT(7)
+#define STATUS_DST_ADDR_INVALID BIT(8)
+
+#define PCI_ENDPOINT_TEST_LOWER_SRC_ADDR   0xc
+#define PCI_ENDPOINT_TEST_UPPER_SRC_ADDR   0x10
+
+#define PCI_ENDPOINT_TEST_LOWER_DST_ADDR   0x14
+#define PCI_ENDPOINT_TEST_UPPER_DST_ADDR   0x18
+
+#define PCI_ENDPOINT_TEST_SIZE 0x1c
+#define PCI_ENDPOINT_TEST_CHECKSUM 0x20
+
+static DEFINE_IDA(pci_endpoint_test_ida);
+
+#define to_endpoint_test(priv) container_of((priv), struct pci_endpoint_test, \
+   miscdev)
+enum pci_barno {
+   BAR_0,
+   BAR_1,
+   BAR_2,
+   BAR_3,
+   BAR_4,
+   BAR_5,
+};
+
+struct pci_endpoint_test {
+   struct pci_dev  *pdev;
+   void __iomem*base;
+   void __iomem*bar[6];
+   struct completion irq_raised;
+   int last_irq;
+   /* mutex to protect the ioctls */
+   struct mutexmutex;
+   struct miscdevice miscdev;
+};
+
+static int bar_size[] = { 4, 512, 1024, 16384, 131072, 1048576 };
+
+static inline u32

[PATCH 34/37] tools: PCI: Add sample test script to invoke pcitest

2017-01-12 Thread Kishon Vijay Abraham I

Add a simple test script that invokes the pcitest userspace tool
to perform all the PCI endpoint tests (BAR tests, interrupt tests,
read tests, write tests and copy tests).

Signed-off-by: Kishon Vijay Abraham I 
---
 tools/pci/pcitest.sh |   56 ++
 1 file changed, 56 insertions(+)
 create mode 100644 tools/pci/pcitest.sh

diff --git a/tools/pci/pcitest.sh b/tools/pci/pcitest.sh
new file mode 100644
index 000..5442bbe
--- /dev/null
+++ b/tools/pci/pcitest.sh
@@ -0,0 +1,56 @@
+#!/bin/sh
+
+echo "BAR tests"
+echo
+
+bar=0
+
+while [ $bar -lt 6 ]
+do
+   pcitest -b $bar
+   bar=`expr $bar + 1`
+done
+echo
+
+echo "Interrupt tests"
+echo
+
+pcitest -l
+msi=1
+
+while [ $msi -lt 33 ]
+do
+pcitest -m $msi
+msi=`expr $msi + 1`
+done
+echo
+
+echo "Read Tests"
+echo
+
+pcitest -r -s 1
+pcitest -r -s 1024
+pcitest -r -s 1025
+pcitest -r -s 1024000
+pcitest -r -s 1024001
+echo
+
+echo "Write Tests"
+echo
+
+pcitest -w -s 1
+pcitest -w -s 1024
+pcitest -w -s 1025
+pcitest -w -s 1024000
+pcitest -w -s 1024001
+echo
+
+echo "Copy Tests"
+echo
+
+pcitest -c -s 1
+pcitest -c -s 1024
+pcitest -c -s 1025
+pcitest -c -s 1024000
+pcitest -c -s 1024001
+echo
-- 
1.7.9.5

[PATCH 37/37] ARM: dts: DRA7: Add pcie1 dt node for EP mode

2017-01-12 Thread Kishon Vijay Abraham I

Add pcie1 dt node in order for the controller to operate in
endpoint mode. However since none of the dra7 based boards have
slots configured to operate in endpoint mode, keep EP mode
disabled.

Signed-off-by: Kishon Vijay Abraham I 
---
 arch/arm/boot/dts/am572x-idk.dts|7 ++-
 arch/arm/boot/dts/am57xx-beagle-x15-common.dtsi |7 ++-
 arch/arm/boot/dts/dra7-evm.dts  |4 
 arch/arm/boot/dts/dra7.dtsi |   22 +-
 arch/arm/boot/dts/dra72-evm-common.dtsi |4 
 5 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/arch/arm/boot/dts/am572x-idk.dts b/arch/arm/boot/dts/am572x-idk.dts
index 1540f7a..2ca2839 100644
--- a/arch/arm/boot/dts/am572x-idk.dts
+++ b/arch/arm/boot/dts/am572x-idk.dts
@@ -88,6 +88,11 @@
load-gpios = < 19 GPIO_ACTIVE_LOW>;
 };
 
- {
+_rc {
+   status = "okay";
+   gpios = < 23 GPIO_ACTIVE_HIGH>;
+};
+
+_ep {
gpios = < 23 GPIO_ACTIVE_HIGH>;
 };
diff --git a/arch/arm/boot/dts/am57xx-beagle-x15-common.dtsi 
b/arch/arm/boot/dts/am57xx-beagle-x15-common.dtsi
index 78bee26..079a7e1 100644
--- a/arch/arm/boot/dts/am57xx-beagle-x15-common.dtsi
+++ b/arch/arm/boot/dts/am57xx-beagle-x15-common.dtsi
@@ -556,7 +556,12 @@
};
 };
 
- {
+_rc {
+   status = "ok";
+   gpios = < 8 GPIO_ACTIVE_LOW>;
+};
+
+_ep {
gpios = < 8 GPIO_ACTIVE_LOW>;
 };
 
diff --git a/arch/arm/boot/dts/dra7-evm.dts b/arch/arm/boot/dts/dra7-evm.dts
index 132f2be..fd0aa3a 100644
--- a/arch/arm/boot/dts/dra7-evm.dts
+++ b/arch/arm/boot/dts/dra7-evm.dts
@@ -937,3 +937,7 @@
status = "okay";
};
 };
+
+_rc {
+   status = "okay";
+};
diff --git a/arch/arm/boot/dts/dra7.dtsi b/arch/arm/boot/dts/dra7.dtsi
index addb753..bf9c668 100644
--- a/arch/arm/boot/dts/dra7.dtsi
+++ b/arch/arm/boot/dts/dra7.dtsi
@@ -272,7 +272,11 @@
#address-cells = <1>;
ranges = <0x5100 0x5100 0x3000
  0x00x2000 0x1000>;
-   pcie1: pcie@5100 {
+   /**
+* To enable PCI endpoint mode, disable the pcie1_rc
+* node and enable pcie1_ep mode.
+*/
+   pcie1_rc: pcie@5100 {
compatible = "ti,dra7-pcie";
reg = <0x5100 0x2000>, <0x51002000 0x14c>, 
<0x1000 0x2000>;
reg-names = "rc_dbics", "ti_conf", "config";
@@ -293,12 +297,28 @@
<0 0 0 2 _intc 2>,
<0 0 0 3 _intc 3>,
<0 0 0 4 _intc 4>;
+   status = "disabled";
pcie1_intc: interrupt-controller {
interrupt-controller;
#address-cells = <0>;
#interrupt-cells = <1>;
};
};
+
+   pcie1_ep: pcie_ep@5100 {
+   compatible = "ti,dra7-pcie-ep";
+   reg = <0x5100 0x28>, <0x51002000 0x14c>, 
<0x51001000 0x28>, <0x1000 0x1000>;
+   reg-names = "ep_dbics", "ti_conf", "ep_dbics2", 
"addr_space";
+   interrupts = <0 232 0x4>;
+   num-lanes = <1>;
+   num-ib-windows = <4>;
+   num-ob-windows = <16>;
+   ti,hwmods = "pcie1";
+   phys = <_phy>;
+   phy-names = "pcie-phy0";
+   syscon-legacy-mode = <_conf1 0x14 2>;
+   status = "disabled";
+   };
};
 
axi@1 {
diff --git a/arch/arm/boot/dts/dra72-evm-common.dtsi 
b/arch/arm/boot/dts/dra72-evm-common.dtsi
index e50fbee..5d9762c 100644
--- a/arch/arm/boot/dts/dra72-evm-common.dtsi
+++ b/arch/arm/boot/dts/dra72-evm-common.dtsi
@@ -545,3 +545,7 @@
status = "okay";
};
 };
+
+_rc {
+   status = "okay";
+};
-- 
1.7.9.5

[PATCH 00/37] PCI: Support for configurable PCI endpoint

2017-01-12 Thread Kishon Vijay Abraham I

The RFC series that was sent before this patch series can be found at [1].
The patches are split here so that it can be better reviewed.

This main purpose of this patch series is to
 *) add PCI endpoint core layer
 *) modifie designware/dra7xx driver to be configured in EP mode
 *) add a PCI endpoint *test* function driver and corresponding host
driver

Major Improvements from RFC:
 *) support multi-function devices (hw supported not virtual)
 *) Access host side buffers
 *) Raise MSI interrupts
 *) Add user space program to use the host side PCI driver
 *) Adapt all other users of designware to use the new design (only
compile tested. Since I have only dra7xx boards, the new design
has only been tested in dra7xx. I'd require the help of others
to test the platforms they have access to).

This patch series has been developed on top of 4.10-rc1, [2] & [3]

[1] -> https://lwn.net/Articles/700605/
[2] -> https://lkml.org/lkml/2016/12/28/34
[3] -> https://lkml.org/lkml/2017/1/11/238

I've also pushed the tree to
git://git.ti.com/linux-phy/linux-phy.git pci_ep_v1

Using PCI EPF Test:
ON THE EP SIDE:
***
/* EP function is configured using configfs */
# mount -t configfs none /sys/kernel/config

/* PCI EP core layer creates "pci_ep" entry in configfs */
# cd /sys/kernel/config/pci_ep/

/*
 * This is the 1st step in creating an endpoint function. This
 * creates the endpoint device.
 */
# mkdir dev

/*
 * dev has 2 entries. *epc* for binding a EPC device and *epf*
 * is a directory containing all the functions of the endpoint
 */
# ls dev
epc  epf

/*
 * This creates the endpoint function device *instance*. The string
 * before the . suffix will identify the driver this
 * EP function will bind to.
 * Just pci_epf_test is also valid. The . suffix is used
 * if there are multiple PCI controllers and all of them wants
 * to use the same function.
 */
# mkdir dev/epf/pci_epf_test.0

/*
 * When the above command is given, the function device will
 * also be bound to a function driver. To find the list of
 * function drivers available in the system, use the following
 * command. To create a new driver, the following can be referred
 * drivers/pci/endpoint/functions/pci-epf-test.c
 */
# ls /sys/bus/pci-epf/drivers
pci_epf_test

/* Now configure the endpoint function */
/* These are the fields that can be configured */
# ls dev/epf/pci_epf_test.0/
baseclass_codefunction  progif_code   subsys_id
cache_line_size   interrupt_pin revid subsys_vendor_id
deviceid  msi_interruptssubclass_code vendorid

/* The function driver will populate these fields with default values */
# cat dev/epf/pci_epf_test.0/vendorid 
0x

# cat dev/epf/pci_epf_test.0/interrupt_pin
0x0001

/* The user can configure any of these fields */
# echo 0x104c > dev/epf/pci_epf_test.0/vendorid
# echo 16 > dev/epf/pci_epf_test.0/msi_interrupts

/*
 * Next is binding this function driver to the controller driver. In
 * order to find the possible controller drivers that this function
 * driver can be bound to, the following sysfs entry can be used
 */
# ls /sys/class/pci_epc/
5100.pci

/* Now bind the function driver to the controller driver */
# echo "5100.pcie_ep" > epc
[  494.743487] dra7-pcie 5100.pcie: no free inbound window
[  494.749367] pci_epf_test pci_epf_test.0: failed to set BAR4
[  494.755238] dra7-pcie 5100.pcie: no free inbound window
[  494.761451] pci_epf_test pci_epf_test.0: failed to set BAR5

/*
 * the above error messages are due to non availability of free
 * inbound windows. So the function drivers in dra7xx can use
 * only 4 (BAR0..BAR3) BARs
 */

/** PCI endpoint is configured **/

ON THE HOST SIDE:
*
# ./pcitest.sh 
BAR tests

BAR0:   OKAY
BAR1:   OKAY
BAR2:   OKAY
BAR3:   OKAY
BAR4:   NOT OKAY
BAR5:   NOT OKAY

Interrupt tests

LEGACY IRQ: NOT OKAY
MSI1:   OKAY
MSI2:   OKAY
MSI3:   OKAY
MSI4:   OKAY
MSI5:   OKAY
MSI6:   OKAY
MSI7:   OKAY
MSI8:   OKAY
MSI9:   OKAY
MSI10:  OKAY
MSI11:  OKAY
MSI12:  OKAY
MSI13:  OKAY
MSI14:  OKAY
MSI15:  OKAY
MSI16:  OKAY
MSI17:  NOT OKAY
MSI18:  NOT OKAY
MSI19:  NOT OKAY
MSI20:  NOT OKAY
MSI21:  NOT OKAY
MSI22:  NOT OKAY
MSI23:  NOT OKAY
MSI24:  NOT OKAY
MSI25:  NOT OKAY
MSI26:  NOT OKAY
MSI27:  NOT OKAY
MSI28:  NOT OKAY
MSI29:  NOT OKAY
MSI30:  NOT OKAY
MSI31:  NOT OKAY
MSI32:  NOT OKAY

Read Tests

READ (  1 bytes):   OKAY
READ (   1024 bytes):   OKAY
READ (   1025 bytes):   OKAY
READ (1024000 bytes):   OKAY
READ (1024001 bytes):   OKAY

Write Tests

WRITE (  1 bytes):  OKAY
WRITE (   1024 bytes):  OKAY
WRITE (   1025

[PATCH 27/37] PCI: dwc: dra7xx: Add EP mode support

2017-01-12 Thread Kishon Vijay Abraham I

The PCIe controller integrated in dra7xx SoCs is capable of operating
in endpoint mode. Add endpoint mode support to dra7xx driver.

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/Kconfig   |   31 +-
 drivers/pci/dwc/Makefile  |4 +-
 drivers/pci/dwc/pci-dra7xx.c  |  197 ++---
 drivers/pci/dwc/pcie-designware.h |7 ++
 4 files changed, 221 insertions(+), 18 deletions(-)

diff --git a/drivers/pci/dwc/Kconfig b/drivers/pci/dwc/Kconfig
index 4cb1ba0..7932be6 100644
--- a/drivers/pci/dwc/Kconfig
+++ b/drivers/pci/dwc/Kconfig
@@ -16,14 +16,37 @@ config PCIE_DW_EP
 
 config PCI_DRA7XX
bool "TI DRA7xx PCIe controller"
-   depends on PCI
+   depends on (PCI && PCI_MSI_IRQ_DOMAIN) || PCI_ENDPOINT
depends on OF && HAS_IOMEM && TI_PIPE3
+   help
+Enables support for the PCIe controller in the DRA7xx SoC. There
+are two instances of PCIe controller in DRA7xx. This controller can
+work either as EP or RC. In order to enable host specific features
+PCI_DRA7XX_HOST must be selected and in order to enable device
+specific features PCI_DRA7XX_EP must be selected. This uses
+the Designware core.
+
+if PCI_DRA7XX
+
+config PCI_DRA7XX_HOST
+   bool "PCI DRA7xx Host Mode"
+   depends on PCI
depends on PCI_MSI_IRQ_DOMAIN
select PCIE_DW_HOST
+   default y
help
-Enables support for the PCIe controller in the DRA7xx SoC.  There
-are two instances of PCIe controller in DRA7xx.  This controller can
-act both as EP and RC.  This reuses the Designware core.
+Enables support for the PCIe controller in the DRA7xx SoC to work in
+host mode.
+
+config PCI_DRA7XX_EP
+   bool "PCI DRA7xx Endpoint Mode"
+   depends on PCI_ENDPOINT
+   select PCIE_DW_EP
+   help
+Enables support for the PCIe controller in the DRA7xx SoC to work in
+endpoint mode.
+
+endif
 
 config PCIE_DW_PLAT
bool "Platform bus based DesignWare PCIe Controller"
diff --git a/drivers/pci/dwc/Makefile b/drivers/pci/dwc/Makefile
index b38425d..f31a859 100644
--- a/drivers/pci/dwc/Makefile
+++ b/drivers/pci/dwc/Makefile
@@ -2,7 +2,9 @@ obj-$(CONFIG_PCIE_DW) += pcie-designware.o
 obj-$(CONFIG_PCIE_DW_HOST) += pcie-designware-host.o
 obj-$(CONFIG_PCIE_DW_EP) += pcie-designware-ep.o
 obj-$(CONFIG_PCIE_DW_PLAT) += pcie-designware-plat.o
-obj-$(CONFIG_PCI_DRA7XX) += pci-dra7xx.o
+ifneq ($(filter y,$(CONFIG_PCI_DRA7XX_HOST) $(CONFIG_PCI_DRA7XX_EP)),)
+obj-$(CONFIG_PCI_DRA7XX) += pci-dra7xx.o
+endif
 obj-$(CONFIG_PCI_EXYNOS) += pci-exynos.o
 obj-$(CONFIG_PCI_IMX6) += pci-imx6.o
 obj-$(CONFIG_PCIE_SPEAR13XX) += pcie-spear13xx.o
diff --git a/drivers/pci/dwc/pci-dra7xx.c b/drivers/pci/dwc/pci-dra7xx.c
index eb3a9c6..333aa56 100644
--- a/drivers/pci/dwc/pci-dra7xx.c
+++ b/drivers/pci/dwc/pci-dra7xx.c
@@ -10,12 +10,14 @@
  * published by the Free Software Foundation.
  */
 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -57,6 +59,11 @@
 #defineMSI BIT(4)
 #defineLEG_EP_INTERRUPTS (INTA | INTB | INTC | INTD)
 
+#definePCIECTRL_TI_CONF_DEVICE_TYPE0x0100
+#defineDEVICE_TYPE_EP  0x0
+#defineDEVICE_TYPE_LEG_EP  0x1
+#defineDEVICE_TYPE_RC  0x4
+
 #definePCIECTRL_DRA7XX_CONF_DEVICE_CMD 0x0104
 #defineLTSSM_EN0x1
 
@@ -66,6 +73,13 @@
 
 #define EXP_CAP_ID_OFFSET  0x70
 
+#definePCIECTRL_TI_CONF_INTX_ASSERT0x0124
+#definePCIECTRL_TI_CONF_INTX_DEASSERT  0x0128
+
+#definePCIECTRL_TI_CONF_MSI_XMT0x012c
+#define MSI_REQ_GRANT  BIT(0)
+#define MSI_VECTOR_SHIFT   7
+
 struct dra7xx_pcie {
struct dw_pcie  *pci;
void __iomem*base;  /* DT ti_conf */
@@ -73,6 +87,11 @@ struct dra7xx_pcie {
struct phy  **phy;
int link_gen;
struct irq_domain   *irq_domain;
+   enum dw_pcie_device_mode mode;
+};
+
+struct dra7xx_pcie_of_data {
+   enum dw_pcie_device_mode mode;
 };
 
 #define to_dra7xx_pcie(x)  dev_get_drvdata((x)->dev)
@@ -101,9 +120,19 @@ static int dra7xx_pcie_link_up(struct dw_pcie *pci)
return !!(reg & LINK_UP);
 }
 
-static int dra7xx_pcie_establish_link(struct dra7xx_pcie *dra7xx)
+static void dra7xx_pcie_stop_link(struct dw_pcie *pci)
 {
-   struct dw_pcie *pci = dra7xx->pci;
+   struct dra7xx_pcie *dra7xx = to_dra7xx_pcie(pci);
+   u32 reg;
+
+   reg =

[PATCH 33/37] tools: PCI: Add a userspace tool to test PCI endpoint

2017-01-12 Thread Kishon Vijay Abraham I

Add a userspace tool to invoke the ioctls exposed by the
PCI endpoint test driver to perform various PCI tests.

Signed-off-by: Kishon Vijay Abraham I 
---
 tools/pci/pcitest.c |  186 +++
 1 file changed, 186 insertions(+)
 create mode 100644 tools/pci/pcitest.c

diff --git a/tools/pci/pcitest.c b/tools/pci/pcitest.c
new file mode 100644
index 000..39b5b0b
--- /dev/null
+++ b/tools/pci/pcitest.c
@@ -0,0 +1,186 @@
+/**
+ * Userspace PCI Endpoint Test Module
+ *
+ * Copyright (C) 2017 Texas Instruments
+ * Author: Kishon Vijay Abraham I 
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 of
+ * the License as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#define BILLION 1E9
+
+static char *result[] = { "NOT OKAY", "OKAY" };
+
+struct pci_test {
+   char*device;
+   charbarnum;
+   boollegacyirq;
+   unsigned intmsinum;
+   boolread;
+   boolwrite;
+   boolcopy;
+   unsigned long   size;
+};
+
+static int run_test(struct pci_test *test)
+{
+   long ret;
+   int fd;
+   struct timespec start, end;
+   double time;
+
+   fd = open(test->device, O_RDWR);
+   if (fd < 0) {
+   perror("can't open PCI Endpoint Test device");
+   return fd;
+   }
+
+   if (test->barnum >= 0 && test->barnum <= 5) {
+   ret = ioctl(fd, PCITEST_BAR, test->barnum);
+   fprintf(stdout, "BAR%d:\t\t", test->barnum);
+   if (ret < 0)
+   fprintf(stdout, "TEST FAILED\n");
+   else
+   fprintf(stdout, "%s\n", result[ret]);
+   }
+
+   if (test->legacyirq) {
+   ret = ioctl(fd, PCITEST_LEGACY_IRQ, 0);
+   fprintf(stdout, "LEGACY IRQ:\t");
+   if (ret < 0)
+   fprintf(stdout, "TEST FAILED\n");
+   else
+   fprintf(stdout, "%s\n", result[ret]);
+   }
+
+   if (test->msinum > 0 && test->msinum <= 32) {
+   ret = ioctl(fd, PCITEST_MSI, test->msinum);
+   fprintf(stdout, "MSI%d:\t\t", test->msinum);
+   if (ret < 0)
+   fprintf(stdout, "TEST FAILED\n");
+   else
+   fprintf(stdout, "%s\n", result[ret]);
+   }
+
+   if (test->write) {
+   ret = ioctl(fd, PCITEST_WRITE, test->size);
+   fprintf(stdout, "WRITE (%7ld bytes):\t\t", test->size);
+   if (ret < 0)
+   fprintf(stdout, "TEST FAILED\n");
+   else
+   fprintf(stdout, "%s\n", result[ret]);
+   }
+
+   if (test->read) {
+   ret = ioctl(fd, PCITEST_READ, test->size);
+   fprintf(stdout, "READ (%7ld bytes):\t\t", test->size);
+   if (ret < 0)
+   fprintf(stdout, "TEST FAILED\n");
+   else
+   fprintf(stdout, "%s\n", result[ret]);
+   }
+
+   if (test->copy) {
+   ret = ioctl(fd, PCITEST_COPY, test->size);
+   fprintf(stdout, "COPY (%7ld bytes):\t\t", test->size);
+   if (ret < 0)
+   fprintf(stdout, "TEST FAILED\n");
+   else
+   fprintf(stdout, "%s\n", result[ret]);
+   }
+
+   fflush(stdout);
+}
+
+int main(int argc, char **argv)
+{
+   int c;
+   struct pci_test *test;
+
+   test = calloc(1, sizeof(*test));
+   if (!test) {
+   perror("Fail to allocate memory for pci_test\n");
+   return -ENOMEM;
+   }
+
+   /* since '0' is a valid BAR number, initialize it to -1 */
+   test->barnum = -1;
+
+   /* set default size as 100KB */
+   test->size = 0x19000;
+
+   /* set default endpoint device */
+   test->device = "/dev/pci-endpoint-test.0";
+
+   while ((c = getopt(argc, argv, "D:b:m:lrwcs:")) != EOF)
+   switch (c) {
+   case 'D':
+   test->device = optarg;
+   continue;
+   case 'b':
+   test->barnum = atoi(optarg);
+   if (test->barnum < 0 || test->barnum > 5)
+   goto usage;
+   continue;
+   case 'l':
+

[PATCH 13/37] PCI: dwc: Remove dependency of designware to CONFIG_PCI

2017-01-12 Thread Kishon Vijay Abraham I

CONFIG_PCI is used to enable the host mode PCI. In preparation for adding
endpoint mode support to designware driver, remove the dependency of
designware to CONFIG_PCI and make only the host specific part depend on
CONFIG_PCI.

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/Makefile|3 +++
 drivers/pci/Makefile|3 ---
 drivers/pci/dwc/Kconfig |   13 -
 3 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/Makefile b/drivers/Makefile
index 060026a..f521cb0 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -15,6 +15,9 @@ obj-$(CONFIG_PINCTRL) += pinctrl/
 obj-$(CONFIG_GPIOLIB)  += gpio/
 obj-y  += pwm/
 obj-$(CONFIG_PCI)  += pci/
+# PCI dwc controller drivers
+obj-y  += pci/dwc/
+
 obj-$(CONFIG_PARISC)   += parisc/
 obj-$(CONFIG_RAPIDIO)  += rapidio/
 obj-y  += video/
diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
index b7e9751..8db5079 100644
--- a/drivers/pci/Makefile
+++ b/drivers/pci/Makefile
@@ -66,8 +66,5 @@ obj-$(CONFIG_OF) += of.o
 
 ccflags-$(CONFIG_PCI_DEBUG) := -DDEBUG
 
-# PCI dwc controller drivers
-obj-y += dwc/
-
 # PCI host controller drivers
 obj-y += host/
diff --git a/drivers/pci/dwc/Kconfig b/drivers/pci/dwc/Kconfig
index d0bdfb5..bee8b52 100644
--- a/drivers/pci/dwc/Kconfig
+++ b/drivers/pci/dwc/Kconfig
@@ -1,16 +1,17 @@
 menu "DesignWare PCI Core Support"
-   depends on PCI
 
 config PCIE_DW
bool
 
 config PCIE_DW_HOST
 bool
+   depends on PCI
depends on PCI_MSI_IRQ_DOMAIN
 select PCIE_DW
 
 config PCI_DRA7XX
bool "TI DRA7xx PCIe controller"
+   depends on PCI
depends on OF && HAS_IOMEM && TI_PIPE3
depends on PCI_MSI_IRQ_DOMAIN
select PCIE_DW_HOST
@@ -21,6 +22,7 @@ config PCI_DRA7XX
 
 config PCIE_DW_PLAT
bool "Platform bus based DesignWare PCIe Controller"
+   depends on PCI
depends on PCI_MSI_IRQ_DOMAIN
select PCIE_DW_HOST
---help---
@@ -33,6 +35,7 @@ config PCIE_DW_PLAT
 
 config PCI_EXYNOS
bool "Samsung Exynos PCIe controller"
+   depends on PCI
depends on SOC_EXYNOS5440 || COMPILE_TEST
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
@@ -40,6 +43,7 @@ config PCI_EXYNOS
 
 config PCI_IMX6
bool "Freescale i.MX6 PCIe controller"
+   depends on PCI
depends on SOC_IMX6Q || COMPILE_TEST
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
@@ -47,6 +51,7 @@ config PCI_IMX6
 
 config PCIE_SPEAR13XX
bool "STMicroelectronics SPEAr PCIe controller"
+   depends on PCI
depends on ARCH_SPEAR13XX || COMPILE_TEST
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
@@ -56,6 +61,7 @@ config PCIE_SPEAR13XX
 
 config PCI_KEYSTONE
bool "TI Keystone PCIe controller"
+   depends on PCI
depends on ARCH_KEYSTONE || COMPILE_TEST
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
@@ -68,6 +74,7 @@ config PCI_KEYSTONE
 
 config PCI_LAYERSCAPE
bool "Freescale Layerscape PCIe controller"
+   depends on PCI
depends on OF && (ARM || ARCH_LAYERSCAPE || COMPILE_TEST)
depends on PCI_MSI_IRQ_DOMAIN
select MFD_SYSCON
@@ -78,6 +85,7 @@ config PCI_LAYERSCAPE
 config PCI_HISI
depends on OF && ARM64
bool "HiSilicon Hip05 and Hip06 SoCs PCIe controllers"
+   depends on PCI
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
select PCIE_DW_HOST
@@ -87,6 +95,7 @@ config PCI_HISI
 
 config PCIE_QCOM
bool "Qualcomm PCIe controller"
+   depends on PCI
depends on (ARCH_QCOM || COMPILE_TEST) && OF
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
@@ -98,6 +107,7 @@ config PCIE_QCOM
 
 config PCIE_ARMADA_8K
bool "Marvell Armada-8K PCIe controller"
+   depends on PCI
depends on ARCH_MVEBU || COMPILE_TEST
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
@@ -110,6 +120,7 @@ config PCIE_ARMADA_8K
 
 config PCIE_ARTPEC6
bool "Axis ARTPEC-6 PCIe controller"
+   depends on PCI
depends on MACH_ARTPEC6 || COMPILE_TEST
depends on PCI_MSI_IRQ_DOMAIN
select PCIEPORTBUS
-- 
1.7.9.5

[PATCH 32/37] Documentation: misc-devices: Add Documentation for pci-endpoint-test driver

2017-01-12 Thread Kishon Vijay Abraham I

Add Documentation for pci-endpoint-test driver.

Signed-off-by: Kishon Vijay Abraham I 
---
 Documentation/misc-devices/pci-endpoint-test.txt |   35 ++
 1 file changed, 35 insertions(+)
 create mode 100644 Documentation/misc-devices/pci-endpoint-test.txt

diff --git a/Documentation/misc-devices/pci-endpoint-test.txt 
b/Documentation/misc-devices/pci-endpoint-test.txt
new file mode 100644
index 000..4385718
--- /dev/null
+++ b/Documentation/misc-devices/pci-endpoint-test.txt
@@ -0,0 +1,35 @@
+Driver for PCI Endpoint Test Function
+
+This driver should be used as a host side driver if the root complex is
+connected to a configurable pci endpoint running *pci_epf_test* function
+driver configured according to [1].
+
+The "pci_endpoint_test" driver can be used to perform the following tests.
+
+The PCI driver for the test device performs the following tests
+   *) verifying addresses programmed in BAR
+   *) raise legacy IRQ
+   *) raise MSI IRQ
+   *) read data
+   *) write data
+   *) copy data
+
+This misc driver creates /dev/pci-endpoint-test. for every
+*pci_epf_test* function connected to the root complex and "ioctls"
+should be used to perform the above tests.
+
+ioctl
+-
+ PCITEST_BAR: Tests the BAR. The number of the BAR that has to be tested
+ should be passed as argument.
+ PCITEST_LEGACY_IRQ: Tests legacy IRQ
+ PCITEST_MSI: Tests message signalled interrupts. The MSI number that has
+ to be tested should be passed as argument.
+ PCITEST_WRITE: Perform write tests. The size of the buffer should be passed
+   as argument.
+ PCITEST_READ: Perform read tests. The size of the buffer should be passed
+  as argument.
+ PCITEST_COPY: Perform read tests. The size of the buffer should be passed
+  as argument.
+
+[1] -> Documentation/PCI/endpoint/function/binding/pci-test.txt
-- 
1.7.9.5

[PATCH 03/37] PCI: dwc: dra7xx: Populate cpu_addr_fixup ops

2017-01-12 Thread Kishon Vijay Abraham I

Populate cpu_addr_fixup ops to extract the least 28 bits of the
corresponding cpu address.

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/pci-dra7xx.c |   11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/dwc/pci-dra7xx.c b/drivers/pci/dwc/pci-dra7xx.c
index fb37e09..2073d46 100644
--- a/drivers/pci/dwc/pci-dra7xx.c
+++ b/drivers/pci/dwc/pci-dra7xx.c
@@ -88,6 +88,11 @@ static inline void dra7xx_pcie_writel(struct dra7xx_pcie 
*pcie, u32 offset,
writel(value, pcie->base + offset);
 }
 
+static u64 dra7xx_pcie_cpu_addr_fixup(u64 pci_addr)
+{
+   return pci_addr & DRA7XX_CPU_TO_BUS_ADDR;
+}
+
 static int dra7xx_pcie_link_up(struct pcie_port *pp)
 {
struct dra7xx_pcie *dra7xx = to_dra7xx_pcie(pp);
@@ -151,11 +156,6 @@ static void dra7xx_pcie_host_init(struct pcie_port *pp)
 {
struct dra7xx_pcie *dra7xx = to_dra7xx_pcie(pp);
 
-   pp->io_base &= DRA7XX_CPU_TO_BUS_ADDR;
-   pp->mem_base &= DRA7XX_CPU_TO_BUS_ADDR;
-   pp->cfg0_base &= DRA7XX_CPU_TO_BUS_ADDR;
-   pp->cfg1_base &= DRA7XX_CPU_TO_BUS_ADDR;
-
dw_pcie_setup_rc(pp);
 
dra7xx_pcie_establish_link(dra7xx);
@@ -164,6 +164,7 @@ static void dra7xx_pcie_host_init(struct pcie_port *pp)
 }
 
 static struct pcie_host_ops dra7xx_pcie_host_ops = {
+   .cpu_addr_fixup = dra7xx_pcie_cpu_addr_fixup,
.link_up = dra7xx_pcie_link_up,
.host_init = dra7xx_pcie_host_init,
 };
-- 
1.7.9.5

[PATCH 23/37] PCI: dwc: Add ops to start and stop pcie link

2017-01-12 Thread Kishon Vijay Abraham I

Add start_link and stop_link ops in dw_pcie_ops to start or stop
the link. This will be used by endpoint functions to start the
link once the setup has been done.

Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/pcie-designware.h |2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/pci/dwc/pcie-designware.h 
b/drivers/pci/dwc/pcie-designware.h
index 0ef6ae7..25b3b8b 100644
--- a/drivers/pci/dwc/pcie-designware.h
+++ b/drivers/pci/dwc/pcie-designware.h
@@ -149,6 +149,8 @@ struct dw_pcie_ops {
void(*write_dbi)(struct dw_pcie *pcie, void __iomem *base, u32 reg,
 int size, u32 val);
int (*link_up)(struct dw_pcie *pcie);
+   int (*start_link)(struct dw_pcie *pcie);
+   void(*stop_link)(struct dw_pcie *pcie);
 };
 
 struct dw_pcie {
-- 
1.7.9.5

[PATCH 05/37] PCI: dwc: Add platform_set_drvdata

2017-01-12 Thread Kishon Vijay Abraham I

Add platform_set_drvdata in all designware based drivers to store the
private data structure of the driver so that dev_set_drvdata can be
used to get back private data pointer in add_pcie_port/host_init.
This is in preparation for splitting struct pcie_port into core and
host only structures. After the split pcie_port will not be part of
the driver's private data structure and *container_of* used now
to get the private data pointer cannot be used.

Cc: Jingoo Han 
Cc: Richard Zhu 
Cc: Lucas Stach 
Cc: Murali Karicheri 
Cc: Minghuan Lian 
Cc: Mingkai Hu 
Cc: Roy Zang 
Cc: Thomas Petazzoni 
Cc: Niklas Cassel 
Cc: Jesper Nilsson 
Cc: Joao Pinto 
Cc: Zhou Wang 
Cc: Gabriele Paoloni 
Cc: Stanimir Varbanov 
Cc: Pratyush Anand 
Signed-off-by: Kishon Vijay Abraham I 
---
 drivers/pci/dwc/pci-dra7xx.c   |3 ++-
 drivers/pci/dwc/pci-exynos.c   |3 ++-
 drivers/pci/dwc/pci-imx6.c |3 ++-
 drivers/pci/dwc/pci-keystone.c |2 ++
 drivers/pci/dwc/pci-layerscape.c   |2 ++
 drivers/pci/dwc/pcie-armada8k.c|2 ++
 drivers/pci/dwc/pcie-artpec6.c |2 ++
 drivers/pci/dwc/pcie-designware-plat.c |2 ++
 drivers/pci/dwc/pcie-hisi.c|2 ++
 drivers/pci/dwc/pcie-qcom.c|2 ++
 drivers/pci/dwc/pcie-spear13xx.c   |3 ++-
 11 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/dwc/pci-dra7xx.c b/drivers/pci/dwc/pci-dra7xx.c
index 2073d46..aeeab74 100644
--- a/drivers/pci/dwc/pci-dra7xx.c
+++ b/drivers/pci/dwc/pci-dra7xx.c
@@ -433,6 +433,8 @@ static int __init dra7xx_pcie_probe(struct platform_device 
*pdev)
return ret;
}
 
+   platform_set_drvdata(pdev, dra7xx);
+
pm_runtime_enable(dev);
ret = pm_runtime_get_sync(dev);
if (ret < 0) {
@@ -459,7 +461,6 @@ static int __init dra7xx_pcie_probe(struct platform_device 
*pdev)
if (ret < 0)
goto err_gpio;
 
-   platform_set_drvdata(pdev, dra7xx);
return 0;
 
 err_gpio:
diff --git a/drivers/pci/dwc/pci-exynos.c b/drivers/pci/dwc/pci-exynos.c
index f1c544b..c179e7a 100644
--- a/drivers/pci/dwc/pci-exynos.c
+++ b/drivers/pci/dwc/pci-exynos.c
@@ -583,11 +583,12 @@ static int __init exynos_pcie_probe(struct 
platform_device *pdev)
goto fail_bus_clk;
}
 
+   platform_set_drvdata(pdev, exynos_pcie);
+
ret = exynos_add_pcie_port(exynos_pcie, pdev);
if (ret < 0)
goto fail_bus_clk;
 
-   platform_set_drvdata(pdev, exynos_pcie);
return 0;
 
 fail_bus_clk:
diff --git a/drivers/pci/dwc/pci-imx6.c b/drivers/pci/dwc/pci-imx6.c
index c8cefb0..6e5d06f 100644
--- a/drivers/pci/dwc/pci-imx6.c
+++ b/drivers/pci/dwc/pci-imx6.c
@@ -719,11 +719,12 @@ static int __init imx6_pcie_probe(struct platform_device 
*pdev)
if (ret)
imx6_pcie->link_gen = 1;
 
+   platform_set_drvdata(pdev, imx6_pcie);
+
ret = imx6_add_pcie_port(imx6_pcie, pdev);
if (ret < 0)
return ret;
 
-   platform_set_drvdata(pdev, imx6_pcie);
return 0;
 }
 
diff --git a/drivers/pci/dwc/pci-keystone.c b/drivers/pci/dwc/pci-keystone.c
index 043c19a..4c7ba35 100644
--- a/drivers/pci/dwc/pci-keystone.c
+++ b/drivers/pci/dwc/pci-keystone.c
@@ -422,6 +422,8 @@ static int __init ks_pcie_probe(struct platform_device 
*pdev)
if (ret)
return ret;
 
+   platform_set_drvdata(pdev, ks_pcie);
+
ret = ks_add_pcie_port(ks_pcie, pdev);
if (ret < 0)
goto fail_clk;
diff --git a/drivers/pci/dwc/pci-layerscape.c b/drivers/pci/dwc/pci-layerscape.c
index ea78913..89e8817 100644
--- a/drivers/pci/dwc/pci-layerscape.c
+++ b/drivers/pci/dwc/pci-layerscape.c
@@ -268,6 +268,8 @@ static int __init ls_pcie_probe(struct platform_device 
*pdev)
if (!ls_pcie_is_bridge(pcie))
return -ENODEV;
 
+   platform_set_drvdata(pdev, pcie);
+
ret = ls_add_pcie_port(pcie);
if (ret < 0)
return ret;
diff --git a/drivers/pci/dwc/pcie-armada8k.c b/drivers/pci/dwc/pcie-armada8k.c
index 0ac0f18..5a28dcb 100644
--- a/drivers/pci/dwc/pcie-armada8k.c
+++ b/drivers/pci/dwc/pcie-armada8k.c
@@ -226,6 +226,8 @@ static int armada8k_pcie_probe(struct platform_device *pdev)
goto fail;
}
 
+   platform_set_drvdata(pdev, pcie);
+
ret = armada8k_add_pcie_port(pcie, pdev);
if (ret)
goto fail;
diff --git a/drivers/pci/dwc/pcie-artpec6.c b/drivers/pci/dwc/pcie-artpec6.c
index

Re: [PATCH 2/2] powerpc/64: Add BPF_JIT to powernv and pseries defconfigs

2017-01-12 Thread Naveen N. Rao

On 2017/01/12 09:17PM, Anton Blanchard wrote:
> From: Anton Blanchard 
> 
> Commit db9112173b18 ("powerpc: Turn on BPF_JIT in ppc64_defconfig")
> only added BPF_JIT to the ppc64 defconfig. Add it to our powernv
> and pseries defconfigs too.
> 
> Signed-off-by: Anton Blanchard 

Thanks!
Acked-by: Naveen N. Rao 


> ---
>  arch/powerpc/configs/powernv_defconfig | 1 +
>  arch/powerpc/configs/pseries_defconfig | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/arch/powerpc/configs/powernv_defconfig 
> b/arch/powerpc/configs/powernv_defconfig
> index e4d53fe..b793550 100644
> --- a/arch/powerpc/configs/powernv_defconfig
> +++ b/arch/powerpc/configs/powernv_defconfig
> @@ -79,6 +79,7 @@ CONFIG_NETFILTER=y
>  # CONFIG_NETFILTER_ADVANCED is not set
>  CONFIG_BRIDGE=m
>  CONFIG_VLAN_8021Q=m
> +CONFIG_BPF_JIT=y
>  CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
>  CONFIG_DEVTMPFS=y
>  CONFIG_DEVTMPFS_MOUNT=y
> diff --git a/arch/powerpc/configs/pseries_defconfig 
> b/arch/powerpc/configs/pseries_defconfig
> index 5a06bdd..d99734f 100644
> --- a/arch/powerpc/configs/pseries_defconfig
> +++ b/arch/powerpc/configs/pseries_defconfig
> @@ -82,6 +82,7 @@ CONFIG_NETFILTER=y
>  # CONFIG_NETFILTER_ADVANCED is not set
>  CONFIG_BRIDGE=m
>  CONFIG_VLAN_8021Q=m
> +CONFIG_BPF_JIT=y
>  CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
>  CONFIG_DEVTMPFS=y
>  CONFIG_DEVTMPFS_MOUNT=y
> -- 
> 2.9.3
>

[PATCH 2/2] powerpc/64: Add BPF_JIT to powernv and pseries defconfigs

2017-01-12 Thread Anton Blanchard

From: Anton Blanchard 

Commit db9112173b18 ("powerpc: Turn on BPF_JIT in ppc64_defconfig")
only added BPF_JIT to the ppc64 defconfig. Add it to our powernv
and pseries defconfigs too.

Signed-off-by: Anton Blanchard 
---
 arch/powerpc/configs/powernv_defconfig | 1 +
 arch/powerpc/configs/pseries_defconfig | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/powerpc/configs/powernv_defconfig 
b/arch/powerpc/configs/powernv_defconfig
index e4d53fe..b793550 100644
--- a/arch/powerpc/configs/powernv_defconfig
+++ b/arch/powerpc/configs/powernv_defconfig
@@ -79,6 +79,7 @@ CONFIG_NETFILTER=y
 # CONFIG_NETFILTER_ADVANCED is not set
 CONFIG_BRIDGE=m
 CONFIG_VLAN_8021Q=m
+CONFIG_BPF_JIT=y
 CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
 CONFIG_DEVTMPFS=y
 CONFIG_DEVTMPFS_MOUNT=y
diff --git a/arch/powerpc/configs/pseries_defconfig 
b/arch/powerpc/configs/pseries_defconfig
index 5a06bdd..d99734f 100644
--- a/arch/powerpc/configs/pseries_defconfig
+++ b/arch/powerpc/configs/pseries_defconfig
@@ -82,6 +82,7 @@ CONFIG_NETFILTER=y
 # CONFIG_NETFILTER_ADVANCED is not set
 CONFIG_BRIDGE=m
 CONFIG_VLAN_8021Q=m
+CONFIG_BPF_JIT=y
 CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
 CONFIG_DEVTMPFS=y
 CONFIG_DEVTMPFS_MOUNT=y
-- 
2.9.3

[PATCH 1/2] powerpc/64: Move HAVE_CONTEXT_TRACKING from pseries to common Kconfig

2017-01-12 Thread Anton Blanchard

From: Anton Blanchard 

We added support for HAVE_CONTEXT_TRACKING, but placed the option inside
PPC_PSERIES.

This has the undesirable effect that NO_HZ_FULL can be enabled on a
kernel with both powernv and pseries support, but cannot on a kernel
with powernv only support.

Signed-off-by: Anton Blanchard 
---
 arch/powerpc/Kconfig   | 1 +
 arch/powerpc/platforms/pseries/Kconfig | 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 48001e7..f072f82 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -164,6 +164,7 @@ config PPC
select HAVE_ARCH_HARDENED_USERCOPY
select HAVE_KERNEL_GZIP
select HAVE_CC_STACKPROTECTOR
+   select HAVE_CONTEXT_TRACKING if PPC64
 
 config GENERIC_CSUM
def_bool CPU_LITTLE_ENDIAN
diff --git a/arch/powerpc/platforms/pseries/Kconfig 
b/arch/powerpc/platforms/pseries/Kconfig
index e1c280a..30ec04f 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -17,7 +17,6 @@ config PPC_PSERIES
select PPC_UDBG_16550
select PPC_NATIVE
select PPC_DOORBELL
-   select HAVE_CONTEXT_TRACKING
select HOTPLUG_CPU if SMP
select ARCH_RANDOM
select PPC_DOORBELL
-- 
2.9.3

Re: [PATCH v4 2/2] KVM: PPC: Exit guest upon MCE when FWNMI capability is enabled

2017-01-12 Thread Aravinda Prasad

On Thursday 12 January 2017 02:35 PM, Balbir Singh wrote:
> On Mon, Jan 09, 2017 at 05:10:45PM +0530, Aravinda Prasad wrote:

[ . . .]

>> The reasons for this approach is (i) it is not possible
>> to distinguish whether the exception occurred in the
>> guest or the host from the pt_regs passed on the
>> machine_check_exception(). Hence machine_check_exception()
>> calls panic, instead of passing on the exception to
>> the guest, if the machine check exception is not
>> recoverable. (ii) the approach introduced in this
>> patch gives opportunity to the host kernel to perform
>> actions in virtual mode before passing on the exception
>> to the guest. This approach does not require complex
>> tweaks to machine_check_fwnmi and friends.
> 
> It would be good to qualify the different types of MCE
> and what action we expect across hypervisor and guest.

The hypervisor performs actions depending on the type of MCE (SLB
multihit, UEs, etc). If the hypervisor is unable to recover from the MCE
and if the address in error belongs to the guest, then this patch set
forwards the error to the guest kernel for handling.

The main goal of this patch set is to pass on the unrecoverable MCE
errors in the guest address space to the guest kernel, instead of
crashing the hypervisor. The action taken by the hypervisor and the
guest kernel upon MCE remains unchanged.

[ . . . ]

> 
> Shouldn't the host take action for example poison bad pages?
> 

We want to give the guest kernel a chance to recover the clean part of
the page before poisoning. As in case of an UE only few bytes of a page
are affected. Hence we don't immediately poison the bad pages in the host.

It is expected that the guest kernel performs the poisoning of the bad
pages after performing recovery action. This prevents the guest from
reusing the bad page.

However, the missing part is to communicate back to the host when guest
is done with the recovery. This is mainly to prevent reuse of bad pages
by the host when the guest shutdowns/reboots/crashes/migrates.

We are planning to address this part as a separate patch set.

Regards,
Aravinda

>>  if (opal_recover_mce(regs, ))
>>  return 1;
>>  
>>
> 
> Balbir Singh 
> 

-- 
Regards,
Aravinda

Re: [PATCH v5 3/5] cpuidle:powernv: Add helper function to populate powernv idle states.

2017-01-12 Thread Balbir Singh

On Tue, Jan 10, 2017 at 02:37:02PM +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" 
> 
> In the current code for powernv_add_idle_states, there is a lot of code
> duplication while initializing an idle state in powernv_states table.
> 
> Add an inline helper function to populate the powernv_states[] table
> for a given idle state. Invoke this for populating the "Nap",
> "Fastsleep" and the stop states in powernv_add_idle_states.
> 
> Signed-off-by: Gautham R. Shenoy 
> -- 

Acked-by: Balbir Singh

Re: [PATCH v5 2/5] powernv:stop: Uniformly rename power9 to arch300

2017-01-12 Thread Balbir Singh

On Tue, Jan 10, 2017 at 02:37:01PM +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" 
> 
> Balbir pointed out that in idle_book3s.S and powernv/idle.c some
> functions and variables had power9 in their names while some others
> had arch300.
>

I would prefer power9 to arch300

Balbir Singh.

Re: [PATCH v5 1/5] powernv:idle: Add IDLE_STATE_ENTER_SEQ_NORET macro

2017-01-12 Thread Balbir Singh

On Tue, Jan 10, 2017 at 02:37:00PM +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" 
> 
> Currently all the low-power idle states are expected to wake up
> at reset vector 0x100. Which is why the macro IDLE_STATE_ENTER_SEQ
> that puts the CPU to an idle state and never returns.
> 
> On ISA v3.0, when the ESL and EC bits in the PSSCR are zero, the CPU
> is expected to wake up at the next instruction of the idle
> instruction.
> 
> This patch adds a new macro named IDLE_STATE_ENTER_SEQ_NORET for the
> no-return variant and reuses the name IDLE_STATE_ENTER_SEQ
> for a variant that allows resuming operation at the instruction next
> to the idle-instruction.
> 
> Signed-off-by: Gautham R. Shenoy 
> ---
> No changes from v4
>

Acked-by: Balbir Singh

[PATCH] powerpc/mm: fix a hardcode on memory boundary checking

2017-01-12 Thread Rui Teng

The offset of hugepage block will not be 16G, if the expected
page is more than one. Calculate the totol size instead of the
hardcode value.

Signed-off-by: Rui Teng 
---
 arch/powerpc/mm/hash_utils_64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 8033493..b829f8e 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -506,7 +506,7 @@ static int __init htab_dt_scan_hugepage_blocks(unsigned 
long node,
printk(KERN_INFO "Huge page(16GB) memory: "
"addr = 0x%lX size = 0x%lX pages = %d\n",
phys_addr, block_size, expected_pages);
-   if (phys_addr + (16 * GB) <= memblock_end_of_DRAM()) {
+   if (phys_addr + block_size * expected_pages <= memblock_end_of_DRAM()) {
memblock_reserve(phys_addr, block_size * expected_pages);
add_gpage(phys_addr, block_size, expected_pages);
}
-- 
2.9.0

[PATCH 18/18] KVM: PPC: Book3S HV: Use ASDR for HPT guests on POWER9

2017-01-12 Thread Paul Mackerras

POWER9 adds a register called ASDR (Access Segment Descriptor
Register), which is set by hypervisor data/instruction storage
interrupts to contain the segment descriptor for the address
being accessed, assuming the guest is using HPT translation.
(For radix guests, it contains the guest real address of the
access.)

Thus, for HPT guests on POWER9, we can use this register rather
than looking up the SLB with the slbfee. instruction.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index f638f3e..625ba5e 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -1731,6 +1731,10 @@ kvmppc_hdsi:
/* HPTE not found fault or protection fault? */
andis.  r0, r6, (DSISR_NOHPTE | DSISR_PROTFAULT)@h
beq 1f  /* if not, send it to the guest */
+BEGIN_FTR_SECTION
+   mfspr   r5, SPRN_ASDR   /* on POWER9, use ASDR to get VSID */
+   b   4f
+END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
andi.   r0, r11, MSR_DR /* data relocation enabled? */
beq 3f
clrrdi  r0, r4, 28
@@ -1819,6 +1823,10 @@ kvmppc_hisi:
bne .Lradix_hisi/* for radix, just save ASDR */
andis.  r0, r11, SRR1_ISI_NOPT@h
beq 1f
+BEGIN_FTR_SECTION
+   mfspr   r5, SPRN_ASDR   /* on POWER9, use ASDR to get VSID */
+   b   4f
+END_FTR_SECTION_IFSET(CPU_FTR_ARCH_300)
andi.   r0, r11, MSR_IR /* instruction relocation enabled? */
beq 3f
clrrdi  r0, r10, 28
-- 
2.7.4

[PATCH 17/18] KVM: PPC: Book3S HV: Enable radix guest support

2017-01-12 Thread Paul Mackerras

This adds a few last pieces of the support for radix guests:

* Implement the backends for the KVM_PPC_CONFIGURE_V3_MMU and
  KVM_PPC_GET_RMMU_INFO ioctls for radix guests

* On POWER9, allow secondary threads to be on/off-lined while guests
  are running.

* Set up LPCR and the partition table entry for radix guests.

* Don't allocate the rmap array in the kvm_memory_slot structure
  on radix.

* Prevent the AIL field in the LPCR being set for radix guests,
  since we can't yet handle getting interrupts from the guest with
  the MMU on.

* Don't try to initialize the HPT for radix guests, since they don't
  have an HPT.

* Take out the code that prevents the HV KVM module from
  initializing on radix hosts.

At this stage, we only support radix guests if the host is running
in radix mode, and only support HPT guests if the host is running in
HPT mode.  Thus a guest cannot switch from one mode to the other,
which enables some simplifications.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/include/asm/kvm_book3s.h  |  2 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c|  1 -
 arch/powerpc/kvm/book3s_64_mmu_radix.c | 45 
 arch/powerpc/kvm/book3s_hv.c   | 93 --
 arch/powerpc/kvm/powerpc.c |  2 +-
 5 files changed, 115 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index 57dc407..2bf3501 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -189,6 +189,7 @@ extern int kvmppc_book3s_radix_page_fault(struct kvm_run 
*run,
unsigned long ea, unsigned long dsisr);
 extern int kvmppc_mmu_radix_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
struct kvmppc_pte *gpte, bool data, bool iswrite);
+extern int kvmppc_init_vm_radix(struct kvm *kvm);
 extern void kvmppc_free_radix(struct kvm *kvm);
 extern int kvmppc_radix_init(void);
 extern void kvmppc_radix_exit(void);
@@ -200,6 +201,7 @@ extern int kvm_test_age_radix(struct kvm *kvm, struct 
kvm_memory_slot *memslot,
unsigned long gfn);
 extern long kvmppc_hv_get_dirty_log_radix(struct kvm *kvm,
struct kvm_memory_slot *memslot, unsigned long *map);
+extern int kvmhv_get_rmmu_info(struct kvm *kvm, struct kvm_ppc_rmmu_info 
*info);
 
 /* XXX remove this export when load_last_inst() is generic */
 extern int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr, 
bool data);
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 7a9afbe..db8de17 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -155,7 +155,6 @@ long kvmppc_alloc_reset_hpt(struct kvm *kvm, u32 
*htab_orderp)
 
 void kvmppc_free_hpt(struct kvm *kvm)
 {
-   kvmppc_free_lpid(kvm->arch.lpid);
vfree(kvm->arch.revmap);
if (kvm->arch.hpt_cma_alloc)
kvm_release_hpt(virt_to_page(kvm->arch.hpt_virt),
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c 
b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 125cc7c..4344651 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -610,6 +610,51 @@ long kvmppc_hv_get_dirty_log_radix(struct kvm *kvm,
return 0;
 }
 
+static void add_rmmu_ap_encoding(struct kvm_ppc_rmmu_info *info,
+int psize, int *indexp)
+{
+   if (!mmu_psize_defs[psize].shift)
+   return;
+   info->ap_encodings[*indexp] = mmu_psize_defs[psize].shift |
+   (mmu_psize_defs[psize].ap << 29);
+   ++(*indexp);
+}
+
+int kvmhv_get_rmmu_info(struct kvm *kvm, struct kvm_ppc_rmmu_info *info)
+{
+   int i;
+
+   if (!radix_enabled())
+   return -EINVAL;
+   memset(info, 0, sizeof(*info));
+
+   /* 4k page size */
+   info->geometries[0].page_shift = 12;
+   info->geometries[0].level_bits[0] = 9;
+   for (i = 1; i < 4; ++i)
+   info->geometries[0].level_bits[i] = p9_supported_radix_bits[i];
+   /* 64k page size */
+   info->geometries[1].page_shift = 16;
+   for (i = 0; i < 4; ++i)
+   info->geometries[1].level_bits[i] = p9_supported_radix_bits[i];
+
+   i = 0;
+   add_rmmu_ap_encoding(info, MMU_PAGE_4K, );
+   add_rmmu_ap_encoding(info, MMU_PAGE_64K, );
+   add_rmmu_ap_encoding(info, MMU_PAGE_2M, );
+   add_rmmu_ap_encoding(info, MMU_PAGE_1G, );
+
+   return 0;
+}
+
+int kvmppc_init_vm_radix(struct kvm *kvm)
+{
+   kvm->arch.pgtable = pgd_alloc(kvm->mm);
+   if (!kvm->arch.pgtable)
+   return -ENOMEM;
+   return 0;
+}
+
 void kvmppc_free_radix(struct kvm *kvm)
 {
unsigned long ig, iu, im;
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index ab5adcd..14a9efe 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++

[PATCH 16/18] KVM: PPC: Book3S HV: Make HPT-specific hypercalls return error in radix mode

2017-01-12 Thread Paul Mackerras

If the guest is in radix mode, then it doesn't have a hashed page
table (HPT), so all of the hypercalls that manipulate the HPT can't
work and should return an error.  This adds checks to make them
return H_FUNCTION ("function not supported").

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/kvm/book3s_hv_rm_mmu.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c 
b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index 9ef3c4b..6c1ac3d 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -182,6 +182,8 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long flags,
unsigned long mmu_seq;
unsigned long rcbits, irq_flags = 0;
 
+   if (kvm_is_radix(kvm))
+   return H_FUNCTION;
psize = hpte_page_size(pteh, ptel);
if (!psize)
return H_PARAMETER;
@@ -458,6 +460,8 @@ long kvmppc_do_h_remove(struct kvm *kvm, unsigned long 
flags,
struct revmap_entry *rev;
u64 pte, orig_pte, pte_r;
 
+   if (kvm_is_radix(kvm))
+   return H_FUNCTION;
if (pte_index >= kvm->arch.hpt_npte)
return H_PARAMETER;
hpte = (__be64 *)(kvm->arch.hpt_virt + (pte_index << 4));
@@ -529,6 +533,8 @@ long kvmppc_h_bulk_remove(struct kvm_vcpu *vcpu)
struct revmap_entry *rev, *revs[4];
u64 hp0, hp1;
 
+   if (kvm_is_radix(kvm))
+   return H_FUNCTION;
global = global_invalidates(kvm, 0);
for (i = 0; i < 4 && ret == H_SUCCESS; ) {
n = 0;
@@ -642,6 +648,8 @@ long kvmppc_h_protect(struct kvm_vcpu *vcpu, unsigned long 
flags,
unsigned long v, r, rb, mask, bits;
u64 pte_v, pte_r;
 
+   if (kvm_is_radix(kvm))
+   return H_FUNCTION;
if (pte_index >= kvm->arch.hpt_npte)
return H_PARAMETER;
 
@@ -711,6 +719,8 @@ long kvmppc_h_read(struct kvm_vcpu *vcpu, unsigned long 
flags,
int i, n = 1;
struct revmap_entry *rev = NULL;
 
+   if (kvm_is_radix(kvm))
+   return H_FUNCTION;
if (pte_index >= kvm->arch.hpt_npte)
return H_PARAMETER;
if (flags & H_READ_4) {
@@ -750,6 +760,8 @@ long kvmppc_h_clear_ref(struct kvm_vcpu *vcpu, unsigned 
long flags,
unsigned long *rmap;
long ret = H_NOT_FOUND;
 
+   if (kvm_is_radix(kvm))
+   return H_FUNCTION;
if (pte_index >= kvm->arch.hpt_npte)
return H_PARAMETER;
 
@@ -796,6 +808,8 @@ long kvmppc_h_clear_mod(struct kvm_vcpu *vcpu, unsigned 
long flags,
unsigned long *rmap;
long ret = H_NOT_FOUND;
 
+   if (kvm_is_radix(kvm))
+   return H_FUNCTION;
if (pte_index >= kvm->arch.hpt_npte)
return H_PARAMETER;
 
-- 
2.7.4

[PATCH 15/18] KVM: PPC: Book3S HV: Implement dirty page logging for radix guests

2017-01-12 Thread Paul Mackerras

This adds code to keep track of dirty pages when requested (that is,
when memslot->dirty_bitmap is non-NULL) for radix guests.  We use the
dirty bits in the PTEs in the second-level (partition-scoped) page
tables, together with a bitmap of pages that were dirty when their
PTE was invalidated (e.g., when the page was paged out).  This bitmap
is stored in the first half of the memslot->dirty_bitmap area, and
kvm_vm_ioctl_get_dirty_log_hv() now uses the second half for the
bitmap that gets returned to userspace.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/include/asm/kvm_book3s.h  |   7 ++-
 arch/powerpc/kvm/book3s_64_mmu_hv.c|  28 -
 arch/powerpc/kvm/book3s_64_mmu_radix.c | 111 ++---
 arch/powerpc/kvm/book3s_hv.c   |  31 +++--
 4 files changed, 144 insertions(+), 33 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index 952cc4b..57dc407 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -198,6 +198,8 @@ extern int kvm_age_radix(struct kvm *kvm, struct 
kvm_memory_slot *memslot,
unsigned long gfn);
 extern int kvm_test_age_radix(struct kvm *kvm, struct kvm_memory_slot *memslot,
unsigned long gfn);
+extern long kvmppc_hv_get_dirty_log_radix(struct kvm *kvm,
+   struct kvm_memory_slot *memslot, unsigned long *map);
 
 /* XXX remove this export when load_last_inst() is generic */
 extern int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr, 
bool data);
@@ -228,8 +230,11 @@ extern long kvmppc_do_h_enter(struct kvm *kvm, unsigned 
long flags,
 extern long kvmppc_do_h_remove(struct kvm *kvm, unsigned long flags,
unsigned long pte_index, unsigned long avpn,
unsigned long *hpret);
-extern long kvmppc_hv_get_dirty_log(struct kvm *kvm,
+extern long kvmppc_hv_get_dirty_log_hpt(struct kvm *kvm,
struct kvm_memory_slot *memslot, unsigned long *map);
+extern void kvmppc_harvest_vpa_dirty(struct kvmppc_vpa *vpa,
+   struct kvm_memory_slot *memslot,
+   unsigned long *map);
 extern void kvmppc_update_lpcr(struct kvm *kvm, unsigned long lpcr,
unsigned long mask);
 extern void kvmppc_set_fscr(struct kvm_vcpu *vcpu, u64 fscr);
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index fbb3de4..7a9afbe 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -1068,7 +1068,7 @@ static int kvm_test_clear_dirty_npages(struct kvm *kvm, 
unsigned long *rmapp)
return npages_dirty;
 }
 
-static void harvest_vpa_dirty(struct kvmppc_vpa *vpa,
+void kvmppc_harvest_vpa_dirty(struct kvmppc_vpa *vpa,
  struct kvm_memory_slot *memslot,
  unsigned long *map)
 {
@@ -1086,12 +1086,11 @@ static void harvest_vpa_dirty(struct kvmppc_vpa *vpa,
__set_bit_le(gfn - memslot->base_gfn, map);
 }
 
-long kvmppc_hv_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot,
-unsigned long *map)
+long kvmppc_hv_get_dirty_log_hpt(struct kvm *kvm,
+   struct kvm_memory_slot *memslot, unsigned long *map)
 {
unsigned long i, j;
unsigned long *rmapp;
-   struct kvm_vcpu *vcpu;
 
preempt_disable();
rmapp = memslot->arch.rmap;
@@ -1107,15 +1106,6 @@ long kvmppc_hv_get_dirty_log(struct kvm *kvm, struct 
kvm_memory_slot *memslot,
__set_bit_le(j, map);
++rmapp;
}
-
-   /* Harvest dirty bits from VPA and DTL updates */
-   /* Note: we never modify the SLB shadow buffer areas */
-   kvm_for_each_vcpu(i, vcpu, kvm) {
-   spin_lock(>arch.vpa_update_lock);
-   harvest_vpa_dirty(>arch.vpa, memslot, map);
-   harvest_vpa_dirty(>arch.dtl, memslot, map);
-   spin_unlock(>arch.vpa_update_lock);
-   }
preempt_enable();
return 0;
 }
@@ -1170,10 +1160,14 @@ void kvmppc_unpin_guest_page(struct kvm *kvm, void *va, 
unsigned long gpa,
srcu_idx = srcu_read_lock(>srcu);
memslot = gfn_to_memslot(kvm, gfn);
if (memslot) {
-   rmap = >arch.rmap[gfn - memslot->base_gfn];
-   lock_rmap(rmap);
-   *rmap |= KVMPPC_RMAP_CHANGED;
-   unlock_rmap(rmap);
+   if (!kvm_is_radix(kvm)) {
+   rmap = >arch.rmap[gfn - memslot->base_gfn];
+   lock_rmap(rmap);
+   *rmap |= KVMPPC_RMAP_CHANGED;
+   unlock_rmap(rmap);
+   } else if (memslot->dirty_bitmap) {
+   mark_page_dirty(kvm, gfn);
+   }
}
srcu_read_unlock(>srcu,

[PATCH 14/18] KVM: PPC: Book3S HV: MMU notifier callbacks for radix guests

2017-01-12 Thread Paul Mackerras

This adapts our implementations of the MMU notifier callbacks
(unmap_hva, unmap_hva_range, age_hva, test_age_hva, set_spte_hva)
to call radix functions when the guest is using radix.  These
implementations are much simpler than for HPT guests because we
have only one PTE to deal with, so we don't need to traverse
rmap chains.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/include/asm/kvm_book3s.h  |  6 
 arch/powerpc/kvm/book3s_64_mmu_hv.c| 64 +++---
 arch/powerpc/kvm/book3s_64_mmu_radix.c | 54 
 3 files changed, 103 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index ff5cd5c..952cc4b 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -192,6 +192,12 @@ extern int kvmppc_mmu_radix_xlate(struct kvm_vcpu *vcpu, 
gva_t eaddr,
 extern void kvmppc_free_radix(struct kvm *kvm);
 extern int kvmppc_radix_init(void);
 extern void kvmppc_radix_exit(void);
+extern int kvm_unmap_radix(struct kvm *kvm, struct kvm_memory_slot *memslot,
+   unsigned long gfn);
+extern int kvm_age_radix(struct kvm *kvm, struct kvm_memory_slot *memslot,
+   unsigned long gfn);
+extern int kvm_test_age_radix(struct kvm *kvm, struct kvm_memory_slot *memslot,
+   unsigned long gfn);
 
 /* XXX remove this export when load_last_inst() is generic */
 extern int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr, 
bool data);
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 57690c2..fbb3de4 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -701,12 +701,13 @@ static void kvmppc_rmap_reset(struct kvm *kvm)
srcu_read_unlock(>srcu, srcu_idx);
 }
 
+typedef int (*hva_handler_fn)(struct kvm *kvm, struct kvm_memory_slot *memslot,
+ unsigned long gfn);
+
 static int kvm_handle_hva_range(struct kvm *kvm,
unsigned long start,
unsigned long end,
-   int (*handler)(struct kvm *kvm,
-  unsigned long *rmapp,
-  unsigned long gfn))
+   hva_handler_fn handler)
 {
int ret;
int retval = 0;
@@ -731,9 +732,7 @@ static int kvm_handle_hva_range(struct kvm *kvm,
gfn_end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1, memslot);
 
for (; gfn < gfn_end; ++gfn) {
-   gfn_t gfn_offset = gfn - memslot->base_gfn;
-
-   ret = handler(kvm, >arch.rmap[gfn_offset], 
gfn);
+   ret = handler(kvm, memslot, gfn);
retval |= ret;
}
}
@@ -742,20 +741,21 @@ static int kvm_handle_hva_range(struct kvm *kvm,
 }
 
 static int kvm_handle_hva(struct kvm *kvm, unsigned long hva,
- int (*handler)(struct kvm *kvm, unsigned long *rmapp,
-unsigned long gfn))
+ hva_handler_fn handler)
 {
return kvm_handle_hva_range(kvm, hva, hva + 1, handler);
 }
 
-static int kvm_unmap_rmapp(struct kvm *kvm, unsigned long *rmapp,
+static int kvm_unmap_rmapp(struct kvm *kvm, struct kvm_memory_slot *memslot,
   unsigned long gfn)
 {
struct revmap_entry *rev = kvm->arch.revmap;
unsigned long h, i, j;
__be64 *hptep;
unsigned long ptel, psize, rcbits;
+   unsigned long *rmapp;
 
+   rmapp = >arch.rmap[gfn - memslot->base_gfn];
for (;;) {
lock_rmap(rmapp);
if (!(*rmapp & KVMPPC_RMAP_PRESENT)) {
@@ -816,26 +816,36 @@ static int kvm_unmap_rmapp(struct kvm *kvm, unsigned long 
*rmapp,
 
 int kvm_unmap_hva_hv(struct kvm *kvm, unsigned long hva)
 {
-   kvm_handle_hva(kvm, hva, kvm_unmap_rmapp);
+   hva_handler_fn handler;
+
+   handler = kvm->arch.radix ? kvm_unmap_radix : kvm_unmap_rmapp;
+   kvm_handle_hva(kvm, hva, handler);
return 0;
 }
 
 int kvm_unmap_hva_range_hv(struct kvm *kvm, unsigned long start, unsigned long 
end)
 {
-   kvm_handle_hva_range(kvm, start, end, kvm_unmap_rmapp);
+   hva_handler_fn handler;
+
+   handler = kvm->arch.radix ? kvm_unmap_radix : kvm_unmap_rmapp;
+   kvm_handle_hva_range(kvm, start, end, handler);
return 0;
 }
 
 void kvmppc_core_flush_memslot_hv(struct kvm *kvm,
  struct kvm_memory_slot *memslot)
 {
-   unsigned long *rmapp;
unsigned long gfn;
unsigned long n;
+   unsigned long *rmapp;
 
-   rmapp = memslot->arch.rmap;
gfn = memslot->base_gfn;
-   for (n = memslot->npages; n; --n) {
+   rmapp =

[PATCH 13/18] KVM: PPC: Book3S HV: Page table construction and page faults for radix guests

2017-01-12 Thread Paul Mackerras

This adds the code to construct the second-level ("partition-scoped" in
architecturese) page tables for guests using the radix MMU.  Apart from
the PGD level, which is allocated when the guest is created, the rest
of the tree is all constructed in response to hypervisor page faults.

As well as hypervisor page faults for missing pages, we also get faults
for reference/change (RC) bits needing to be set, as well as various
other error conditions.  For now, we only set the R or C bit in the
guest page table if the same bit is set in the host PTE for the
backing page.

This code can take advantage of the guest being backed with either
transparent or ordinary 2MB huge pages, and insert 2MB page entries
into the guest page tables.  There is no support for 1GB huge pages
yet.
---
 arch/powerpc/include/asm/kvm_book3s.h  |   8 +
 arch/powerpc/kvm/book3s.c  |   1 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c|   7 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c | 385 +
 arch/powerpc/kvm/book3s_hv.c   |  17 +-
 5 files changed, 415 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index 7adfcc0..ff5cd5c 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -170,6 +170,8 @@ extern int kvmppc_book3s_hv_page_fault(struct kvm_run *run,
unsigned long status);
 extern long kvmppc_hv_find_lock_hpte(struct kvm *kvm, gva_t eaddr,
unsigned long slb_v, unsigned long valid);
+extern int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
+   unsigned long gpa, gva_t ea, int is_store);
 
 extern void kvmppc_mmu_hpte_cache_map(struct kvm_vcpu *vcpu, struct hpte_cache 
*pte);
 extern struct hpte_cache *kvmppc_mmu_hpte_cache_next(struct kvm_vcpu *vcpu);
@@ -182,8 +184,14 @@ extern void kvmppc_mmu_hpte_sysexit(void);
 extern int kvmppc_mmu_hv_init(void);
 extern int kvmppc_book3s_hcall_implemented(struct kvm *kvm, unsigned long hc);
 
+extern int kvmppc_book3s_radix_page_fault(struct kvm_run *run,
+   struct kvm_vcpu *vcpu,
+   unsigned long ea, unsigned long dsisr);
 extern int kvmppc_mmu_radix_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
struct kvmppc_pte *gpte, bool data, bool iswrite);
+extern void kvmppc_free_radix(struct kvm *kvm);
+extern int kvmppc_radix_init(void);
+extern void kvmppc_radix_exit(void);
 
 /* XXX remove this export when load_last_inst() is generic */
 extern int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr, 
bool data);
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 019f008..b6b5c18 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -239,6 +239,7 @@ void kvmppc_core_queue_data_storage(struct kvm_vcpu *vcpu, 
ulong dar,
kvmppc_set_dsisr(vcpu, flags);
kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_DATA_STORAGE);
 }
+EXPORT_SYMBOL_GPL(kvmppc_core_queue_data_storage); /* used by kvm_hv */
 
 void kvmppc_core_queue_inst_storage(struct kvm_vcpu *vcpu, ulong flags)
 {
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index c208bf3..57690c2 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -395,8 +395,8 @@ static int instruction_is_store(unsigned int instr)
return (instr & mask) != 0;
 }
 
-static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
- unsigned long gpa, gva_t ea, int is_store)
+int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu,
+  unsigned long gpa, gva_t ea, int is_store)
 {
u32 last_inst;
 
@@ -461,6 +461,9 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
unsigned long rcbits;
long mmio_update;
 
+   if (kvm_is_radix(kvm))
+   return kvmppc_book3s_radix_page_fault(run, vcpu, ea, dsisr);
+
/*
 * Real-mode code has already searched the HPT and found the
 * entry we're interested in.  Lock the entry and check that
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c 
b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 9091407..865ea9b 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -137,3 +137,388 @@ int kvmppc_mmu_radix_xlate(struct kvm_vcpu *vcpu, gva_t 
eaddr,
return 0;
 }
 
+#ifdef CONFIG_PPC_64K_PAGES
+#define MMU_BASE_PSIZE MMU_PAGE_64K
+#else
+#define MMU_BASE_PSIZE MMU_PAGE_4K
+#endif
+
+static void kvmppc_radix_tlbie_page(struct kvm *kvm, unsigned long addr,
+   unsigned int pshift)
+{
+   int psize = MMU_BASE_PSIZE;
+
+   if (pshift >= PMD_SHIFT)
+   psize = MMU_PAGE_2M;
+   addr &= ~0xfffUL;
+   addr |=

[PATCH 12/18] KVM: PPC: Book3S HV: Modify guest entry/exit paths to handle radix guests

2017-01-12 Thread Paul Mackerras

This adds code to  branch around the parts that radix guests don't
need - clearing and loading the SLB with the guest SLB contents,
flushing the TLB on first entry on each physical CPU, and saving
the guest SLB contents on exit.

Since the host is now using radix, we need to save and restore the
host value for the PID register.

On hypervisor data/instruction storage interrupts, we don't do the
guest HPT lookup on radix, but just save the guest physical address
for the fault (from the ASDR register) in the vcpu struct.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/include/asm/kvm_host.h |  1 +
 arch/powerpc/kernel/asm-offsets.c   |  2 ++
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 58 ++---
 3 files changed, 50 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index fb73518..da1421a 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -606,6 +606,7 @@ struct kvm_vcpu_arch {
ulong fault_dar;
u32 fault_dsisr;
unsigned long intr_msr;
+   ulong fault_gpa;/* guest real address of page fault (POWER9) */
 #endif
 
 #ifdef CONFIG_BOOKE
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 0601e6a..3afa0ad 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -498,6 +498,7 @@ int main(void)
DEFINE(KVM_NEED_FLUSH, offsetof(struct kvm, arch.need_tlb_flush.bits));
DEFINE(KVM_ENABLED_HCALLS, offsetof(struct kvm, arch.enabled_hcalls));
DEFINE(KVM_VRMA_SLB_V, offsetof(struct kvm, arch.vrma_slb_v));
+   DEFINE(KVM_RADIX, offsetof(struct kvm, arch.radix));
DEFINE(VCPU_DSISR, offsetof(struct kvm_vcpu, arch.shregs.dsisr));
DEFINE(VCPU_DAR, offsetof(struct kvm_vcpu, arch.shregs.dar));
DEFINE(VCPU_VPA, offsetof(struct kvm_vcpu, arch.vpa.pinned_addr));
@@ -537,6 +538,7 @@ int main(void)
DEFINE(VCPU_SLB_NR, offsetof(struct kvm_vcpu, arch.slb_nr));
DEFINE(VCPU_FAULT_DSISR, offsetof(struct kvm_vcpu, arch.fault_dsisr));
DEFINE(VCPU_FAULT_DAR, offsetof(struct kvm_vcpu, arch.fault_dar));
+   DEFINE(VCPU_FAULT_GPA, offsetof(struct kvm_vcpu, arch.fault_gpa));
DEFINE(VCPU_INTR_MSR, offsetof(struct kvm_vcpu, arch.intr_msr));
DEFINE(VCPU_LAST_INST, offsetof(struct kvm_vcpu, arch.last_inst));
DEFINE(VCPU_TRAP, offsetof(struct kvm_vcpu, arch.trap));
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 9338a81..f638f3e 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -518,6 +518,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
 /* Stack frame offsets */
 #define STACK_SLOT_TID (112-16)
 #define STACK_SLOT_PSSCR   (112-24)
+#define STACK_SLOT_PID (112-32)
 
 .global kvmppc_hv_entry
 kvmppc_hv_entry:
@@ -530,6 +531,7 @@ kvmppc_hv_entry:
 * R1 = host R1
 * R2 = TOC
 * all other volatile GPRS = free
+* Does not preserve non-volatile GPRs or CR fields
 */
mflrr0
std r0, PPC_LR_STKOFF(r1)
@@ -549,32 +551,38 @@ kvmppc_hv_entry:
bl  kvmhv_start_timing
 1:
 #endif
-   /* Clear out SLB */
+
+   /* Use cr7 as an indication of radix mode */
+   ld  r5, HSTATE_KVM_VCORE(r13)
+   ld  r9, VCORE_KVM(r5)   /* pointer to struct kvm */
+   lbz r0, KVM_RADIX(r9)
+   cmpwi   cr7, r0, 0
+
+   /* Clear out SLB if hash */
+   bne cr7, 2f
li  r6,0
slbmte  r6,r6
slbia
ptesync
-
+2:
/*
 * POWER7/POWER8 host -> guest partition switch code.
 * We don't have to lock against concurrent tlbies,
 * but we do have to coordinate across hardware threads.
 */
/* Set bit in entry map iff exit map is zero. */
-   ld  r5, HSTATE_KVM_VCORE(r13)
li  r7, 1
lbz r6, HSTATE_PTID(r13)
sld r7, r7, r6
-   addir9, r5, VCORE_ENTRY_EXIT
-21:lwarx   r3, 0, r9
+   addir8, r5, VCORE_ENTRY_EXIT
+21:lwarx   r3, 0, r8
cmpwi   r3, 0x100   /* any threads starting to exit? */
bge secondary_too_late  /* if so we're too late to the party */
or  r3, r3, r7
-   stwcx.  r3, 0, r9
+   stwcx.  r3, 0, r8
bne 21b
 
/* Primary thread switches to guest partition. */
-   ld  r9,VCORE_KVM(r5)/* pointer to struct kvm */
cmpwi   r6,0
bne 10f
lwz r7,KVM_LPID(r9)
@@ -589,6 +597,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300)
isync
 
/* See if we need to flush the TLB */
+   bne cr7, 22f/* skip this for radix */
lhz r6,PACAPACAINDEX(r13)   /* test_bit(cpu, need_tlb_flush) */

[PATCH 11/18] KVM: PPC: Book3S HV: Add basic infrastructure for radix guests

2017-01-12 Thread Paul Mackerras

This adds a field in struct kvm_arch and an inline helper to
indicate whether a guest is a radix guest or not, plus a new file
to contain the radix MMU code, which currently contains just a
translate function which knows how to traverse the guest page
tables to translate an address.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/include/asm/kvm_book3s.h|   3 +
 arch/powerpc/include/asm/kvm_book3s_64.h |   6 ++
 arch/powerpc/include/asm/kvm_host.h  |   2 +
 arch/powerpc/kvm/Makefile|   3 +-
 arch/powerpc/kvm/book3s_64_mmu_hv.c  |  10 ++-
 arch/powerpc/kvm/book3s_64_mmu_radix.c   | 139 +++
 6 files changed, 160 insertions(+), 3 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_mmu_radix.c

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index 5cf306a..7adfcc0 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -182,6 +182,9 @@ extern void kvmppc_mmu_hpte_sysexit(void);
 extern int kvmppc_mmu_hv_init(void);
 extern int kvmppc_book3s_hcall_implemented(struct kvm *kvm, unsigned long hc);
 
+extern int kvmppc_mmu_radix_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
+   struct kvmppc_pte *gpte, bool data, bool iswrite);
+
 /* XXX remove this export when load_last_inst() is generic */
 extern int kvmppc_ld(struct kvm_vcpu *vcpu, ulong *eaddr, int size, void *ptr, 
bool data);
 extern void kvmppc_book3s_queue_irqprio(struct kvm_vcpu *vcpu, unsigned int 
vec);
diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h 
b/arch/powerpc/include/asm/kvm_book3s_64.h
index 8482921..0db010c 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -36,6 +36,12 @@ static inline void svcpu_put(struct 
kvmppc_book3s_shadow_vcpu *svcpu)
 #endif
 
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
+
+static inline bool kvm_is_radix(struct kvm *kvm)
+{
+   return kvm->arch.radix;
+}
+
 #define KVM_DEFAULT_HPT_ORDER  24  /* 16MB HPT by default */
 #endif
 
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 944532d..fb73518 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -264,6 +264,8 @@ struct kvm_arch {
atomic_t hpte_mod_interest;
cpumask_t need_tlb_flush;
int hpt_cma_alloc;
+   u8 radix;
+   pgd_t *pgtable;
u64 process_table;
struct dentry *debugfs_dir;
struct dentry *htab_dentry;
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index 7dd89b7..b87ccde 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -70,7 +70,8 @@ endif
 kvm-hv-y += \
book3s_hv.o \
book3s_hv_interrupts.o \
-   book3s_64_mmu_hv.o
+   book3s_64_mmu_hv.o \
+   book3s_64_mmu_radix.o
 
 kvm-book3s_64-builtin-xics-objs-$(CONFIG_KVM_XICS) := \
book3s_hv_rm_xics.o
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index b795dd1..c208bf3 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -119,6 +119,9 @@ long kvmppc_alloc_reset_hpt(struct kvm *kvm, u32 
*htab_orderp)
long err = -EBUSY;
long order;
 
+   if (kvm_is_radix(kvm))
+   return -EINVAL;
+
mutex_lock(>lock);
if (kvm->arch.hpte_setup_done) {
kvm->arch.hpte_setup_done = 0;
@@ -157,7 +160,7 @@ void kvmppc_free_hpt(struct kvm *kvm)
if (kvm->arch.hpt_cma_alloc)
kvm_release_hpt(virt_to_page(kvm->arch.hpt_virt),
1 << (kvm->arch.hpt_order - PAGE_SHIFT));
-   else
+   else if (kvm->arch.hpt_virt)
free_pages(kvm->arch.hpt_virt,
   kvm->arch.hpt_order - PAGE_SHIFT);
 }
@@ -1675,7 +1678,10 @@ void kvmppc_mmu_book3s_hv_init(struct kvm_vcpu *vcpu)
 
vcpu->arch.slb_nr = 32; /* POWER7/POWER8 */
 
-   mmu->xlate = kvmppc_mmu_book3s_64_hv_xlate;
+   if (kvm_is_radix(vcpu->kvm))
+   mmu->xlate = kvmppc_mmu_radix_xlate;
+   else
+   mmu->xlate = kvmppc_mmu_book3s_64_hv_xlate;
mmu->reset_msr = kvmppc_mmu_book3s_64_hv_reset_msr;
 
vcpu->arch.hflags |= BOOK3S_HFLAG_SLB;
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c 
b/arch/powerpc/kvm/book3s_64_mmu_radix.c
new file mode 100644
index 000..9091407
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -0,0 +1,139 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * Copyright 2016 Paul Mackerras, IBM Corp. 
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * Supported

[PATCH 10/18] KVM: PPC: Book3S HV: Set process table for HPT guests on POWER9

2017-01-12 Thread Paul Mackerras

This adds the implementation of the KVM_PPC_CONFIGURE_V3_MMU ioctl
for HPT guests on POWER9.  With this, we can return 1 for the
KVM_CAP_PPC_MMU_HASH_V3 capability.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/include/asm/kvm_host.h |  1 +
 arch/powerpc/kvm/book3s_hv.c| 35 +++
 arch/powerpc/kvm/powerpc.c  |  2 +-
 3 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index e59b172..944532d 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -264,6 +264,7 @@ struct kvm_arch {
atomic_t hpte_mod_interest;
cpumask_t need_tlb_flush;
int hpt_cma_alloc;
+   u64 process_table;
struct dentry *debugfs_dir;
struct dentry *htab_dentry;
 #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 1736f87..6bd0f4a 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3092,8 +3092,8 @@ static void kvmppc_setup_partition_table(struct kvm *kvm)
/* HTABSIZE and HTABORG fields */
dw0 |= kvm->arch.sdr1;
 
-   /* Second dword has GR=0; other fields are unused since UPRT=0 */
-   dw1 = 0;
+   /* Second dword as set by userspace */
+   dw1 = kvm->arch.process_table;
 
mmu_partition_table_set_entry(kvm->arch.lpid, dw0, dw1);
 }
@@ -3658,10 +3658,37 @@ static void init_default_hcalls(void)
}
 }
 
-/* dummy implementations for now */
 static int kvmhv_configure_mmu(struct kvm *kvm, struct kvm_ppc_mmuv3_cfg *cfg)
 {
-   return -EINVAL;
+   unsigned long lpcr;
+
+   /* If not on a POWER9, reject it */
+   if (!cpu_has_feature(CPU_FTR_ARCH_300))
+   return -ENODEV;
+
+   /* If any unknown flags set, reject it */
+   if (cfg->flags & ~(KVM_PPC_MMUV3_RADIX | KVM_PPC_MMUV3_GTSE))
+   return -EINVAL;
+
+   /* We can't do radix yet */
+   if (cfg->flags & KVM_PPC_MMUV3_RADIX)
+   return -EINVAL;
+
+   /* GR (guest radix) bit in process_table field must match */
+   if (cfg->process_table & PATB_GR)
+   return -EINVAL;
+
+   /* Process table size field must be reasonable, i.e. <= 24 */
+   if ((cfg->process_table & PRTS_MASK) > 24)
+   return -EINVAL;
+
+   kvm->arch.process_table = cfg->process_table;
+   kvmppc_setup_partition_table(kvm);
+
+   lpcr = (cfg->flags & KVM_PPC_MMUV3_GTSE) ? LPCR_GTSE : 0;
+   kvmppc_update_lpcr(kvm, lpcr, LPCR_GTSE);
+
+   return 0;
 }
 
 static int kvmhv_get_rmmu_info(struct kvm *kvm, struct kvm_ppc_rmmu_info *info)
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 38c0d15..1476a48 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -569,7 +569,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
r = !!(0 && hv_enabled && radix_enabled());
break;
case KVM_CAP_PPC_MMU_HASH_V3:
-   r = !!(0 && hv_enabled && !radix_enabled() &&
+   r = !!(hv_enabled && !radix_enabled() &&
   cpu_has_feature(CPU_FTR_ARCH_300));
break;
 #endif
-- 
2.7.4

[PATCH 09/18] KVM: PPC: Book3S HV: Add userspace interfaces for POWER9 MMU

2017-01-12 Thread Paul Mackerras

This adds two capabilities and two ioctls to allow userspace to
find out about and configure the POWER9 MMU in a guest.  The two
capabilities tell userspace whether KVM can support a guest using
the radix MMU, or using the hashed page table (HPT) MMU with a
process table and segment tables.  (Note that the MMUs in the
POWER9 processor cores do not use the process and segment tables
when in HPT mode, but the nest MMU does).

The KVM_PPC_CONFIGURE_V3_MMU ioctl allows userspace to specify
whether a guest will use the radix MMU or the HPT MMU, and to
specify the size and location (in guest space) of the process
table.

The KVM_PPC_GET_RMMU_INFO ioctl gives userspace information about
the radix MMU.  It returns a list of supported radix tree geometries
(base page size and number of bits indexed at each level of the
radix tree) and the encoding used to specify the various page
sizes for the TLB invalidate entry instruction.

Initially, both capabilities return 0 and the ioctls return -EINVAL,
until the necessary infrastructure for them to operate correctly
is added.

Signed-off-by: Paul Mackerras 
---
 Documentation/virtual/kvm/api.txt   | 83 +
 arch/powerpc/include/asm/kvm_ppc.h  |  2 +
 arch/powerpc/include/uapi/asm/kvm.h | 20 +
 arch/powerpc/kvm/book3s_hv.c| 13 ++
 arch/powerpc/kvm/powerpc.c  | 32 ++
 include/uapi/linux/kvm.h|  6 +++
 6 files changed, 156 insertions(+)

diff --git a/Documentation/virtual/kvm/api.txt 
b/Documentation/virtual/kvm/api.txt
index 03145b7..4470671 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -3201,6 +3201,71 @@ struct kvm_reinject_control {
 pit_reinject = 0 (!reinject mode) is recommended, unless running an old
 operating system that uses the PIT for timing (e.g. Linux 2.4.x).
 
+4.99 KVM_PPC_CONFIGURE_V3_MMU
+
+Capability: KVM_CAP_PPC_RADIX_MMU or KVM_CAP_PPC_HASH_MMU_V3
+Architectures: ppc
+Type: vm ioctl
+Parameters: struct kvm_ppc_mmuv3_cfg (in)
+Returns: 0 on success,
+ -EFAULT if struct kvm_ppc_mmuv3_cfg cannot be read,
+ -EINVAL if the configuration is invalid
+
+This ioctl controls whether the guest will use radix or HPT (hashed
+page table) translation, and sets the pointer to the process table for
+the guest.
+
+struct kvm_ppc_mmuv3_cfg {
+   __u64   flags;
+   __u64   process_table;
+};
+
+There are two bits that can be set in flags; KVM_PPC_MMUV3_RADIX and
+KVM_PPC_MMUV3_GTSE.  KVM_PPC_MMUV3_RADIX, if set, configures the guest
+to use radix tree translation, and if clear, to use HPT translation.
+KVM_PPC_MMUV3_GTSE, if set and if KVM permits it, configures the guest
+to be able to use the global TLB and SLB invalidation instructions;
+if clear, the guest may not use these instructions.
+
+The process_table field specifies the address and size of the guest
+process table, which is in the guest's space.  This field is formatted
+as the second doubleword of the partition table entry, as defined in
+the Power ISA V3.00, Book III section 5.7.6.1.
+
+4.100 KVM_PPC_GET_RMMU_INFO
+
+Capability: KVM_CAP_PPC_RADIX_MMU
+Architectures: ppc
+Type: vm ioctl
+Parameters: struct kvm_ppc_rmmu_info (out)
+Returns: 0 on success,
+-EFAULT if struct kvm_ppc_rmmu_info cannot be written,
+-EINVAL if no useful information can be returned
+
+This ioctl returns a structure containing two things: (a) a list
+containing supported radix tree geometries, and (b) a list that maps
+page sizes to put in the "AP" (actual page size) field for the tlbie
+(TLB invalidate entry) instruction.
+
+struct kvm_ppc_rmmu_info {
+   struct kvm_ppc_radix_geom {
+   __u8page_shift;
+   __u8level_bits[4];
+   __u8pad[3];
+   }   geometries[8];
+   __u32   ap_encodings[8];
+};
+
+The geometries[] field gives up to 8 supported geometries for the
+radix page table, in terms of the log base 2 of the smallest page
+size, and the number of bits indexed at each level of the tree, from
+the PTE level up to the PGD level in that order.  Any unused entries
+will have 0 in the page_shift field.
+
+The ap_encodings gives the supported page sizes and their AP field
+encodings, encoded with the AP value in the top 3 bits and the log
+base 2 of the page size in the bottom 6 bits.
+
 5. The kvm_run structure
 
 
@@ -3942,3 +4007,21 @@ In order to use SynIC, it has to be activated by setting 
this
 capability via KVM_ENABLE_CAP ioctl on the vcpu fd. Note that this
 will disable the use of APIC hardware virtualization even if supported
 by the CPU, as it's incompatible with SynIC auto-EOI behavior.
+
+8.3 KVM_CAP_PPC_RADIX_MMU
+
+Architectures: ppc
+
+This capability, if KVM_CHECK_EXTENSION indicates that it is
+available, means that that the kernel can support guests using the
+radix MMU defined in Power ISA V3.00 (as implemented in the POWER9
+processor).
+

[PATCH 08/18] KVM: PPC: Book3S HV: Don't try to signal cpu -1

2017-01-12 Thread Paul Mackerras

If the target vcpu for kvmppc_fast_vcpu_kick_hv() is not running on
any CPU, then we will have vcpu->arch.thread_cpu == -1, and as it
happens, kvmppc_fast_vcpu_kick_hv will call kvmppc_ipi_thread with
-1 as the cpu argument.  Although this is not meaningful, in the past,
before commit 1704a81ccebc ("KVM: PPC: Book3S HV: Use msgsnd for IPIs
to other cores on POWER9", 2016-11-18), it was harmless because CPU
-1 is not in the same core as any real CPU thread.  On a POWER9,
however, we don't do the "same core" check, so we were trying to
do a msgsnd to thread -1, which is invalid.  To avoid this, we add
a check to see that vcpu->arch.thread_cpu is >= 0 before calling
kvmppc_ipi_thread() with it.  Since vcpu->arch.thread_vcpu can change
asynchronously, we use READ_ONCE to ensure that the value we check is
the same value that we use as the argument to kvmppc_ipi_thread().

Fixes: 1704a81ccebc ("KVM: PPC: Book3S HV: Use msgsnd for IPIs to other cores 
on POWER9")
Signed-off-by: Paul Mackerras 
---
 arch/powerpc/kvm/book3s_hv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index ec34e39..8d9cc07 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -182,7 +182,8 @@ static void kvmppc_fast_vcpu_kick_hv(struct kvm_vcpu *vcpu)
++vcpu->stat.halt_wakeup;
}
 
-   if (kvmppc_ipi_thread(vcpu->arch.thread_cpu))
+   cpu = READ_ONCE(vcpu->arch.thread_cpu);
+   if (cpu >= 0 && kvmppc_ipi_thread(cpu))
return;
 
/* CPU points to the first thread of the core */
-- 
2.7.4

[PATCH 07/18] powerpc/64: Make type of partition table flush depend on partition type

2017-01-12 Thread Paul Mackerras

When changing a partition table entry on POWER9, we do a particular
form of the tlbie instruction which flushes all TLBs and caches of
the partition table for a given logical partition ID (LPID).
This instruction has a field in the instruction word, labelled R
(radix), which should be 1 if the partition was previously a radix
partition and 0 if it was a HPT partition.  This implements that
logic.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/mm/pgtable_64.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index 8bca7f5..d6b5e5c 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -454,13 +454,23 @@ void __init mmu_partition_table_init(void)
 void mmu_partition_table_set_entry(unsigned int lpid, unsigned long dw0,
   unsigned long dw1)
 {
+   unsigned long old = be64_to_cpu(partition_tb[lpid].patb0);
+
partition_tb[lpid].patb0 = cpu_to_be64(dw0);
partition_tb[lpid].patb1 = cpu_to_be64(dw1);
 
-   /* Global flush of TLBs and partition table caches for this lpid */
+   /*
+* Global flush of TLBs and partition table caches for this lpid.
+* The type of flush (hash or radix) depends on what the previous
+* use of this partition ID was, not the new use.
+*/
asm volatile("ptesync" : : : "memory");
-   asm volatile(PPC_TLBIE_5(%0,%1,2,0,0) : :
-"r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
+   if (old & PATB_HR)
+   asm volatile(PPC_TLBIE_5(%0,%1,2,0,1) : :
+"r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
+   else
+   asm volatile(PPC_TLBIE_5(%0,%1,2,0,0) : :
+"r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
asm volatile("eieio; tlbsync; ptesync" : : : "memory");
 }
 EXPORT_SYMBOL_GPL(mmu_partition_table_set_entry);
-- 
2.7.4

[PATCH 06/18] powerpc/64: Export pgtable_cache and pgtable_cache_add for KVM

2017-01-12 Thread Paul Mackerras

This exports the pgtable_cache array and the pgtable_cache_add
function so that HV KVM can use them for allocating radix page
tables for guests.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/mm/init-common.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/init-common.c b/arch/powerpc/mm/init-common.c
index a175cd8..2be5dc2 100644
--- a/arch/powerpc/mm/init-common.c
+++ b/arch/powerpc/mm/init-common.c
@@ -41,6 +41,7 @@ static void pmd_ctor(void *addr)
 }
 
 struct kmem_cache *pgtable_cache[MAX_PGTABLE_INDEX_SIZE];
+EXPORT_SYMBOL_GPL(pgtable_cache);  /* used by kvm_hv module */
 
 /*
  * Create a kmem_cache() for pagetables.  This is not used for PTE
@@ -82,7 +83,7 @@ void pgtable_cache_add(unsigned shift, void (*ctor)(void *))
pgtable_cache[shift - 1] = new;
pr_debug("Allocated pgtable cache for order %d\n", shift);
 }
-
+EXPORT_SYMBOL_GPL(pgtable_cache_add);  /* used by kvm_hv module */
 
 void pgtable_cache_init(void)
 {
-- 
2.7.4

[PATCH 05/18] powerpc/64: More definitions for POWER9

2017-01-12 Thread Paul Mackerras

This adds definitions for bits in the DSISR register which are used
by POWER9 for various translation-related exception conditions, and
for some more bits in the partition table entry that will be needed
by KVM.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/include/asm/book3s/64/mmu.h | 12 +++-
 arch/powerpc/include/asm/reg.h   |  4 
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h 
b/arch/powerpc/include/asm/book3s/64/mmu.h
index e8cbdc0..d827825 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -44,10 +44,20 @@ struct patb_entry {
 };
 extern struct patb_entry *partition_tb;
 
+/* Bits in patb0 field */
 #define PATB_HR(1UL << 63)
-#define PATB_GR(1UL << 63)
 #define RPDB_MASK  0x000fUL
 #define RPDB_SHIFT (1UL << 8)
+#define RTS1_SHIFT 61  /* top 2 bits of radix tree size */
+#define RTS1_MASK  (3UL << RTS1_SHIFT)
+#define RTS2_SHIFT 5   /* bottom 3 bits of radix tree size */
+#define RTS2_MASK  (7UL << RTS2_SHIFT)
+#define RPDS_MASK  0x1f/* root page dir. size field */
+
+/* Bits in patb1 field */
+#define PATB_GR(1UL << 63) /* guest uses radix; must match 
HR */
+#define PRTS_MASK  0x1f/* process table size field */
+
 /*
  * Limit process table to PAGE_SIZE table. This
  * also limit the max pid we can support.
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 0d4531a..aa44a83 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -274,10 +274,14 @@
 #define SPRN_DSISR 0x012   /* Data Storage Interrupt Status Register */
 #define   DSISR_NOHPTE 0x4000  /* no translation found */
 #define   DSISR_PROTFAULT  0x0800  /* protection fault */
+#define   DSISR_BADACCESS  0x0400  /* bad access to CI or G */
 #define   DSISR_ISSTORE0x0200  /* access was a store */
 #define   DSISR_DABRMATCH  0x0040  /* hit data breakpoint */
 #define   DSISR_NOSEGMENT  0x0020  /* SLB miss */
 #define   DSISR_KEYFAULT   0x0020  /* Key fault */
+#define   DSISR_UNSUPP_MMU 0x0008  /* Unsupported MMU config */
+#define   DSISR_SET_RC 0x0004  /* Failed setting of R/C bits */
+#define   DSISR_PGDIRFAULT  0x0002  /* Fault on page directory */
 #define SPRN_TBRL  0x10C   /* Time Base Read Lower Register (user, R/O) */
 #define SPRN_TBRU  0x10D   /* Time Base Read Upper Register (user, R/O) */
 #define SPRN_CIR   0x11B   /* Chip Information Register (hyper, R/0) */
-- 
2.7.4

[PATCH 04/18] powerpc/64: Enable use of radix MMU under hypervisor on POWER9

2017-01-12 Thread Paul Mackerras

To use radix as a guest, we first need to tell the hypervisor via
the ibm,client-architecture call first that we support POWER9 and
architecture v3.00, and that we can do either radix or hash and
that we would like to choose later using an hcall (the
H_REGISTER_PROC_TBL hcall).

Then we need to check whether the hypervisor agreed to us using
radix.  We need to do this very early on in the kernel boot process
before any of the MMU initialization is done.  If the hypervisor
doesn't agree, we can't use radix and therefore clear the radix
MMU feature bit.

Later, when we have set up our process table, which points to the
radix tree for each process, we need to install that using the
H_REGISTER_PROC_TBL hcall.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/include/asm/book3s/64/mmu.h |  2 ++
 arch/powerpc/include/asm/hvcall.h| 11 +++
 arch/powerpc/include/asm/prom.h  |  9 +
 arch/powerpc/kernel/prom_init.c  | 18 +-
 arch/powerpc/mm/init_64.c| 12 +++-
 arch/powerpc/mm/pgtable-radix.c  |  2 ++
 arch/powerpc/platforms/pseries/lpar.c| 29 +
 7 files changed, 77 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h 
b/arch/powerpc/include/asm/book3s/64/mmu.h
index 8afb0e0..e8cbdc0 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -138,5 +138,7 @@ static inline void setup_initial_memory_limit(phys_addr_t 
first_memblock_base,
 extern int (*register_process_table)(unsigned long base, unsigned long 
page_size,
 unsigned long tbl_size);
 
+extern void radix_init_pseries(void);
+
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_POWERPC_BOOK3S_64_MMU_H_ */
diff --git a/arch/powerpc/include/asm/hvcall.h 
b/arch/powerpc/include/asm/hvcall.h
index 77ff1ba..54d11b3 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -276,6 +276,7 @@
 #define H_GET_MPP_X0x314
 #define H_SET_MODE 0x31C
 #define H_CLEAR_HPT0x358
+#define H_REGISTER_PROC_TBL0x37C
 #define H_SIGNAL_SYS_RESET 0x380
 #define MAX_HCALL_OPCODE   H_SIGNAL_SYS_RESET
 
@@ -313,6 +314,16 @@
 #define H_SIGNAL_SYS_RESET_ALL_OTHERS  -2
 /* >= 0 values are CPU number */
 
+/* Flag values used in H_REGISTER_PROC_TBL hcall */
+#define PROC_TABLE_OP_MASK 0x18
+#define PROC_TABLE_DEREG   0x10
+#define PROC_TABLE_NEW 0x18
+#define PROC_TABLE_TYPE_MASK   0x06
+#define PROC_TABLE_HPT_SLB 0x00
+#define PROC_TABLE_HPT_PT  0x02
+#define PROC_TABLE_RADIX   0x04
+#define PROC_TABLE_GTSE0x01
+
 #ifndef __ASSEMBLY__
 
 /**
diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
index e6d83d0..8af2546 100644
--- a/arch/powerpc/include/asm/prom.h
+++ b/arch/powerpc/include/asm/prom.h
@@ -121,6 +121,8 @@ struct of_drconf_cell {
 #define OV1_PPC_2_06   0x02/* set if we support PowerPC 2.06 */
 #define OV1_PPC_2_07   0x01/* set if we support PowerPC 2.07 */
 
+#define OV1_PPC_3_00   0x80/* set if we support PowerPC 3.00 */
+
 /* Option vector 2: Open Firmware options supported */
 #define OV2_REAL_MODE  0x20/* set if we want OF in real mode */
 
@@ -155,6 +157,13 @@ struct of_drconf_cell {
 #define OV5_PFO_HW_842 0x1140  /* PFO Compression Accelerator */
 #define OV5_PFO_HW_ENCR0x1120  /* PFO Encryption Accelerator */
 #define OV5_SUB_PROCESSORS 0x1501  /* 1,2,or 4 Sub-Processors supported */
+#define OV5_XIVE_EXPLOIT   0x1701  /* XIVE exploitation supported */
+#define OV5_MMU_RADIX_300  0x1880  /* ISA v3.00 radix MMU supported */
+#define OV5_MMU_HASH_300   0x1840  /* ISA v3.00 hash MMU supported */
+#define OV5_MMU_SEGM_RADIX 0x1820  /* radix mode (no segmentation) */
+#define OV5_MMU_PROC_TBL   0x1810  /* hcall selects SLB or proc table */
+#define OV5_MMU_SLB0x1800  /* always use SLB */
+#define OV5_MMU_GTSE   0x1808  /* Guest translation shootdown */
 
 /* Option Vector 6: IBM PAPR hints */
 #define OV6_LINUX  0x02/* Linux is our OS */
diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index ec47a93..358d43f 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -649,6 +649,7 @@ static void __init early_cmdline_parse(void)
 struct option_vector1 {
u8 byte1;
u8 arch_versions;
+   u8 arch_versions3;
 } __packed;
 
 struct option_vector2 {
@@ -691,6 +692,9 @@ struct option_vector5 {
u8 reserved2;
__be16 reserved3;
u8 subprocessors;
+   u8 byte22;
+   u8 intarch;
+   u8 mmu;
 } __packed;
 
 struct option_vector6 {
@@ -700,7 +704,7 @@ struct option_vector6 {
 } __packed;
 
 struct ibm_arch_vec {
-   struct { u32 mask, val; } pvrs[10];
+

[PATCH 03/18] powerpc/64: Always enable radix support for 64-bit Book 3S kernels

2017-01-12 Thread Paul Mackerras

This removes the ability for the user to choose whether or not to
include support for the radix MMU in kernels built to run on 64-bit
Book 3S machines.  Excluding radix support saves only about 25kiB
of text and 13kiB of data, a total of little over half a page.
Having the option expands the space of option combinations that
need to be tested, which is an ongoing burden on developers.
Given that the space savings are small, let's remove the option.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/platforms/Kconfig.cputype | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/Kconfig.cputype 
b/arch/powerpc/platforms/Kconfig.cputype
index 6e89e5a..de88156 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -334,13 +334,8 @@ config PPC_STD_MMU_64
depends on PPC_STD_MMU && PPC64
 
 config PPC_RADIX_MMU
-   bool "Radix MMU Support"
+   def_bool y
depends on PPC_BOOK3S_64
-   default y
-   help
- Enable support for the Power ISA 3.0 Radix style MMU. Currently this
- is only implemented by IBM Power9 CPUs, if you don't have one of them
- you can probably disable this.
 
 config PPC_MMU_NOHASH
def_bool y
-- 
2.7.4

[PATCH 02/18] powerpc/64: Fixes for the ibm, client-architecture-support options

2017-01-12 Thread Paul Mackerras

This fixes the values for some of the option vector 5 bits in
the ibm,client-architecture-support vector 5.  The "platform
facilities options" bits are in byte 17 not byte 14, so the
upper 8 bits of their definitions need to be 0x11 not 0x0E.
The "sub processor support" option is in byte 21 not byte 15.

When checking whether option bits are set, we should check that
the offset of the byte being checked is less than the vector
length that we got from the hypervisor.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/include/asm/prom.h   | 8 
 arch/powerpc/platforms/pseries/firmware.c | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
index 5e57705..e6d83d0 100644
--- a/arch/powerpc/include/asm/prom.h
+++ b/arch/powerpc/include/asm/prom.h
@@ -151,10 +151,10 @@ struct of_drconf_cell {
 #define OV5_XCMO   0x0440  /* Page Coalescing */
 #define OV5_TYPE1_AFFINITY 0x0580  /* Type 1 NUMA affinity */
 #define OV5_PRRN   0x0540  /* Platform Resource Reassignment */
-#define OV5_PFO_HW_RNG 0x0E80  /* PFO Random Number Generator */
-#define OV5_PFO_HW_842 0x0E40  /* PFO Compression Accelerator */
-#define OV5_PFO_HW_ENCR0x0E20  /* PFO Encryption Accelerator */
-#define OV5_SUB_PROCESSORS 0x0F01  /* 1,2,or 4 Sub-Processors supported */
+#define OV5_PFO_HW_RNG 0x1180  /* PFO Random Number Generator */
+#define OV5_PFO_HW_842 0x1140  /* PFO Compression Accelerator */
+#define OV5_PFO_HW_ENCR0x1120  /* PFO Encryption Accelerator */
+#define OV5_SUB_PROCESSORS 0x1501  /* 1,2,or 4 Sub-Processors supported */
 
 /* Option Vector 6: IBM PAPR hints */
 #define OV6_LINUX  0x02/* Linux is our OS */
diff --git a/arch/powerpc/platforms/pseries/firmware.c 
b/arch/powerpc/platforms/pseries/firmware.c
index ea7f09b..7d67623 100644
--- a/arch/powerpc/platforms/pseries/firmware.c
+++ b/arch/powerpc/platforms/pseries/firmware.c
@@ -126,7 +126,7 @@ static void __init fw_vec5_feature_init(const char *vec5, 
unsigned long len)
index = OV5_INDX(vec5_fw_features_table[i].feature);
feat = OV5_FEAT(vec5_fw_features_table[i].feature);
 
-   if (vec5[index] & feat)
+   if (index < len && (vec5[index] & feat))
powerpc_firmware_features |=
vec5_fw_features_table[i].val;
}
-- 
2.7.4

[PATCH 01/18] powerpc/64: Don't try to use radix MMU under a hypervisor

2017-01-12 Thread Paul Mackerras

Currently, if the kernel is running on a POWER9 processor under a
hypervisor, it will try to use the radix MMU even though it doesn't
have the necessary code to use radix under a hypervisor (it doesn't
negotiate use of radix, and it doesn't do the H_REGISTER_PROC_TBL
hcall).  The result is that the guest kernel will crash when it tries
to turn on the MMU.

This fixes it by looking for the /chosen/ibm,architecture-vec-5
property, and if it exists, clears the radix MMU feature bit,
before we decide whether to initialize for radix or HPT.  This
property is created by the hypervisor as a result of the guest
calling the ibm,client-architecture-support method to indicate
its capabilities, so it will indicate whether the hypervisor
agreed to us using radix.

Systems without a hypervisor may have this property also (for
example, skiboot creates it), so we check the HV bit in the MSR
to see whether we are running as a guest or not.  If we are in
hypervisor mode, then we can do whatever we like including
using the radix MMU.

The reason for using this property is that in future, when we
have support for using radix under a hypervisor, we will need
to check this property to see whether the hypervisor agreed to
us using radix.

Cc: sta...@vger.kernel.org # v4.8+
Signed-off-by: Paul Mackerras 
---
 arch/powerpc/mm/init_64.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index 93abf8a..4d9481e 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -42,6 +42,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include 
 #include 
@@ -344,12 +346,43 @@ static int __init parse_disable_radix(char *p)
 }
 early_param("disable_radix", parse_disable_radix);
 
+/*
+ * If we're running under a hypervisor, we currently can't do radix
+ * since we don't have the code to do the H_REGISTER_PROC_TBL hcall.
+ * We tell that we're running under a hypervisor by looking for the
+ * /chosen/ibm,architecture-vec-5 property.
+ */
+static void early_check_vec5(void)
+{
+   unsigned long root, chosen;
+   int size;
+   const u8 *vec5;
+
+   root = of_get_flat_dt_root();
+   chosen = of_get_flat_dt_subnode_by_name(root, "chosen");
+   if (chosen == -FDT_ERR_NOTFOUND)
+   return;
+   vec5 = of_get_flat_dt_prop(chosen, "ibm,architecture-vec-5", );
+   if (!vec5)
+   return;
+   cur_cpu_spec->mmu_features &= ~MMU_FTR_TYPE_RADIX;
+}
+
 void __init mmu_early_init_devtree(void)
 {
/* Disable radix mode based on kernel command line. */
if (disable_radix)
cur_cpu_spec->mmu_features &= ~MMU_FTR_TYPE_RADIX;
 
+   /*
+* Check /chosen/ibm,architecture-vec-5 if running as a guest.
+* When running bare-metal, we can use radix if we like
+* even though the ibm,architecture-vec-5 property created by
+* skiboot doesn't have the necessary bits set.
+*/
+   if (early_radix_enabled() && !(mfmsr() & MSR_HV))
+   early_check_vec5();
+
if (early_radix_enabled())
radix__early_init_devtree();
else
-- 
2.7.4

[PATCH 00/18] Support for radix guest and host on POWER9

2017-01-12 Thread Paul Mackerras

The primary purpose of this patch series is to make it possible to run
a guest on POWER9 using the radix MMU under a KVM host that also uses
the radix MMU.  To do this, the guest needs to say that it supports
radix in the ibm,client-architecture-support vector, and if the host
agrees, the guest then needs to call the new H_REGISTER_PROCESS_TABLE
hypercall to tell the hypervisor where its process table is (the
process table contains pointers to the radix trees for each process).

On the host side, we add new KVM interfaces for userspace (e.g. QEMU)
to use to know whether the host supports radix guest, HPT guests, or
both, and for userspace to use to implement H_REGISTER_PROCESS_TABLE.
There are two new capabilities and two new ioctls.  These are added in
patch 9/18.

The host then needs to be able to construct the second-level radix
tree ("partition-scoped" in architecturese) which maps guest real
addresses to host real addresses.  We add PTEs to this table on
hypervisor page faults and remove them in response to MMU notifier
callbacks.

The patch series also includes some improvements for HPT guests
running under a HPT host.

Currently, the MMU type of the guest (radix or HPT) must be the same
as the host.  That is, if the host is booted in radix mode, it can
only run radix guests.  If the host is booted in HPT mode (e.g. by
putting "disable_radix" on the kernel command line), it can only run
HPT guests.

The patch series is against 4.10-rc3.

Paul.

 Documentation/virtual/kvm/api.txt |  83 
 arch/powerpc/include/asm/book3s/64/mmu.h  |  14 +-
 arch/powerpc/include/asm/hvcall.h |  11 +
 arch/powerpc/include/asm/kvm_book3s.h |  26 +-
 arch/powerpc/include/asm/kvm_book3s_64.h  |   6 +
 arch/powerpc/include/asm/kvm_host.h   |   4 +
 arch/powerpc/include/asm/kvm_ppc.h|   2 +
 arch/powerpc/include/asm/prom.h   |  17 +-
 arch/powerpc/include/asm/reg.h|   4 +
 arch/powerpc/include/uapi/asm/kvm.h   |  20 +
 arch/powerpc/kernel/asm-offsets.c |   2 +
 arch/powerpc/kernel/prom_init.c   |  18 +-
 arch/powerpc/kvm/Makefile |   3 +-
 arch/powerpc/kvm/book3s.c |   1 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c   | 110 +++--
 arch/powerpc/kvm/book3s_64_mmu_radix.c| 716 ++
 arch/powerpc/kvm/book3s_hv.c  | 168 +--
 arch/powerpc/kvm/book3s_hv_rm_mmu.c   |  14 +
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   |  66 ++-
 arch/powerpc/kvm/powerpc.c|  32 ++
 arch/powerpc/mm/init-common.c |   3 +-
 arch/powerpc/mm/init_64.c |  35 ++
 arch/powerpc/mm/pgtable-radix.c   |   2 +
 arch/powerpc/mm/pgtable_64.c  |  16 +-
 arch/powerpc/platforms/Kconfig.cputype|   7 +-
 arch/powerpc/platforms/pseries/firmware.c |   2 +-
 arch/powerpc/platforms/pseries/lpar.c |  29 ++
 include/uapi/linux/kvm.h  |   6 +
 28 files changed, 1318 insertions(+), 99 deletions(-)

Re: [PATCH v4 2/2] KVM: PPC: Exit guest upon MCE when FWNMI capability is enabled

2017-01-12 Thread Balbir Singh

On Mon, Jan 09, 2017 at 05:10:45PM +0530, Aravinda Prasad wrote:
> Enhance KVM to cause a guest exit with KVM_EXIT_NMI
> exit reason upon a machine check exception (MCE) in
> the guest address space if the KVM_CAP_PPC_FWNMI
> capability is enabled (instead of delivering a 0x200
> interrupt to guest). This enables QEMU to build error
> log and deliver machine check exception to guest via
> guest registered machine check handler.
> 
> This approach simplifies the delivery of machine
> check exception to guest OS compared to the earlier
> approach of KVM directly invoking 0x200 guest interrupt
> vector.
> 
> This design/approach is based on the feedback for the
> QEMU patches to handle machine check exception. Details
> of earlier approach of handling machine check exception
> in QEMU and related discussions can be found at:
> 
> https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html
> 
> Note:
> 
> This patch introduces a hook which is invoked at the time
> of guest exit to facilitate the host-side handling of
> machine check exception before the exception is passed
> on to the guest. Hence, the host-side handling which was
> performed earlier via machine_check_fwnmi is removed.
> 
> The reasons for this approach is (i) it is not possible
> to distinguish whether the exception occurred in the
> guest or the host from the pt_regs passed on the
> machine_check_exception(). Hence machine_check_exception()
> calls panic, instead of passing on the exception to
> the guest, if the machine check exception is not
> recoverable. (ii) the approach introduced in this
> patch gives opportunity to the host kernel to perform
> actions in virtual mode before passing on the exception
> to the guest. This approach does not require complex
> tweaks to machine_check_fwnmi and friends.

It would be good to qualify the different types of MCE
and what action we expect across hypervisor and guest.

> 
> Signed-off-by: Aravinda Prasad 
> ---
>  arch/powerpc/kvm/book3s_hv.c|   27 +-
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S |   47 
> ---
>  arch/powerpc/platforms/powernv/opal.c   |   10 +++
>  3 files changed, 54 insertions(+), 30 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 3686471..cae4921 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -123,6 +123,7 @@ MODULE_PARM_DESC(halt_poll_ns_shrink, "Factor halt poll 
> time is shrunk by");
>  
>  static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
>  static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
> +static void kvmppc_machine_check_hook(void);
>  
>  static inline struct kvm_vcpu *next_runnable_thread(struct kvmppc_vcore *vc,
>   int *ip)
> @@ -954,15 +955,14 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, 
> struct kvm_vcpu *vcpu,
>   r = RESUME_GUEST;
>   break;
>   case BOOK3S_INTERRUPT_MACHINE_CHECK:
> + /* Exit to guest with KVM_EXIT_NMI as exit reason */
> + run->exit_reason = KVM_EXIT_NMI;
> + r = RESUME_HOST;
>   /*
> -  * Deliver a machine check interrupt to the guest.
> -  * We have to do this, even if the host has handled the
> -  * machine check, because machine checks use SRR0/1 and
> -  * the interrupt might have trashed guest state in them.
> +  * Invoke host-kernel handler to perform any host-side
> +  * handling before exiting the guest.
>*/
> - kvmppc_book3s_queue_irqprio(vcpu,
> - BOOK3S_INTERRUPT_MACHINE_CHECK);
> - r = RESUME_GUEST;
> + kvmppc_machine_check_hook();
>   break;
>   case BOOK3S_INTERRUPT_PROGRAM:
>   {
> @@ -3491,6 +3491,19 @@ static void kvmppc_irq_bypass_del_producer_hv(struct 
> irq_bypass_consumer *cons,
>  }
>  #endif
>  
> +/*
> + * Hook to handle machine check exceptions occurred inside a guest.
> + * This hook is invoked from host virtual mode from KVM before exiting
> + * the guest with KVM_EXIT_NMI exit reason. This gives an opportunity
> + * for the host to take action (if any) before passing on the machine
> + * check exception to the guest kernel.
> + */
> +static void kvmppc_machine_check_hook(void)
> +{
> + if (ppc_md.machine_check_exception)
> + ppc_md.machine_check_exception(NULL);
> +}
> +
>  static long kvm_arch_vm_ioctl_hv(struct file *filp,
>unsigned int ioctl, unsigned long arg)
>  {
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index c3c1d1b..9b41390 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -134,21 +134,18 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
>   stb r0,

Re: [PATCH kernel v3] KVM: PPC: Add in-kernel acceleration for VFIO

2017-01-12 Thread Alexey Kardashevskiy

On 12/01/17 16:04, David Gibson wrote:
> On Tue, Dec 20, 2016 at 05:52:29PM +1100, Alexey Kardashevskiy wrote:
>> This allows the host kernel to handle H_PUT_TCE, H_PUT_TCE_INDIRECT
>> and H_STUFF_TCE requests targeted an IOMMU TCE table used for VFIO
>> without passing them to user space which saves time on switching
>> to user space and back.
>>
>> This adds H_PUT_TCE/H_PUT_TCE_INDIRECT/H_STUFF_TCE handlers to KVM.
>> KVM tries to handle a TCE request in the real mode, if failed
>> it passes the request to the virtual mode to complete the operation.
>> If it a virtual mode handler fails, the request is passed to
>> the user space; this is not expected to happen though.
>>
>> To avoid dealing with page use counters (which is tricky in real mode),
>> this only accelerates SPAPR TCE IOMMU v2 clients which are required
>> to pre-register the userspace memory. The very first TCE request will
>> be handled in the VFIO SPAPR TCE driver anyway as the userspace view
>> of the TCE table (iommu_table::it_userspace) is not allocated till
>> the very first mapping happens and we cannot call vmalloc in real mode.
>>
>> This adds new attribute - KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE - to
>> the VFIO KVM device. It takes a VFIO group fd and SPAPR TCE table fd
>> and associates a physical IOMMU table with the SPAPR TCE table (which
>> is a guest view of the hardware IOMMU table). The iommu_table object
>> is cached and referenced so we do not have to look up for it in real mode.
>>
>> This does not implement the UNSET counterpart as there is no use for it -
>> once the acceleration is enabled, the existing userspace won't
>> disable it unless a VFIO container is destroyed; this adds necessary
>> cleanup to the KVM_DEV_VFIO_GROUP_DEL handler.
>>
>> As this creates a descriptor per IOMMU table-LIOBN couple (called
>> kvmppc_spapr_tce_iommu_table), it is possible to have several
>> descriptors with the same iommu_table (hardware IOMMU table) attached
>> to the same LIOBN, this is done to simplify the cleanup and can be
>> improved later.
>>
>> This advertises the new KVM_CAP_SPAPR_TCE_VFIO capability to the user
>> space.
>>
>> This finally makes use of vfio_external_user_iommu_id() which was
>> introduced quite some time ago and was considered for removal.
>>
>> Tests show that this patch increases transmission speed from 220MB/s
>> to 750..1020MB/s on 10Gb network (Chelsea CXGB3 10Gb ethernet card).
>>
>> Signed-off-by: Alexey Kardashevskiy 
>> ---
>> Changes:
>> v3:
>> * simplified not to use VFIO group notifiers
>> * reworked cleanup, should be cleaner/simpler now
>>
>> v2:
>> * reworked to use new VFIO notifiers
>> * now same iommu_table may appear in the list several times, to be fixed 
>> later
>> ---
>>
>> This obsoletes:
>>
>> [PATCH kernel v2 08/11] KVM: PPC: Pass kvm* to kvmppc_find_table()
>> [PATCH kernel v2 09/11] vfio iommu: Add helpers to (un)register blocking 
>> notifiers per group
>> [PATCH kernel v2 11/11] KVM: PPC: Add in-kernel acceleration for VFIO
>>
>>
>> So I have not reposted the whole thing, should have I?
>>
>>
>> btw "F: virt/kvm/vfio.*"  is missing MAINTAINERS.
>>
>>
>> ---
>>  Documentation/virtual/kvm/devices/vfio.txt |  22 ++-
>>  arch/powerpc/include/asm/kvm_host.h|   8 +
>>  arch/powerpc/include/asm/kvm_ppc.h |   4 +
>>  include/uapi/linux/kvm.h   |   8 +
>>  arch/powerpc/kvm/book3s_64_vio.c   | 286 
>> +
>>  arch/powerpc/kvm/book3s_64_vio_hv.c| 178 ++
>>  arch/powerpc/kvm/powerpc.c |   2 +
>>  virt/kvm/vfio.c|  88 +
>>  8 files changed, 594 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/virtual/kvm/devices/vfio.txt 
>> b/Documentation/virtual/kvm/devices/vfio.txt
>> index ef51740c67ca..f95d867168ea 100644
>> --- a/Documentation/virtual/kvm/devices/vfio.txt
>> +++ b/Documentation/virtual/kvm/devices/vfio.txt
>> @@ -16,7 +16,25 @@ Groups:
>>  
>>  KVM_DEV_VFIO_GROUP attributes:
>>KVM_DEV_VFIO_GROUP_ADD: Add a VFIO group to VFIO-KVM device tracking
>> +kvm_device_attr.addr points to an int32_t file descriptor
>> +for the VFIO group.
>>KVM_DEV_VFIO_GROUP_DEL: Remove a VFIO group from VFIO-KVM device tracking
>> +kvm_device_attr.addr points to an int32_t file descriptor
>> +for the VFIO group.
>> +  KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: attaches a guest visible TCE table
>> +allocated by sPAPR KVM.
>> +kvm_device_attr.addr points to a struct:
>>  
>> -For each, kvm_device_attr.addr points to an int32_t file descriptor
>> -for the VFIO group.
>> +struct kvm_vfio_spapr_tce {
>> +__u32   argsz;
>> +__u32   flags;
>> +__s32   groupfd;
>> +__s32   tablefd;
>> +};
>> +
>> +where
>> +@argsz is the size of kvm_vfio_spapr_tce_liobn;
>> +@flags are not supported now, must be zero;
>> +@groupfd is a file descriptor for a VFIO group;

88 matches

Mail list logo