Re: [PATCH v6 2/4] resource: Use list_head to link sibling resource

2018-07-07 Thread Baoquan He
Hi,

On 07/05/18 at 01:00am, kbuild test robot wrote:
> Hi Baoquan,
> 
> I love your patch! Yet something to improve:
> 
> [auto build test ERROR on linus/master]
> [also build test ERROR on v4.18-rc3 next-20180704]
> [if your patch is applied to the wrong git tree, please drop us a note to 
> help improve the system]

Thanks for telling. 

I cloned 0day-ci/linut to my local pc.
https://github.com/0day-ci/linux.git

However, I didn't find below branch. And tried to open it in web
broswer, also failed.


> url:
> https://github.com/0day-ci/linux/commits/Baoquan-He/resource-Use-list_head-to-link-sibling-resource/20180704-121402
> config: mips-rb532_defconfig (attached as .config)
> compiler: mipsel-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
> reproduce:
> wget 
> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
> ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # save the attached .config to linux build tree
> GCC_VERSION=7.2.0 make.cross ARCH=mips 

I did find a old one which is for the old version 5 post.

[bhe@linux]$ git remote -v
0day-ci https://github.com/0day-ci/linux.git (fetch)
0day-ci https://github.com/0day-ci/linux.git (push)
[bhe@dhcp-128-28 linux]$ git branch -a| grep Baoquan| grep resource
  
remotes/0day-ci/Baoquan-He/resource-Use-list_head-to-link-sibling-resource/20180612-113600

Could you help have a look at this?

Thanks
Baoquan

> 
> All error/warnings (new ones prefixed by >>):
> 
> >> arch/mips/pci/pci-rc32434.c:57:11: error: initialization from incompatible 
> >> pointer type [-Werror=incompatible-pointer-types]
>  .child = _res_pci_mem2
>   ^
>arch/mips/pci/pci-rc32434.c:57:11: note: (near initialization for 
> 'rc32434_res_pci_mem1.child.next')
> >> arch/mips/pci/pci-rc32434.c:51:47: warning: missing braces around 
> >> initializer [-Wmissing-braces]
> static struct resource rc32434_res_pci_mem1 = {
>   ^
>arch/mips/pci/pci-rc32434.c:60:47: warning: missing braces around 
> initializer [-Wmissing-braces]
> static struct resource rc32434_res_pci_mem2 = {
>   ^
>cc1: some warnings being treated as errors
> 
> vim +57 arch/mips/pci/pci-rc32434.c
> 
> 73b4390f Ralf Baechle 2008-07-16  50  
> 73b4390f Ralf Baechle 2008-07-16 @51  static struct resource 
> rc32434_res_pci_mem1 = {
> 73b4390f Ralf Baechle 2008-07-16  52  .name = "PCI MEM1",
> 73b4390f Ralf Baechle 2008-07-16  53  .start = 0x5000,
> 73b4390f Ralf Baechle 2008-07-16  54  .end = 0x5FFF,
> 73b4390f Ralf Baechle 2008-07-16  55  .flags = IORESOURCE_MEM,
> 73b4390f Ralf Baechle 2008-07-16  56  .sibling = NULL,
> 73b4390f Ralf Baechle 2008-07-16 @57  .child = _res_pci_mem2
> 73b4390f Ralf Baechle 2008-07-16  58  };
> 73b4390f Ralf Baechle 2008-07-16  59  
> 
> :: The code at line 57 was first introduced by commit
> :: 73b4390fb23456964201abda79f1210fe337d01a [MIPS] Routerboard 532: 
> Support for base system
> 
> :: TO: Ralf Baechle 
> :: CC: Ralf Baechle 
> 
> ---
> 0-DAY kernel test infrastructureOpen Source Technology Center
> https://lists.01.org/pipermail/kbuild-all   Intel Corporation




Re: powerpc: 32BIT vs. 64BIT (PPC32 vs. PPC64)

2018-07-07 Thread Randy Dunlap
On 07/07/2018 05:13 AM, Nicholas Piggin wrote:
> On Fri, 6 Jul 2018 21:58:29 -0700
> Randy Dunlap  wrote:
> 
>> On 07/06/2018 06:45 PM, Benjamin Herrenschmidt wrote:
>>> On Thu, 2018-07-05 at 14:30 -0700, Randy Dunlap wrote:  
 Hi,

 Is there a good way (or a shortcut) to do something like:

 $ make ARCH=powerpc O=PPC32 [other_options] allmodconfig
   to get a PPC32/32BIT allmodconfig

 and also be able to do:

 $make ARCH=powerpc O=PPC64 [other_options] allmodconfig
   to get a PPC64/64BIT allmodconfig?  
>>>
>>> Hrm... O= is for the separate build dir, so there much be something
>>> else.
>>>
>>> You mean having ARCH= aliases like ppc/ppc32 and ppc64 ?  
>>
>> Yes.
>>
>>> That would be a matter of overriding some .config defaults I suppose, I
>>> don't know how this is done on other archs.
>>>
>>> I see the aliasing trick in the Makefile but that's about it.
>>>   
 Note that arch/x86, arch/sh, and arch/sparc have ways to do
 some flavor(s) of this (from Documentation/kbuild/kbuild.txt;
 sh and sparc based on a recent "fix" patch from me):  
>>>
>>> I fail to see what you are actually talking about here ... sorry. Do
>>> you have concrete examples on x86 or sparc ? From what I can tell the
>>> "i386" or "sparc32/sparc64" aliases just change SRCARCH in Makefile and
>>> 32 vs 64-bit is just a Kconfig option...  
>>
>> Yes, your summary is mostly correct.
>>
>> I'm just looking for a way to do cross-compile builds that are close to
>> ppc32 allmodconfig and ppc64 allmodconfig.
> 
> Would there a problem with adding ARCH=ppc32 / ppc64 matching? This
> seems to work...
> 
> Thanks,
> Nick

Yes, this mostly works and is similar to a patch (my patch) on my test machine.
And they both work for allmodconfig, which is my primary build target.

And they both have one little quirk that is confusing when the build target
is defconfig:

When ARCH=ppc32, the terminal output (stdout) is: (using O=PPC32)

make[1]: Entering directory '/home/rdunlap/lnx/lnx-418-rc3/PPC32'
  GEN ./Makefile
*** Default configuration is based on 'ppc64_defconfig'   < NOTE <
#
# configuration written to .config
#
make[1]: Leaving directory '/home/rdunlap/lnx/lnx-418-rc3/PPC32'


I expect that can be fixed also.  :)

And the written .config file is indeed for 32BIT, not 64BIT.

Thanks, Nick.

> ---
>  Makefile   | 8 
>  arch/powerpc/Kconfig   | 9 +
>  arch/powerpc/platforms/Kconfig.cputype | 8 
>  3 files changed, 17 insertions(+), 8 deletions(-)
> 
> diff --git a/Makefile b/Makefile
> index c5ce55cbc543..f97204aed17a 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -345,6 +345,14 @@ ifeq ($(ARCH),sh64)
> SRCARCH := sh
>  endif
>  
> +# Additional ARCH settings for powerpc
> +ifeq ($(ARCH),ppc32)
> +   SRCARCH := powerpc
> +endif
> +ifeq ($(ARCH),ppc64)
> +   SRCARCH := powerpc
> +endif
> +
>  KCONFIG_CONFIG   ?= .config
>  export KCONFIG_CONFIG
>  
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 9f2b75fe2c2d..3405b1b122be 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -1,4 +1,13 @@
>  # SPDX-License-Identifier: GPL-2.0
> +
> +config PPC64
> + bool "64-bit kernel" if "$(ARCH)" = "powerpc"
> + default "$(ARCH)" != "ppc32"
> + select ZLIB_DEFLATE
> + help
> +   This option selects whether a 32-bit or a 64-bit kernel
> +   will be built.
> +
>  source "arch/powerpc/platforms/Kconfig.cputype"
>  
>  config PPC32
> diff --git a/arch/powerpc/platforms/Kconfig.cputype 
> b/arch/powerpc/platforms/Kconfig.cputype
> index e6a1de521319..f6e5d6ef9782 100644
> --- a/arch/powerpc/platforms/Kconfig.cputype
> +++ b/arch/powerpc/platforms/Kconfig.cputype
> @@ -1,12 +1,4 @@
>  # SPDX-License-Identifier: GPL-2.0
> -config PPC64
> - bool "64-bit kernel"
> - default n
> - select ZLIB_DEFLATE
> - help
> -   This option selects whether a 32-bit or a 64-bit kernel
> -   will be built.
> -
>  menu "Processor support"
>  choice
>   prompt "Processor Type"
> 


-- 
~Randy


Re: [PATCH kernel v5 2/2] KVM: PPC: Check if IOMMU page is contained in the pinned physical page

2018-07-07 Thread Nicholas Piggin
On Sat, 7 Jul 2018 22:44:45 +1000
Alexey Kardashevskiy  wrote:

> On Sat, 7 Jul 2018 21:43:03 +1000
> Nicholas Piggin  wrote:
> 
> > On Sat,  7 Jul 2018 20:44:10 +1000
> > Alexey Kardashevskiy  wrote:
> >   
> > > A VM which has:
> > >  - a DMA capable device passed through to it (eg. network card);
> > >  - running a malicious kernel that ignores H_PUT_TCE failure;
> > >  - capability of using IOMMU pages bigger that physical pages
> > > can create an IOMMU mapping that exposes (for example) 16MB of
> > > the host physical memory to the device when only 64K was allocated to the 
> > > VM.
> > > 
> > > The remaining 16MB - 64K will be some other content of host memory, 
> > > possibly
> > > including pages of the VM, but also pages of host kernel memory, host
> > > programs or other VMs.
> > > 
> > > The attacking VM does not control the location of the page it can map,
> > > and is only allowed to map as many pages as it has pages of RAM.
> > > 
> > > We already have a check in drivers/vfio/vfio_iommu_spapr_tce.c that
> > > an IOMMU page is contained in the physical page so the PCI hardware won't
> > > get access to unassigned host memory; however this check is missing in
> > > the KVM fastpath (H_PUT_TCE accelerated code). We were lucky so far and
> > > did not hit this yet as the very first time when the mapping happens
> > > we do not have tbl::it_userspace allocated yet and fall back to
> > > the userspace which in turn calls VFIO IOMMU driver, this fails and
> > > the guest does not retry,
> > > 
> > > This stores the smallest preregistered page size in the preregistered
> > > region descriptor and changes the mm_iommu_xxx API to check this against
> > > the IOMMU page size. This calculates maximum page size as a minimum of
> > > the natural region alignment and hugepagetlb's compound page size.
> > > 
> > > Signed-off-by: Alexey Kardashevskiy 
> > > ---
> > > Changes:
> > > v5:
> > > * only consider compound pages from hugetlbfs
> > > 
> > > v4:
> > > * reimplemented max pageshift calculation
> > > 
> > > v3:
> > > * fixed upper limit for the page size
> > > * added checks that we don't register parts of a huge page
> > > 
> > > v2:
> > > * explicitely check for compound pages before calling compound_order()
> > > 
> > > ---
> > > The bug is: run QEMU _without_ hugepages (no -mempath) and tell it to
> > > advertise 16MB pages to the guest; a typical pseries guest will use 16MB
> > > for IOMMU pages without checking the mmu pagesize and this will fail
> > > at 
> > > https://git.qemu.org/?p=qemu.git;a=blob;f=hw/vfio/common.c;h=fb396cf00ac40eb35967a04c9cc798ca896eed57;hb=refs/heads/master#l256
> > > 
> > > With the change, mapping will fail in KVM and the guest will print:
> > > 
> > > mlx5_core :00:00.0: ibm,create-pe-dma-window(2027) 0 800 2000 
> > > 18 1f returned 0 (liobn = 0x8001 starting addr = 800 0)
> > > mlx5_core :00:00.0: created tce table LIOBN 0x8001 for 
> > > /pci@8002000/ethernet@0
> > > mlx5_core :00:00.0: failed to map direct window for 
> > > /pci@8002000/ethernet@0: -1
> > > ---
> > >  arch/powerpc/include/asm/mmu_context.h |  4 ++--
> > >  arch/powerpc/kvm/book3s_64_vio.c   |  2 +-
> > >  arch/powerpc/kvm/book3s_64_vio_hv.c|  6 --
> > >  arch/powerpc/mm/mmu_context_iommu.c| 33 
> > > +++--
> > >  drivers/vfio/vfio_iommu_spapr_tce.c|  2 +-
> > >  5 files changed, 35 insertions(+), 12 deletions(-)
> > > 
> > > diff --git a/arch/powerpc/include/asm/mmu_context.h 
> > > b/arch/powerpc/include/asm/mmu_context.h
> > > index 896efa5..79d570c 100644
> > > --- a/arch/powerpc/include/asm/mmu_context.h
> > > +++ b/arch/powerpc/include/asm/mmu_context.h
> > > @@ -35,9 +35,9 @@ extern struct mm_iommu_table_group_mem_t 
> > > *mm_iommu_lookup_rm(
> > >  extern struct mm_iommu_table_group_mem_t *mm_iommu_find(struct mm_struct 
> > > *mm,
> > >   unsigned long ua, unsigned long entries);
> > >  extern long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
> > > - unsigned long ua, unsigned long *hpa);
> > > + unsigned long ua, unsigned int pageshift, unsigned long *hpa);
> > >  extern long mm_iommu_ua_to_hpa_rm(struct mm_iommu_table_group_mem_t *mem,
> > > - unsigned long ua, unsigned long *hpa);
> > > + unsigned long ua, unsigned int pageshift, unsigned long *hpa);
> > >  extern long mm_iommu_mapped_inc(struct mm_iommu_table_group_mem_t *mem);
> > >  extern void mm_iommu_mapped_dec(struct mm_iommu_table_group_mem_t *mem);
> > >  #endif
> > > diff --git a/arch/powerpc/kvm/book3s_64_vio.c 
> > > b/arch/powerpc/kvm/book3s_64_vio.c
> > > index d066e37..8c456fa 100644
> > > --- a/arch/powerpc/kvm/book3s_64_vio.c
> > > +++ b/arch/powerpc/kvm/book3s_64_vio.c
> > > @@ -449,7 +449,7 @@ long kvmppc_tce_iommu_do_map(struct kvm *kvm, struct 
> > > iommu_table *tbl,
> > >   /* This only handles v2 IOMMU type, v1 is handled via ioctl() */
> > >   

Re: [PATCH kernel v5 2/2] KVM: PPC: Check if IOMMU page is contained in the pinned physical page

2018-07-07 Thread Alexey Kardashevskiy
On Sat, 7 Jul 2018 21:43:03 +1000
Nicholas Piggin  wrote:

> On Sat,  7 Jul 2018 20:44:10 +1000
> Alexey Kardashevskiy  wrote:
> 
> > A VM which has:
> >  - a DMA capable device passed through to it (eg. network card);
> >  - running a malicious kernel that ignores H_PUT_TCE failure;
> >  - capability of using IOMMU pages bigger that physical pages
> > can create an IOMMU mapping that exposes (for example) 16MB of
> > the host physical memory to the device when only 64K was allocated to the 
> > VM.
> > 
> > The remaining 16MB - 64K will be some other content of host memory, possibly
> > including pages of the VM, but also pages of host kernel memory, host
> > programs or other VMs.
> > 
> > The attacking VM does not control the location of the page it can map,
> > and is only allowed to map as many pages as it has pages of RAM.
> > 
> > We already have a check in drivers/vfio/vfio_iommu_spapr_tce.c that
> > an IOMMU page is contained in the physical page so the PCI hardware won't
> > get access to unassigned host memory; however this check is missing in
> > the KVM fastpath (H_PUT_TCE accelerated code). We were lucky so far and
> > did not hit this yet as the very first time when the mapping happens
> > we do not have tbl::it_userspace allocated yet and fall back to
> > the userspace which in turn calls VFIO IOMMU driver, this fails and
> > the guest does not retry,
> > 
> > This stores the smallest preregistered page size in the preregistered
> > region descriptor and changes the mm_iommu_xxx API to check this against
> > the IOMMU page size. This calculates maximum page size as a minimum of
> > the natural region alignment and hugepagetlb's compound page size.
> > 
> > Signed-off-by: Alexey Kardashevskiy 
> > ---
> > Changes:
> > v5:
> > * only consider compound pages from hugetlbfs
> > 
> > v4:
> > * reimplemented max pageshift calculation
> > 
> > v3:
> > * fixed upper limit for the page size
> > * added checks that we don't register parts of a huge page
> > 
> > v2:
> > * explicitely check for compound pages before calling compound_order()
> > 
> > ---
> > The bug is: run QEMU _without_ hugepages (no -mempath) and tell it to
> > advertise 16MB pages to the guest; a typical pseries guest will use 16MB
> > for IOMMU pages without checking the mmu pagesize and this will fail
> > at 
> > https://git.qemu.org/?p=qemu.git;a=blob;f=hw/vfio/common.c;h=fb396cf00ac40eb35967a04c9cc798ca896eed57;hb=refs/heads/master#l256
> > 
> > With the change, mapping will fail in KVM and the guest will print:
> > 
> > mlx5_core :00:00.0: ibm,create-pe-dma-window(2027) 0 800 2000 
> > 18 1f returned 0 (liobn = 0x8001 starting addr = 800 0)
> > mlx5_core :00:00.0: created tce table LIOBN 0x8001 for 
> > /pci@8002000/ethernet@0
> > mlx5_core :00:00.0: failed to map direct window for 
> > /pci@8002000/ethernet@0: -1
> > ---
> >  arch/powerpc/include/asm/mmu_context.h |  4 ++--
> >  arch/powerpc/kvm/book3s_64_vio.c   |  2 +-
> >  arch/powerpc/kvm/book3s_64_vio_hv.c|  6 --
> >  arch/powerpc/mm/mmu_context_iommu.c| 33 
> > +++--
> >  drivers/vfio/vfio_iommu_spapr_tce.c|  2 +-
> >  5 files changed, 35 insertions(+), 12 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/mmu_context.h 
> > b/arch/powerpc/include/asm/mmu_context.h
> > index 896efa5..79d570c 100644
> > --- a/arch/powerpc/include/asm/mmu_context.h
> > +++ b/arch/powerpc/include/asm/mmu_context.h
> > @@ -35,9 +35,9 @@ extern struct mm_iommu_table_group_mem_t 
> > *mm_iommu_lookup_rm(
> >  extern struct mm_iommu_table_group_mem_t *mm_iommu_find(struct mm_struct 
> > *mm,
> > unsigned long ua, unsigned long entries);
> >  extern long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
> > -   unsigned long ua, unsigned long *hpa);
> > +   unsigned long ua, unsigned int pageshift, unsigned long *hpa);
> >  extern long mm_iommu_ua_to_hpa_rm(struct mm_iommu_table_group_mem_t *mem,
> > -   unsigned long ua, unsigned long *hpa);
> > +   unsigned long ua, unsigned int pageshift, unsigned long *hpa);
> >  extern long mm_iommu_mapped_inc(struct mm_iommu_table_group_mem_t *mem);
> >  extern void mm_iommu_mapped_dec(struct mm_iommu_table_group_mem_t *mem);
> >  #endif
> > diff --git a/arch/powerpc/kvm/book3s_64_vio.c 
> > b/arch/powerpc/kvm/book3s_64_vio.c
> > index d066e37..8c456fa 100644
> > --- a/arch/powerpc/kvm/book3s_64_vio.c
> > +++ b/arch/powerpc/kvm/book3s_64_vio.c
> > @@ -449,7 +449,7 @@ long kvmppc_tce_iommu_do_map(struct kvm *kvm, struct 
> > iommu_table *tbl,
> > /* This only handles v2 IOMMU type, v1 is handled via ioctl() */
> > return H_TOO_HARD;
> >  
> > -   if (WARN_ON_ONCE(mm_iommu_ua_to_hpa(mem, ua, )))
> > +   if (WARN_ON_ONCE(mm_iommu_ua_to_hpa(mem, ua, tbl->it_page_shift, )))
> > return H_HARDWARE;
> >  
> > if (mm_iommu_mapped_inc(mem))
> > diff --git 

Re: powerpc: 32BIT vs. 64BIT (PPC32 vs. PPC64)

2018-07-07 Thread Nicholas Piggin
On Fri, 6 Jul 2018 21:58:29 -0700
Randy Dunlap  wrote:

> On 07/06/2018 06:45 PM, Benjamin Herrenschmidt wrote:
> > On Thu, 2018-07-05 at 14:30 -0700, Randy Dunlap wrote:  
> >> Hi,
> >>
> >> Is there a good way (or a shortcut) to do something like:
> >>
> >> $ make ARCH=powerpc O=PPC32 [other_options] allmodconfig
> >>   to get a PPC32/32BIT allmodconfig
> >>
> >> and also be able to do:
> >>
> >> $make ARCH=powerpc O=PPC64 [other_options] allmodconfig
> >>   to get a PPC64/64BIT allmodconfig?  
> > 
> > Hrm... O= is for the separate build dir, so there much be something
> > else.
> > 
> > You mean having ARCH= aliases like ppc/ppc32 and ppc64 ?  
> 
> Yes.
> 
> > That would be a matter of overriding some .config defaults I suppose, I
> > don't know how this is done on other archs.
> > 
> > I see the aliasing trick in the Makefile but that's about it.
> >   
> >> Note that arch/x86, arch/sh, and arch/sparc have ways to do
> >> some flavor(s) of this (from Documentation/kbuild/kbuild.txt;
> >> sh and sparc based on a recent "fix" patch from me):  
> > 
> > I fail to see what you are actually talking about here ... sorry. Do
> > you have concrete examples on x86 or sparc ? From what I can tell the
> > "i386" or "sparc32/sparc64" aliases just change SRCARCH in Makefile and
> > 32 vs 64-bit is just a Kconfig option...  
> 
> Yes, your summary is mostly correct.
> 
> I'm just looking for a way to do cross-compile builds that are close to
> ppc32 allmodconfig and ppc64 allmodconfig.

Would there a problem with adding ARCH=ppc32 / ppc64 matching? This
seems to work...

Thanks,
Nick

---
 Makefile   | 8 
 arch/powerpc/Kconfig   | 9 +
 arch/powerpc/platforms/Kconfig.cputype | 8 
 3 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/Makefile b/Makefile
index c5ce55cbc543..f97204aed17a 100644
--- a/Makefile
+++ b/Makefile
@@ -345,6 +345,14 @@ ifeq ($(ARCH),sh64)
SRCARCH := sh
 endif
 
+# Additional ARCH settings for powerpc
+ifeq ($(ARCH),ppc32)
+   SRCARCH := powerpc
+endif
+ifeq ($(ARCH),ppc64)
+   SRCARCH := powerpc
+endif
+
 KCONFIG_CONFIG ?= .config
 export KCONFIG_CONFIG
 
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 9f2b75fe2c2d..3405b1b122be 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -1,4 +1,13 @@
 # SPDX-License-Identifier: GPL-2.0
+
+config PPC64
+   bool "64-bit kernel" if "$(ARCH)" = "powerpc"
+   default "$(ARCH)" != "ppc32"
+   select ZLIB_DEFLATE
+   help
+ This option selects whether a 32-bit or a 64-bit kernel
+ will be built.
+
 source "arch/powerpc/platforms/Kconfig.cputype"
 
 config PPC32
diff --git a/arch/powerpc/platforms/Kconfig.cputype 
b/arch/powerpc/platforms/Kconfig.cputype
index e6a1de521319..f6e5d6ef9782 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -1,12 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0
-config PPC64
-   bool "64-bit kernel"
-   default n
-   select ZLIB_DEFLATE
-   help
- This option selects whether a 32-bit or a 64-bit kernel
- will be built.
-
 menu "Processor support"
 choice
prompt "Processor Type"
-- 
2.17.0



Re: [PATCH kernel v5 2/2] KVM: PPC: Check if IOMMU page is contained in the pinned physical page

2018-07-07 Thread Nicholas Piggin
On Sat,  7 Jul 2018 20:44:10 +1000
Alexey Kardashevskiy  wrote:

> A VM which has:
>  - a DMA capable device passed through to it (eg. network card);
>  - running a malicious kernel that ignores H_PUT_TCE failure;
>  - capability of using IOMMU pages bigger that physical pages
> can create an IOMMU mapping that exposes (for example) 16MB of
> the host physical memory to the device when only 64K was allocated to the VM.
> 
> The remaining 16MB - 64K will be some other content of host memory, possibly
> including pages of the VM, but also pages of host kernel memory, host
> programs or other VMs.
> 
> The attacking VM does not control the location of the page it can map,
> and is only allowed to map as many pages as it has pages of RAM.
> 
> We already have a check in drivers/vfio/vfio_iommu_spapr_tce.c that
> an IOMMU page is contained in the physical page so the PCI hardware won't
> get access to unassigned host memory; however this check is missing in
> the KVM fastpath (H_PUT_TCE accelerated code). We were lucky so far and
> did not hit this yet as the very first time when the mapping happens
> we do not have tbl::it_userspace allocated yet and fall back to
> the userspace which in turn calls VFIO IOMMU driver, this fails and
> the guest does not retry,
> 
> This stores the smallest preregistered page size in the preregistered
> region descriptor and changes the mm_iommu_xxx API to check this against
> the IOMMU page size. This calculates maximum page size as a minimum of
> the natural region alignment and hugepagetlb's compound page size.
> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
> Changes:
> v5:
> * only consider compound pages from hugetlbfs
> 
> v4:
> * reimplemented max pageshift calculation
> 
> v3:
> * fixed upper limit for the page size
> * added checks that we don't register parts of a huge page
> 
> v2:
> * explicitely check for compound pages before calling compound_order()
> 
> ---
> The bug is: run QEMU _without_ hugepages (no -mempath) and tell it to
> advertise 16MB pages to the guest; a typical pseries guest will use 16MB
> for IOMMU pages without checking the mmu pagesize and this will fail
> at 
> https://git.qemu.org/?p=qemu.git;a=blob;f=hw/vfio/common.c;h=fb396cf00ac40eb35967a04c9cc798ca896eed57;hb=refs/heads/master#l256
> 
> With the change, mapping will fail in KVM and the guest will print:
> 
> mlx5_core :00:00.0: ibm,create-pe-dma-window(2027) 0 800 2000 18 
> 1f returned 0 (liobn = 0x8001 starting addr = 800 0)
> mlx5_core :00:00.0: created tce table LIOBN 0x8001 for 
> /pci@8002000/ethernet@0
> mlx5_core :00:00.0: failed to map direct window for 
> /pci@8002000/ethernet@0: -1
> ---
>  arch/powerpc/include/asm/mmu_context.h |  4 ++--
>  arch/powerpc/kvm/book3s_64_vio.c   |  2 +-
>  arch/powerpc/kvm/book3s_64_vio_hv.c|  6 --
>  arch/powerpc/mm/mmu_context_iommu.c| 33 +++--
>  drivers/vfio/vfio_iommu_spapr_tce.c|  2 +-
>  5 files changed, 35 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/mmu_context.h 
> b/arch/powerpc/include/asm/mmu_context.h
> index 896efa5..79d570c 100644
> --- a/arch/powerpc/include/asm/mmu_context.h
> +++ b/arch/powerpc/include/asm/mmu_context.h
> @@ -35,9 +35,9 @@ extern struct mm_iommu_table_group_mem_t 
> *mm_iommu_lookup_rm(
>  extern struct mm_iommu_table_group_mem_t *mm_iommu_find(struct mm_struct *mm,
>   unsigned long ua, unsigned long entries);
>  extern long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
> - unsigned long ua, unsigned long *hpa);
> + unsigned long ua, unsigned int pageshift, unsigned long *hpa);
>  extern long mm_iommu_ua_to_hpa_rm(struct mm_iommu_table_group_mem_t *mem,
> - unsigned long ua, unsigned long *hpa);
> + unsigned long ua, unsigned int pageshift, unsigned long *hpa);
>  extern long mm_iommu_mapped_inc(struct mm_iommu_table_group_mem_t *mem);
>  extern void mm_iommu_mapped_dec(struct mm_iommu_table_group_mem_t *mem);
>  #endif
> diff --git a/arch/powerpc/kvm/book3s_64_vio.c 
> b/arch/powerpc/kvm/book3s_64_vio.c
> index d066e37..8c456fa 100644
> --- a/arch/powerpc/kvm/book3s_64_vio.c
> +++ b/arch/powerpc/kvm/book3s_64_vio.c
> @@ -449,7 +449,7 @@ long kvmppc_tce_iommu_do_map(struct kvm *kvm, struct 
> iommu_table *tbl,
>   /* This only handles v2 IOMMU type, v1 is handled via ioctl() */
>   return H_TOO_HARD;
>  
> - if (WARN_ON_ONCE(mm_iommu_ua_to_hpa(mem, ua, )))
> + if (WARN_ON_ONCE(mm_iommu_ua_to_hpa(mem, ua, tbl->it_page_shift, )))
>   return H_HARDWARE;
>  
>   if (mm_iommu_mapped_inc(mem))
> diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c 
> b/arch/powerpc/kvm/book3s_64_vio_hv.c
> index 925fc31..5b298f5 100644
> --- a/arch/powerpc/kvm/book3s_64_vio_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_vio_hv.c
> @@ -279,7 +279,8 @@ static long kvmppc_rm_tce_iommu_do_map(struct kvm 

[PATCH kernel v5 1/2] vfio/spapr: Use IOMMU pageshift rather than pagesize

2018-07-07 Thread Alexey Kardashevskiy
The size is always equal to 1 page so let's use this. Later on this will
be used for other checks which use page shifts to check the granularity
of access.

This should cause no behavioral change.

Reviewed-by: David Gibson 
Acked-by: Alex Williamson 
Signed-off-by: Alexey Kardashevskiy 
---

As Alex suggested, this should go via the ppc tree which the next patch
is going to (which is ppc-kvm).
---
 drivers/vfio/vfio_iommu_spapr_tce.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c 
b/drivers/vfio/vfio_iommu_spapr_tce.c
index 759a5bd..2da5f05 100644
--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -457,13 +457,13 @@ static void tce_iommu_unuse_page(struct tce_container 
*container,
 }
 
 static int tce_iommu_prereg_ua_to_hpa(struct tce_container *container,
-   unsigned long tce, unsigned long size,
+   unsigned long tce, unsigned long shift,
unsigned long *phpa, struct mm_iommu_table_group_mem_t **pmem)
 {
long ret = 0;
struct mm_iommu_table_group_mem_t *mem;
 
-   mem = mm_iommu_lookup(container->mm, tce, size);
+   mem = mm_iommu_lookup(container->mm, tce, 1ULL << shift);
if (!mem)
return -EINVAL;
 
@@ -487,7 +487,7 @@ static void tce_iommu_unuse_page_v2(struct tce_container 
*container,
if (!pua)
return;
 
-   ret = tce_iommu_prereg_ua_to_hpa(container, *pua, IOMMU_PAGE_SIZE(tbl),
+   ret = tce_iommu_prereg_ua_to_hpa(container, *pua, tbl->it_page_shift,
, );
if (ret)
pr_debug("%s: tce %lx at #%lx was not cached, ret=%d\n",
@@ -611,7 +611,7 @@ static long tce_iommu_build_v2(struct tce_container 
*container,
entry + i);
 
ret = tce_iommu_prereg_ua_to_hpa(container,
-   tce, IOMMU_PAGE_SIZE(tbl), , );
+   tce, tbl->it_page_shift, , );
if (ret)
break;
 
-- 
2.11.0



[PATCH kernel v5 2/2] KVM: PPC: Check if IOMMU page is contained in the pinned physical page

2018-07-07 Thread Alexey Kardashevskiy
A VM which has:
 - a DMA capable device passed through to it (eg. network card);
 - running a malicious kernel that ignores H_PUT_TCE failure;
 - capability of using IOMMU pages bigger that physical pages
can create an IOMMU mapping that exposes (for example) 16MB of
the host physical memory to the device when only 64K was allocated to the VM.

The remaining 16MB - 64K will be some other content of host memory, possibly
including pages of the VM, but also pages of host kernel memory, host
programs or other VMs.

The attacking VM does not control the location of the page it can map,
and is only allowed to map as many pages as it has pages of RAM.

We already have a check in drivers/vfio/vfio_iommu_spapr_tce.c that
an IOMMU page is contained in the physical page so the PCI hardware won't
get access to unassigned host memory; however this check is missing in
the KVM fastpath (H_PUT_TCE accelerated code). We were lucky so far and
did not hit this yet as the very first time when the mapping happens
we do not have tbl::it_userspace allocated yet and fall back to
the userspace which in turn calls VFIO IOMMU driver, this fails and
the guest does not retry,

This stores the smallest preregistered page size in the preregistered
region descriptor and changes the mm_iommu_xxx API to check this against
the IOMMU page size. This calculates maximum page size as a minimum of
the natural region alignment and hugepagetlb's compound page size.

Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v5:
* only consider compound pages from hugetlbfs

v4:
* reimplemented max pageshift calculation

v3:
* fixed upper limit for the page size
* added checks that we don't register parts of a huge page

v2:
* explicitely check for compound pages before calling compound_order()

---
The bug is: run QEMU _without_ hugepages (no -mempath) and tell it to
advertise 16MB pages to the guest; a typical pseries guest will use 16MB
for IOMMU pages without checking the mmu pagesize and this will fail
at 
https://git.qemu.org/?p=qemu.git;a=blob;f=hw/vfio/common.c;h=fb396cf00ac40eb35967a04c9cc798ca896eed57;hb=refs/heads/master#l256

With the change, mapping will fail in KVM and the guest will print:

mlx5_core :00:00.0: ibm,create-pe-dma-window(2027) 0 800 2000 18 1f 
returned 0 (liobn = 0x8001 starting addr = 800 0)
mlx5_core :00:00.0: created tce table LIOBN 0x8001 for 
/pci@8002000/ethernet@0
mlx5_core :00:00.0: failed to map direct window for 
/pci@8002000/ethernet@0: -1
---
 arch/powerpc/include/asm/mmu_context.h |  4 ++--
 arch/powerpc/kvm/book3s_64_vio.c   |  2 +-
 arch/powerpc/kvm/book3s_64_vio_hv.c|  6 --
 arch/powerpc/mm/mmu_context_iommu.c| 33 +++--
 drivers/vfio/vfio_iommu_spapr_tce.c|  2 +-
 5 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h 
b/arch/powerpc/include/asm/mmu_context.h
index 896efa5..79d570c 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -35,9 +35,9 @@ extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup_rm(
 extern struct mm_iommu_table_group_mem_t *mm_iommu_find(struct mm_struct *mm,
unsigned long ua, unsigned long entries);
 extern long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
-   unsigned long ua, unsigned long *hpa);
+   unsigned long ua, unsigned int pageshift, unsigned long *hpa);
 extern long mm_iommu_ua_to_hpa_rm(struct mm_iommu_table_group_mem_t *mem,
-   unsigned long ua, unsigned long *hpa);
+   unsigned long ua, unsigned int pageshift, unsigned long *hpa);
 extern long mm_iommu_mapped_inc(struct mm_iommu_table_group_mem_t *mem);
 extern void mm_iommu_mapped_dec(struct mm_iommu_table_group_mem_t *mem);
 #endif
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index d066e37..8c456fa 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -449,7 +449,7 @@ long kvmppc_tce_iommu_do_map(struct kvm *kvm, struct 
iommu_table *tbl,
/* This only handles v2 IOMMU type, v1 is handled via ioctl() */
return H_TOO_HARD;
 
-   if (WARN_ON_ONCE(mm_iommu_ua_to_hpa(mem, ua, )))
+   if (WARN_ON_ONCE(mm_iommu_ua_to_hpa(mem, ua, tbl->it_page_shift, )))
return H_HARDWARE;
 
if (mm_iommu_mapped_inc(mem))
diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c 
b/arch/powerpc/kvm/book3s_64_vio_hv.c
index 925fc31..5b298f5 100644
--- a/arch/powerpc/kvm/book3s_64_vio_hv.c
+++ b/arch/powerpc/kvm/book3s_64_vio_hv.c
@@ -279,7 +279,8 @@ static long kvmppc_rm_tce_iommu_do_map(struct kvm *kvm, 
struct iommu_table *tbl,
if (!mem)
return H_TOO_HARD;
 
-   if (WARN_ON_ONCE_RM(mm_iommu_ua_to_hpa_rm(mem, ua, )))
+   if (WARN_ON_ONCE_RM(mm_iommu_ua_to_hpa_rm(mem, ua, tbl->it_page_shift,
+   )))

[PATCH kernel v5 0/2] KVM: PPC: Check if IOMMU page is contained in the pinned physical page

2018-07-07 Thread Alexey Kardashevskiy
This is to improve page boundaries checking and should probably
be cc:stable. I came accross this while debugging nvlink2 passthrough
but the lack of checking might be exploited by the existing userspace.

The get_user_pages() comment says it should be "phased out" but the only
alternative seems to be get_user_pages_longterm(), should that be used
instead (this is longterm reference elevation, however it is not DAX,
whatever this implies)? get_user_pages_remote() seems unnecessarily
complicated because of @locked.


Changes:
v5:
* 2/2: changed compound pages handling

v4:
* 2/2: implemented less strict but still safe max pageshift as David suggested

v3:
* enforced huge pages not to cross preregistered chunk boundaries

v2:
* 2/2: explicitly check for compound pages before calling compound_order()



This is based on sha1
021c917 Linus Torvalds "Linux 4.18-rc3".

Please comment. Thanks.



Alexey Kardashevskiy (2):
  vfio/spapr: Use IOMMU pageshift rather than pagesize
  KVM: PPC: Check if IOMMU page is contained in the pinned physical page

 arch/powerpc/include/asm/mmu_context.h |  4 ++--
 arch/powerpc/kvm/book3s_64_vio.c   |  2 +-
 arch/powerpc/kvm/book3s_64_vio_hv.c|  6 --
 arch/powerpc/mm/mmu_context_iommu.c| 33 +++--
 drivers/vfio/vfio_iommu_spapr_tce.c| 10 +-
 5 files changed, 39 insertions(+), 16 deletions(-)

-- 
2.11.0



[PATCH] KVM: PPC: Book3S HV: fix constant size warning

2018-07-07 Thread Nicholas Mc Guire
 The constants are 64bit but not explicitly declared UL resulting
in sparse warnings. Fixed by declaring the constants UL.

Signed-off-by: Nicholas Mc Guire 
---

sparse fallout from compile checking book3s_hv.c:
arch/powerpc/kvm/book3s_hv.c:141:9: warning: constant 0x164520C62609AECA is so 
big it is long
arch/powerpc/kvm/book3s_hv.c:142:9: warning: constant 0x164520C62609AECA is so 
big it is long
arch/powerpc/kvm/book3s_hv.c:143:9: warning: constant 0x7FFF2908450D8DA9 is so 
big it is long
arch/powerpc/kvm/book3s_hv.c:144:9: warning: constant 0x164520C62609AECA is so 
big it is long
arch/powerpc/kvm/book3s_hv.c:145:9: warning: constant 0x199A421245058DA9 is so 
big it is long
arch/powerpc/kvm/book3s_hv.c:146:9: warning: constant 0x164520C62609AECA is so 
big it is long
arch/powerpc/kvm/book3s_hv.c:147:9: warning: constant 0x164520C62609AECA is so 
big it is long
arch/powerpc/kvm/book3s_hv.c:148:9: warning: constant 0x164520C62609AECA is so 
big it is long
arch/powerpc/kvm/book3s_hv.c:149:9: warning: constant 0x164520C62609AECA is so 
big it is long
arch/powerpc/kvm/book3s_hv.c:1667:60: warning: constant 0xf3ff is 
so big it is unsigned long
arch/powerpc/kvm/book3s_hv.c:2896:42: warning: constant 0x164520C62609AECA is 
so big it is long

most of the warnings are local constants in book3s_hv.c, PSSCR_GUEST_VIS
from reg.h is not. Although PSSCR_GUEST_VIS is not #ifdef CONFIG_PPC_BOOK3S
it seems only to be used in that particular platform (check with cscope).

The constants fit in the target variables so the size declaration 
discrepancy does not actually cause a problem here.

Patch was compiletested with: ppc64_defconfig (implies
CONFIG_KVM=y, CONFIG_KVM_BOOK3S_64_HV=m)

Patch is against 4.18-rc3 (localversion-next is next-20180705)

 arch/powerpc/include/asm/reg.h |  2 +-
 arch/powerpc/kvm/book3s_hv.c   | 16 
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 5625684..858aa79 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -161,7 +161,7 @@
 #define PSSCR_ESL  0x0020 /* Enable State Loss */
 #define PSSCR_SD   0x0040 /* Status Disable */
 #define PSSCR_PLS  0xf000 /* Power-saving Level Status */
-#define PSSCR_GUEST_VIS0xf3ff /* Guest-visible PSSCR 
fields */
+#define PSSCR_GUEST_VIS0xf3ffUL /* Guest-visible PSSCR 
fields */
 #define PSSCR_FAKE_SUSPEND 0x0400 /* Fake-suspend bit (P9 DD2.2) */
 #define PSSCR_FAKE_SUSPEND_LG  10 /* Fake-suspend bit position */
 
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index ee4a885..c517859 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -128,14 +128,14 @@ static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu 
*vcpu);
  * and SPURR count and should be set according to the number of
  * online threads in the vcore being run.
  */
-#define RWMR_RPA_P8_1THREAD0x164520C62609AECA
-#define RWMR_RPA_P8_2THREAD0x7FFF2908450D8DA9
-#define RWMR_RPA_P8_3THREAD0x164520C62609AECA
-#define RWMR_RPA_P8_4THREAD0x199A421245058DA9
-#define RWMR_RPA_P8_5THREAD0x164520C62609AECA
-#define RWMR_RPA_P8_6THREAD0x164520C62609AECA
-#define RWMR_RPA_P8_7THREAD0x164520C62609AECA
-#define RWMR_RPA_P8_8THREAD0x164520C62609AECA
+#define RWMR_RPA_P8_1THREAD0x164520C62609AECAUL
+#define RWMR_RPA_P8_2THREAD0x7FFF2908450D8DA9UL
+#define RWMR_RPA_P8_3THREAD0x164520C62609AECAUL
+#define RWMR_RPA_P8_4THREAD0x199A421245058DA9UL
+#define RWMR_RPA_P8_5THREAD0x164520C62609AECAUL
+#define RWMR_RPA_P8_6THREAD0x164520C62609AECAUL
+#define RWMR_RPA_P8_7THREAD0x164520C62609AECAUL
+#define RWMR_RPA_P8_8THREAD0x164520C62609AECAUL
 
 static unsigned long p8_rwmr_values[MAX_SMT_THREADS + 1] = {
RWMR_RPA_P8_1THREAD,
-- 
2.1.4



[PATCH] KVM: PPC: Book3S HV: add of_node_put() in success path

2018-07-07 Thread Nicholas Mc Guire
 The call to of_find_compatible_node() is returning a pointer with
incremented refcount so it must be explicitly decremented after the
last use. As here it is only being used for checking of node presence
but the result is not actually used in the success path it can be
dropped immediately.

Signed-off-by: Nicholas Mc Guire 
Fixes: commit f725758b899f ("KVM: PPC: Book3S HV: Use OPAL XICS emulation on 
POWER9")
---
Problem found by experimental coccinelle script

Patch was compiletested with: ppc64_defconfig (implies CONFIG_KVM=y,
CONFIG_KVM_BOOK3S_64_HV=m)
with many sparse warnings though not related to the proposed change

Patch is against 4.18-rc3 (localversion-next is next-20180705)

 arch/powerpc/kvm/book3s_hv.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index ee4a885..8680fb9 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4561,6 +4561,8 @@ static int kvmppc_book3s_init_hv(void)
pr_err("KVM-HV: Cannot determine method for accessing 
XICS\n");
return -ENODEV;
}
+   /* presence of intc confirmed - node can be dropped again */
+   of_node_put(np);
}
 #endif
 
-- 
2.1.4