[PATCH kernel v2 2/2] powerpc/mm/iommu: Put pages on process exit

2016-07-19 Thread Alexey Kardashevskiy
albir Singh <bsinghar...@gmail.com> Cc: Nicholas Piggin <npig...@gmail.com> Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> --- arch/powerpc/include/asm/mmu_context.h | 1 - arch/powerpc/mm/mmu_context_book3s64.c | 4 --- arch/powerpc/mm/mmu_context_iommu.c| 10 --- driv

[PATCH kernel v2 1/2] powerpc/iommu: Stop using @current in mm_iommu_xxx

2016-07-19 Thread Alexey Kardashevskiy
ive mm_struct instead of using one from @current. This is needed by the following patch to do proper cleanup in time. This depends on "powerpc/powernv/ioda: Fix endianness when reading TCEs" to do proper cleanup via tce_iommu_clear() patch. This should cause no behavioral change. Signed-o

[PATCH kernel v2 2/2] powerpc/mm/iommu: Put pages on process exit

2016-07-19 Thread Alexey Kardashevskiy
. As tce_iommu_register_pages/tce_iommu_unregister_pages are called under container->lock, this does not need additional locking. Cc: David Gibson Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Balbir Singh Cc: Nicholas Piggin Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/incl

[PATCH kernel v2 0/2] powerpc/mm/iommu: Put pages on process exit

2016-07-19 Thread Alexey Kardashevskiy
This fixes a bug where guest memory stays Active after the QEMU process has exited. It happened because the QEMU memory context was not released promptly after the QEMU process exited. More details are in the commit logs. Please comment. Thanks. Alexey Kardashevskiy (2): powerpc

Re: [PATCH kernel] powerpc/mm/iommu: Put pages on process exit

2016-07-14 Thread Alexey Kardashevskiy
On 14/07/16 21:52, Alexey Kardashevskiy wrote: > On 14/07/16 20:12, Balbir Singh wrote: >> On Thu, Jul 14, 2016 at 3:16 PM, Alexey Kardashevskiy <a...@ozlabs.ru> wrote: >>> At the moment VFIO IOMMU SPAPR v2 driver pins all guest RAM pages when >>> the

Re: [PATCH kernel] powerpc/mm/iommu: Put pages on process exit

2016-07-14 Thread Alexey Kardashevskiy
On 14/07/16 21:52, Alexey Kardashevskiy wrote: > On 14/07/16 20:12, Balbir Singh wrote: >> On Thu, Jul 14, 2016 at 3:16 PM, Alexey Kardashevskiy wrote: >>> At the moment VFIO IOMMU SPAPR v2 driver pins all guest RAM pages when >>> the userspace starts using VFIO.

Re: [PATCH kernel] powerpc/mm/iommu: Put pages on process exit

2016-07-14 Thread Alexey Kardashevskiy
On 14/07/16 20:12, Balbir Singh wrote: > On Thu, Jul 14, 2016 at 3:16 PM, Alexey Kardashevskiy <a...@ozlabs.ru> wrote: >> At the moment VFIO IOMMU SPAPR v2 driver pins all guest RAM pages when >> the userspace starts using VFIO. When the userspace process finishes, >>

Re: [PATCH kernel] powerpc/mm/iommu: Put pages on process exit

2016-07-14 Thread Alexey Kardashevskiy
On 14/07/16 20:12, Balbir Singh wrote: > On Thu, Jul 14, 2016 at 3:16 PM, Alexey Kardashevskiy wrote: >> At the moment VFIO IOMMU SPAPR v2 driver pins all guest RAM pages when >> the userspace starts using VFIO. When the userspace process finishes, >> all the pinned

[PATCH kernel] powerpc/mm/iommu: Put pages on process exit

2016-07-13 Thread Alexey Kardashevskiy
Cc: Benjamin Herrenschmidt <b...@kernel.crashing.org> Cc: Paul Mackerras <pau...@samba.org> Cc: Balbir Singh <bsinghar...@gmail.com> Cc: Nick Piggin <npig...@kernel.dk> Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> --- arch/powerpc/include/asm/mmu_context.h | 3 +++

[PATCH kernel] powerpc/mm/iommu: Put pages on process exit

2016-07-13 Thread Alexey Kardashevskiy
kerras Cc: Balbir Singh Cc: Nick Piggin Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/include/asm/mmu_context.h | 3 +++ arch/powerpc/mm/mmu_context_book3s64.c | 4 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/i

Re: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-06 Thread Alexey Kardashevskiy
On 05/06/2016 01:05 AM, Alex Williamson wrote: On Thu, 5 May 2016 12:15:46 + "Tian, Kevin" wrote: From: Yongji Xie [mailto:xyj...@linux.vnet.ibm.com] Sent: Thursday, May 05, 2016 7:43 PM Hi David and Kevin, On 2016/5/5 17:54, David Laight wrote: From: Tian, Kevin

Re: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

2016-05-06 Thread Alexey Kardashevskiy
On 05/06/2016 01:05 AM, Alex Williamson wrote: On Thu, 5 May 2016 12:15:46 + "Tian, Kevin" wrote: From: Yongji Xie [mailto:xyj...@linux.vnet.ibm.com] Sent: Thursday, May 05, 2016 7:43 PM Hi David and Kevin, On 2016/5/5 17:54, David Laight wrote: From: Tian, Kevin Sent: 05 May 2016

Re: [PATCH 4/5] pci-ioda: Set PCI_BUS_FLAGS_MSI_REMAP for IODA host bridge

2016-05-06 Thread Alexey Kardashevskiy
On 04/27/2016 10:43 PM, Yongji Xie wrote: Any IODA host bridge has the capability of IRQ remapping. So we set PCI_BUS_FLAGS_MSI_REMAP when this kind of host bridge is detected. Signed-off-by: Yongji Xie <xyj...@linux.vnet.ibm.com> Reviewed-by: Alexey Kardashevskiy <a...@

Re: [PATCH 4/5] pci-ioda: Set PCI_BUS_FLAGS_MSI_REMAP for IODA host bridge

2016-05-06 Thread Alexey Kardashevskiy
On 04/27/2016 10:43 PM, Yongji Xie wrote: Any IODA host bridge has the capability of IRQ remapping. So we set PCI_BUS_FLAGS_MSI_REMAP when this kind of host bridge is detected. Signed-off-by: Yongji Xie Reviewed-by: Alexey Kardashevskiy --- arch/powerpc/platforms/powernv/pci-ioda.c

Re: [PATCH kernel v4 11/11] powerpc/powernv/npu: Enable NVLink pass through

2016-05-04 Thread Alexey Kardashevskiy
On 05/04/2016 12:08 AM, Alistair Popple wrote: Hi Alexey, On Fri, 29 Apr 2016 18:55:24 Alexey Kardashevskiy wrote: IBM POWER8 NVlink systems come with Tesla K40-ish GPUs each of which also has a couple of fast speed links (NVLink). The interface to links is exposed as an emulated PCI bridge

Re: [PATCH kernel v4 10/11] powerpc/powernv/npu: Rework TCE Kill handling

2016-05-04 Thread Alexey Kardashevskiy
On 05/03/2016 05:37 PM, Alistair Popple wrote: On Fri, 29 Apr 2016 18:55:23 Alexey Kardashevskiy wrote: The pnv_ioda_pe struct keeps an array of peers. At the moment it is only used to link GPU and NPU for 2 purposes: 1. Access NPU quickly when configuring DMA for GPU - this was addressed

[PATCH kernel v4 06/11] powerpc/powernv/npu: Use the correct IOMMU page size

2016-04-29 Thread Alexey Kardashevskiy
This uses the page size from iommu_table instead of hard-coded 4K. This should cause no change in behavior. While we are here, move bits around to prepare for further rework which will define and use iommu_table_group_ops. Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> Reviewed-by:

[PATCH kernel v4 06/11] powerpc/powernv/npu: Use the correct IOMMU page size

2016-04-29 Thread Alexey Kardashevskiy
This uses the page size from iommu_table instead of hard-coded 4K. This should cause no change in behavior. While we are here, move bits around to prepare for further rework which will define and use iommu_table_group_ops. Signed-off-by: Alexey Kardashevskiy Reviewed-by: David Gibson Reviewed

[PATCH kernel v4 09/11] powerpc/powernv/npu: Add set/unset window helpers

2016-04-29 Thread Alexey Kardashevskiy
the hardware. This does not make difference now as the caller - pnv_npu_dma_set_bypass() - enables bypass in the hardware but the next patch will use it to manage TCE table lists for TCE Kill handling. Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> --- arch/powerpc/platforms/powernv/npu

[PATCH kernel v4 09/11] powerpc/powernv/npu: Add set/unset window helpers

2016-04-29 Thread Alexey Kardashevskiy
the hardware. This does not make difference now as the caller - pnv_npu_dma_set_bypass() - enables bypass in the hardware but the next patch will use it to manage TCE table lists for TCE Kill handling. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/platforms/powernv/npu-dma.c | 65

[PATCH kernel v4 10/11] powerpc/powernv/npu: Rework TCE Kill handling

2016-04-29 Thread Alexey Kardashevskiy
as they are not needed anymore. While we are here, add TCE cache invalidation after enabling bypass. Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> --- Changes: v4: * reworked as "powerpc/powernv/npu: Add set/unset window helpers" has been added --- arch/powerpc/platforms/powernv

[PATCH kernel v4 10/11] powerpc/powernv/npu: Rework TCE Kill handling

2016-04-29 Thread Alexey Kardashevskiy
as they are not needed anymore. While we are here, add TCE cache invalidation after enabling bypass. Signed-off-by: Alexey Kardashevskiy --- Changes: v4: * reworked as "powerpc/powernv/npu: Add set/unset window helpers" has been added --- arch/powerpc/platforms/powernv/npu-d

[PATCH kernel v4 00/11] powerpc/powernv/npu: Enable PCI pass through for NVLink

2016-04-29 Thread Alexey Kardashevskiy
r: Relax the IOMMU compatibility check" to proceed. Alexey Kardashevskiy (11): vfio_pci: Test for extended capabilities if config space > 256 bytes vfio/spapr: Relax the IOMMU compatibility check powerpc/powernv: Rename pnv_pci_ioda2_tce_invalidate_entire powerpc/powernv: Define

[PATCH kernel v4 07/11] powerpc/powernv/npu: Simplify DMA setup

2016-04-29 Thread Alexey Kardashevskiy
is called on GPU and that will do the NPU DMA configuration. This removes phb->dma_dev_setup initialization for NPU as pnv_pci_ioda_dma_dev_setup is no-op for it anyway. This stops using npe->tce_bypass_base as it never changes and values other than zero are not supported. Signed-off-by: Alex

[PATCH kernel v4 08/11] powerpc/powernv/ioda2: Export debug helper pe_level_printk()

2016-04-29 Thread Alexey Kardashevskiy
This exports debugging helper pe_level_printk() and corresponding macros so they can be used in npu-dma.c. Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> --- arch/powerpc/platforms/powernv/pci-ioda.c | 9 + arch/powerpc/platforms/powernv/pci.h | 9 + 2 files c

[PATCH kernel v4 08/11] powerpc/powernv/ioda2: Export debug helper pe_level_printk()

2016-04-29 Thread Alexey Kardashevskiy
This exports debugging helper pe_level_printk() and corresponding macros so they can be used in npu-dma.c. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/platforms/powernv/pci-ioda.c | 9 + arch/powerpc/platforms/powernv/pci.h | 9 + 2 files changed, 10 insertions

[PATCH kernel v4 11/11] powerpc/powernv/npu: Enable NVLink pass through

2016-04-29 Thread Alexey Kardashevskiy
-NPU IOMMU group, this makes the helpers public and adds the DMA window number parameter. Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> --- Changes: v4: * reused pnv_npu_set_window/pnv_npu_unset_window where possible * added comments, changed commit log v3: * moved NPU-to-GPU IOMMU grou

[PATCH kernel v4 11/11] powerpc/powernv/npu: Enable NVLink pass through

2016-04-29 Thread Alexey Kardashevskiy
-NPU IOMMU group, this makes the helpers public and adds the DMA window number parameter. Signed-off-by: Alexey Kardashevskiy --- Changes: v4: * reused pnv_npu_set_window/pnv_npu_unset_window where possible * added comments, changed commit log v3: * moved NPU-to-GPU IOMMU grouping later afte

[PATCH kernel v4 04/11] powerpc/powernv: Define TCE Kill flags

2016-04-29 Thread Alexey Kardashevskiy
This replaces magic constants for TCE Kill IODA2 register with macros. Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> Reviewed-by: David Gibson <da...@gibson.dropbear.id.au> --- arch/powerpc/platforms/powernv/pci-ioda.c | 7 +-- 1 file changed, 5 insertions(+), 2 deleti

[PATCH kernel v4 04/11] powerpc/powernv: Define TCE Kill flags

2016-04-29 Thread Alexey Kardashevskiy
This replaces magic constants for TCE Kill IODA2 register with macros. Signed-off-by: Alexey Kardashevskiy Reviewed-by: David Gibson --- arch/powerpc/platforms/powernv/pci-ioda.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/platforms/powernv/pci

[PATCH kernel v4 01/11] vfio_pci: Test for extended capabilities if config space > 256 bytes

2016-04-29 Thread Alexey Kardashevskiy
ies if the discovered config size is more than 256 bytes. Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> --- Changes: v2: * instead of checking for 0x, this only does the check if device's config size is big enough --- drivers/vfio/pci/vfio_pci_config.c | 17 +++-- 1 file ch

[PATCH kernel v4 02/11] vfio/spapr: Relax the IOMMU compatibility check

2016-04-29 Thread Alexey Kardashevskiy
We are going to have multiple different types of PHB on the same system with POWER8 + NVLink and PHBs will have different IOMMU ops. However we only really care about one callback - create_table - so we can relax the compatibility check here. Signed-off-by: Alexey Kardashevskiy <a...@ozlabs

[PATCH kernel v4 01/11] vfio_pci: Test for extended capabilities if config space > 256 bytes

2016-04-29 Thread Alexey Kardashevskiy
ies if the discovered config size is more than 256 bytes. Signed-off-by: Alexey Kardashevskiy --- Changes: v2: * instead of checking for 0x, this only does the check if device's config size is big enough --- drivers/vfio/pci/vfio_pci_config.c | 17 +++-- 1 file changed, 11 insertions(+)

[PATCH kernel v4 02/11] vfio/spapr: Relax the IOMMU compatibility check

2016-04-29 Thread Alexey Kardashevskiy
We are going to have multiple different types of PHB on the same system with POWER8 + NVLink and PHBs will have different IOMMU ops. However we only really care about one callback - create_table - so we can relax the compatibility check here. Signed-off-by: Alexey Kardashevskiy Reviewed

[PATCH kernel v4 05/11] powerpc/powernv/npu: TCE Kill helpers cleanup

2016-04-29 Thread Alexey Kardashevskiy
the entire cache, this uses pnv_pci_ioda2_tce_invalidate_entire() directly for NPU. This adds an explicit comment for workaround for invalidating NPU TCE cache. Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> Reviewed-by: David Gibson <da...@gibson.dropbear.id.au> Reviewed-by: Alistair P

[PATCH kernel v4 05/11] powerpc/powernv/npu: TCE Kill helpers cleanup

2016-04-29 Thread Alexey Kardashevskiy
the entire cache, this uses pnv_pci_ioda2_tce_invalidate_entire() directly for NPU. This adds an explicit comment for workaround for invalidating NPU TCE cache. Signed-off-by: Alexey Kardashevskiy Reviewed-by: David Gibson Reviewed-by: Alistair Popple --- arch/powerpc/platforms/powernv/npu-dma.c | 41

[PATCH kernel v4 03/11] powerpc/powernv: Rename pnv_pci_ioda2_tce_invalidate_entire

2016-04-29 Thread Alexey Kardashevskiy
As in fact pnv_pci_ioda2_tce_invalidate_entire() invalidates TCEs for the specific PE rather than the entire cache, rename it to pnv_pci_ioda2_tce_invalidate_pe(). In later patches we will add a proper pnv_pci_ioda2_tce_invalidate_entire(). Signed-off-by: Alexey Kardashevskiy <a...@ozlabs

[PATCH kernel v4 03/11] powerpc/powernv: Rename pnv_pci_ioda2_tce_invalidate_entire

2016-04-29 Thread Alexey Kardashevskiy
As in fact pnv_pci_ioda2_tce_invalidate_entire() invalidates TCEs for the specific PE rather than the entire cache, rename it to pnv_pci_ioda2_tce_invalidate_pe(). In later patches we will add a proper pnv_pci_ioda2_tce_invalidate_entire(). Signed-off-by: Alexey Kardashevskiy Reviewed-by: David

[PATCH kernel v2] vfio_pci: Test for extended capabilities if config space > 256 bytes

2016-04-28 Thread Alexey Kardashevskiy
ies if the discovered config size is more than 256 bytes. Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> --- Changes: v2: * instead of checking for 0x, this only does the check if device's config size is big enough --- drivers/vfio/pci/vfio_pci_config.c | 17 +++-- 1 file ch

[PATCH kernel v2] vfio_pci: Test for extended capabilities if config space > 256 bytes

2016-04-28 Thread Alexey Kardashevskiy
ies if the discovered config size is more than 256 bytes. Signed-off-by: Alexey Kardashevskiy --- Changes: v2: * instead of checking for 0x, this only does the check if device's config size is big enough --- drivers/vfio/pci/vfio_pci_config.c | 17 +++-- 1 file changed, 11 insertions(+)

[PATCH kernel] vfio_pci: Make extended capabilities test more robust

2016-04-28 Thread Alexey Kardashevskiy
out extended config space. This adds an additional check that config space read returned non-zero and non- value. Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> --- drivers/vfio/pci/vfio_pci_config.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers

[PATCH kernel] vfio_pci: Make extended capabilities test more robust

2016-04-28 Thread Alexey Kardashevskiy
out extended config space. This adds an additional check that config space read returned non-zero and non- value. Signed-off-by: Alexey Kardashevskiy --- drivers/vfio/pci/vfio_pci_config.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/vfio/pci/vfio_pci_confi

Re: [RFC v6 04/10] PCI: Add support for enforcing all MMIO BARs to be page aligned

2016-04-25 Thread Alexey Kardashevskiy
On 04/18/2016 08:56 PM, Yongji Xie wrote: When vfio passes through a PCI device whose MMIO BARs are smaller than PAGE_SIZE, the guest will not handle the MMIO accesses to the BARs, which leads to MMIO emulation in the host. This is because vfio will not allow to passthrough one BAR's mmio page which

Re: [RFC PATCH kernel] Revert "net/mlx4_core: Set UAR page size to 4KB regardless of system page size"

2016-03-19 Thread Alexey Kardashevskiy
On 03/16/2016 08:45 PM, Or Gerlitz wrote: On Wed, Mar 16, 2016 at 10:34 AM, Alexey Kardashevskiy <a...@ozlabs.ru> wrote: Oh. ok. It also looks like even with the reverted patch, mlx4 VF does not work in a guest: So where is the breakage point for you? Does 4.4 work? If not, what?

Re: [RFC PATCH kernel] Revert "net/mlx4_core: Set UAR page size to 4KB regardless of system page size"

2016-03-19 Thread Alexey Kardashevskiy
On 03/16/2016 08:45 PM, Or Gerlitz wrote: On Wed, Mar 16, 2016 at 10:34 AM, Alexey Kardashevskiy wrote: Oh. ok. It also looks like even with the reverted patch, mlx4 VF does not work in a guest: So where is the breakage point for you? Does 4.4 work? If not, what? Ah, my bad

Re: [RFC PATCH kernel] Revert "net/mlx4_core: Set UAR page size to 4KB regardless of system page size"

2016-03-16 Thread Alexey Kardashevskiy
On 03/16/2016 05:09 PM, Eli Cohen wrote: On Wed, Mar 16, 2016 at 04:49:00PM +1100, Alexey Kardashevskiy wrote: On 03/16/2016 04:10 PM, Eli Cohen wrote: On Wed, Mar 16, 2016 at 01:07:58PM +1100, Alexey Kardashevskiy wrote: So with v4.5 as a host, there is no actual distro available today

Re: [RFC PATCH kernel] Revert "net/mlx4_core: Set UAR page size to 4KB regardless of system page size"

2016-03-15 Thread Alexey Kardashevskiy
On 03/16/2016 04:10 PM, Eli Cohen wrote: On Wed, Mar 16, 2016 at 01:07:58PM +1100, Alexey Kardashevskiy wrote: So with v4.5 as a host, there is no actual distro available today to use as a guest in the next 6 months (or whatever it takes to backport this particular patch back there). You

Re: [RFC PATCH kernel] Revert "net/mlx4_core: Set UAR page size to 4KB regardless of system page size"

2016-03-15 Thread Alexey Kardashevskiy
On 03/15/2016 09:40 PM, Or Gerlitz wrote: On Tue, Mar 15, 2016 at 12:19 PM, Alexey Kardashevskiy <a...@ozlabs.ru> wrote: This reverts commit 85743f1eb34548ba4b056d2f184a3d107a3b8917. Without this revert, POWER "pseries" KVM guests with a VF passed to a guest using VFIO fail to

Re: [RFC PATCH kernel] Revert "net/mlx4_core: Set UAR page size to 4KB regardless of system page size"

2016-03-15 Thread Alexey Kardashevskiy
On 03/15/2016 09:40 PM, Or Gerlitz wrote: On Tue, Mar 15, 2016 at 12:19 PM, Alexey Kardashevskiy wrote: This reverts commit 85743f1eb34548ba4b056d2f184a3d107a3b8917. Without this revert, POWER "pseries" KVM guests with a VF passed to a guest using VFIO fail to bring the driver up:

Re: [RFC PATCH kernel] Revert "net/mlx4_core: Set UAR page size to 4KB regardless of system page size"

2016-03-15 Thread Alexey Kardashevskiy
On 03/16/2016 02:29 AM, Christoph Hellwig wrote: On Tue, Mar 15, 2016 at 04:23:33PM +0200, Or Gerlitz wrote: Let us check. I was under (the maybe wrong) impression, that before this patch both PF/VF drivers were not operative on some systems, so on those systems it's fair to require the VF

[RFC PATCH kernel] Revert "net/mlx4_core: Set UAR page size to 4KB regardless of system page size"

2016-03-15 Thread Alexey Kardashevskiy
This reverts commit 85743f1eb34548ba4b056d2f184a3d107a3b8917. Without this revert, POWER "pseries" KVM guests with a VF passed to a guest using VFIO fail to bring the driver up: mlx4_core: Mellanox ConnectX core driver v2.2-1 (Feb, 2014) mlx4_core: Initializing :00:00.0 mlx4_core

Re: [RFC PATCH v4 1/7] PCI: Add a new option for resource_alignment to reassign alignment

2016-03-09 Thread Alexey Kardashevskiy
On 03/07/2016 06:48 PM, Yongji Xie wrote: When using resource_alignment kernel parameter, the current implementation reassigns the alignment by changing resources' size which can potentially break some drivers. How can this possibly break any driver?... It rounds up, not down, what do I miss here?
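
The reply above turns on the observation that rounding a resource size up to an alignment can only grow it. A minimal sketch of that arithmetic in C, assuming power-of-two alignments; align_up() is a hypothetical helper written for illustration, not a function taken from the patch or from the PCI core:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only, not a kernel API):
 * round size up to the next multiple of align.
 * align is assumed to be a non-zero power of two.
 */
static uint64_t align_up(uint64_t size, uint64_t align)
{
    /* A power of two has exactly one bit set. */
    assert(align != 0 && (align & (align - 1)) == 0);

    /* Add align - 1, then clear the low bits. */
    return (size + align - 1) & ~(align - 1);
}
```

For instance, align_up(0x100, 0x1000) gives 0x1000, so a 256-byte resource would be padded to a full 4K page, and the result is never smaller than the original size. Whether a driver can still be affected by the larger reported size is the open question in the thread; this sketch does not settle it.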

Re: [kernel] powerpc: Make vmalloc_to_phys() public

2016-01-26 Thread Alexey Kardashevskiy
On 01/25/2016 09:06 PM, Paul Mackerras wrote: On Mon, Jan 25, 2016 at 04:46:03PM +1100, Michael Ellerman wrote: On Thu, 2016-21-01 at 07:35:08 UTC, Alexey Kardashevskiy wrote: This makes vmalloc_to_phys() public as there will be another user (in-kernel VFIO acceleration) for it soon

Re: [PATCH] vfio/noiommu: Don't use iommu_present() to track fake groups

2016-01-24 Thread Alexey Kardashevskiy
, so we use the address of the noiommu switch itself. Reported-by: Alexey Kardashevskiy Fixes: 03a76b60f8ba ("vfio: Include No-IOMMU mode") Signed-off-by: Alex Williamson Reviewed-by: Alexey Kardashevskiy Tested-by: Alexey Kardashevskiy Thanks! -- Alexey

Re: [PATCH] vfio/noiommu: Don't use iommu_present() to track fake groups

2016-01-24 Thread Alexey Kardashevskiy
, so we use the address of the noiommu switch itself. Reported-by: Alexey Kardashevskiy <a...@ozlabs.ru> Fixes: 03a76b60f8ba ("vfio: Include No-IOMMU mode") Signed-off-by: Alex Williamson <alex.william...@redhat.com> Reviewed-by: Alexey Kardashevskiy <a...@ozl

Re: [PATCH kernel] vfio: Only check for bus IOMMU when NOIOMMU is selected

2016-01-21 Thread Alexey Kardashevskiy
On 01/22/2016 05:34 PM, Alexey Kardashevskiy wrote: Recent change 03a76b60 "vfio: Include No-IOMMU mode" disabled VFIO on systems which do not implement iommu_ops for the PCI bus even though there is a VFIO IOMMU driver for it such as SPAPR TCE driver for PPC64/powernv platform.

[PATCH kernel] vfio: Only check for bus IOMMU when NOIOMMU is selected

2016-01-21 Thread Alexey Kardashevskiy
IO_NOIOMMU as it is done in the rest of the file to re-enable VFIO on powernv. Signed-off-by: Alexey Kardashevskiy --- drivers/vfio/vfio.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index 82f25cc..3f8060e 100644 --- a/drivers/v

[PATCH kernel] vfio: Only check for bus IOMMU when NOIOMMU is selected

2016-01-21 Thread Alexey Kardashevskiy
IO_NOIOMMU as it is done in the rest of the file to re-enable VFIO on powernv. Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> --- drivers/vfio/vfio.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index 82f25cc

[PATCH kernel] powerpc: Make vmalloc_to_phys() public

2016-01-20 Thread Alexey Kardashevskiy
will make gcc use multiply instructions instead of shifts. Signed-off-by: Alexey Kardashevskiy --- A couple of notes: 1. real_vmalloc_addr() will be reworked later by Paul separately; 2. the optimization note is not valid at the moment as vmalloc_to_pfn() calls vmalloc_to_page() which does

[PATCH kernel] powerpc: Make vmalloc_to_phys() public

2016-01-20 Thread Alexey Kardashevskiy
will make gcc use multiply instructions instead of shifts. Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> --- A couple of notes: 1. real_vmalloc_addr() will be reworked later by Paul separately; 2. the optimization note is not valid at the moment as vmalloc_to_pfn() calls vmalloc_t

Re: [PATCH kernel] rcu: Define lockless version of list_for_each_entry_rcu

2015-12-21 Thread Alexey Kardashevskiy
On 12/08/2015 04:46 PM, Paul E. McKenney wrote: On Tue, Dec 08, 2015 at 04:20:03PM +1100, Paul Mackerras wrote: On Sat, Dec 05, 2015 at 06:19:46PM -0800, Paul E. McKenney wrote: As in the following? (And yes, I was confused by the fact that the docbook header was in front of a

Re: [RFC PATCH 0/3] VFIO: capability chains

2015-12-17 Thread Alexey Kardashevskiy
On 12/18/2015 01:38 PM, Alex Williamson wrote: On Fri, 2015-12-18 at 13:05 +1100, Alexey Kardashevskiy wrote: On 11/24/2015 07:43 AM, Alex Williamson wrote: Please see the commit log and comments in patch 1 for a general explanation of the problems that this series tries to address

Re: [RFC PATCH 0/3] VFIO: capability chains

2015-12-17 Thread Alexey Kardashevskiy
On 11/24/2015 07:43 AM, Alex Williamson wrote: Please see the commit log and comments in patch 1 for a general explanation of the problems that this series tries to address. The general problem is that we have several cases where we want to expose variable sized information to the user, whether

[PATCH kernel] vfio: Add explicit alignments in vfio_iommu_spapr_tce_create

2015-12-17 Thread Alexey Kardashevskiy
not cause any change in behavior. Signed-off-by: Alexey Kardashevskiy --- include/uapi/linux/vfio.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index 9fd7b5d..d117233 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux

[PATCH kernel] vfio: Add explicit alignments in vfio_iommu_spapr_tce_create

2015-12-17 Thread Alexey Kardashevskiy
not cause any change in behavior. Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> --- include/uapi/linux/vfio.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index 9fd7b5d..d117233 100644 --- a/include/uapi/linux/vfio.h +++ b/i

Re: [PATCH kernel] rcu: Define lockless version of list_for_each_entry_rcu

2015-11-05 Thread Alexey Kardashevskiy
On 11/04/2015 01:39 AM, Steven Rostedt wrote: On Tue, 3 Nov 2015 17:57:05 +1100 Alexey Kardashevskiy wrote: This defines list_for_each_entry_lockless. This allows safe list traversing in cases when lockdep() invocation is unwanted like real mode (MMU is off). Signed-off-by: Alexey

Re: [PATCH kernel] rcu: Define lockless version of list_for_each_entry_rcu

2015-11-05 Thread Alexey Kardashevskiy
On 11/04/2015 01:39 AM, Steven Rostedt wrote: On Tue, 3 Nov 2015 17:57:05 +1100 Alexey Kardashevskiy <a...@ozlabs.ru> wrote: This defines list_for_each_entry_lockless. This allows safe list traversing in cases when lockdep() invocation is unwanted like real mode (MMU is off). Sign

[PATCH kernel] rcu: Define lockless version of list_for_each_entry_rcu

2015-11-02 Thread Alexey Kardashevskiy
This defines list_for_each_entry_lockless. This allows safe list traversing in cases when lockdep() invocation is unwanted like real mode (MMU is off). Signed-off-by: Alexey Kardashevskiy --- This is for VFIO acceleration in POWERKVM for pSeries guests. There is a KVM instance. There also can

[PATCH kernel] rcu: Define lockless version of list_for_each_entry_rcu

2015-11-02 Thread Alexey Kardashevskiy
This defines list_for_each_entry_lockless. This allows safe list traversing in cases when lockdep() invocation is unwanted like real mode (MMU is off). Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> --- This is for VFIO acceleration in POWERKVM for pSeries guests. There is a KVM in

[PATCH kernel] rcu: Fix comment for rcu_dereference_raw_notrace

2015-11-01 Thread Alexey Kardashevskiy
rcu_dereference_raw() calls indirectly rcu_read_lock_held() while rcu_dereference_raw_notrace() does not so fix the comment about the latter. Signed-off-by: Alexey Kardashevskiy --- include/linux/rcupdate.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux

[PATCH kernel] rcu: Fix comment for rcu_dereference_raw_notrace

2015-11-01 Thread Alexey Kardashevskiy
rcu_dereference_raw() calls indirectly rcu_read_lock_held() while rcu_dereference_raw_notrace() does not so fix the comment about the latter. Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> --- include/linux/rcupdate.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff

Re: [PATCH 2/7 v2] powerpc/dma-mapping: override dma_get_page_shift

2015-10-27 Thread Alexey Kardashevskiy
On 10/28/2015 09:27 AM, Nishanth Aravamudan wrote: On 27.10.2015 [17:02:16 +1100], Alexey Kardashevskiy wrote: On 10/24/2015 07:57 AM, Nishanth Aravamudan wrote: On Power, the kernel's page size can differ from the IOMMU's page size, so we need to override the generic implementation, which

Re: [PATCH 2/7 v2] powerpc/dma-mapping: override dma_get_page_shift

2015-10-27 Thread Alexey Kardashevskiy
On 10/24/2015 07:57 AM, Nishanth Aravamudan wrote: On Power, the kernel's page size can differ from the IOMMU's page size, so we need to override the generic implementation, which always returns the kernel's page size. Lookup the IOMMU's page size from struct iommu_table, if available. Fallback

Re: [PATCH 4/7 v2] pseries/iommu: implement DDW-aware dma_get_page_shift

2015-10-26 Thread Alexey Kardashevskiy
On 10/24/2015 07:59 AM, Nishanth Aravamudan wrote: When DDW (Dynamic DMA Windows) are present for a device, we have stored the TCE (Translation Control Entry) size in a special device tree property. Check if we have enabled DDW for the device and return the TCE size from that property if

Re: [RFC PATCH kernel vfio] mm: vfio: Move pages out of CMA before pinning

2015-08-17 Thread Alexey Kardashevskiy
On 08/17/2015 05:45 PM, Vlastimil Babka wrote: On 08/05/2015 10:08 AM, Alexey Kardashevskiy wrote: This is about VFIO aka PCI passthrough used from QEMU. KVM is irrelevant here. QEMU is a machine emulator. It allocates guest RAM from anonymous memory and these pages are movable which is ok

Re: [RFC PATCH kernel vfio] mm: vfio: Move pages out of CMA before pinning

2015-08-15 Thread Alexey Kardashevskiy
On 08/05/2015 06:08 PM, Alexey Kardashevskiy wrote: This is about VFIO aka PCI passthrough used from QEMU. KVM is irrelevant here. Anyone, any idea? Or the question is way too stupid? :) QEMU is a machine emulator. It allocates guest RAM from anonymous memory and these pages are movable

[RFC PATCH kernel vfio] mm: vfio: Move pages out of CMA before pinning

2015-08-05 Thread Alexey Kardashevskiy
in madvise() to address this (could not locate any relevant)? - what else is missing? disabled interrupts? locks? Thanks! Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/mm/mmu_context_iommu.c | 40 +++-- mm/page_alloc.c | 36

[RFC PATCH kernel vfio] mm: vfio: Move pages out of CMA before pinning

2015-08-05 Thread Alexey Kardashevskiy
in madvise() to address this (could not locate any relevant)? - what else is missing? disabled interrupts? locks? Thanks! Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> --- arch/powerpc/mm/mmu_context_iommu.c | 40 +++-- mm/page_alloc.c

Re: [PATCH kernel] powerpc/powernv/ioda2: Fix calculation for memory allocated for TCE table

2015-07-23 Thread Alexey Kardashevskiy
On 07/21/2015 04:24 PM, Michael Ellerman wrote: On Mon, 2015-07-20 at 20:45 +1000, Alexey Kardashevskiy wrote: The existing code stores the amount of memory allocated for a TCE table. At the moment it uses @offset which is a virtual offset in the TCE table which is only correct for a one level
