[PATCH kernel v3] KVM: PPC: Allocate guest TCEs on demand too

2019-02-28 Thread Alexey Kardashevskiy
) and kvm_spapr_tce_fault(). This adds kvmppc_rm_ioba_validate() to do an additional test if the consequent kvmppc_tce_put() needs a page which has not been allocated; if this is the case, we bail out to virtual mode handlers. Signed-off-by: Alexey Kardashevskiy --- Changes: v3: * fixed alignmen
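
The bail-out described in this snippet can be pictured with a small user-space sketch. This is not the kernel code: the names (`lazy_table`, `table_put_fast`, `table_put_slow`), the sizes and the return codes are hypothetical, and reading an unallocated page back as zero is an assumption made for the example. The fast path never allocates and reports "too hard" when the backing page is missing; the slow path allocates the page on demand.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sizes, names and return codes; not the kernel's. */
#define ENTRIES_PER_PAGE 512u
#define MAX_PAGES        64u

#define RET_OK        0
#define RET_TOO_HARD  1   /* fast path gives up; caller retries on the slow path */
#define RET_NO_MEM    2
#define RET_BAD_INDEX 3

struct lazy_table {
    uint64_t *pages[MAX_PAGES];   /* second level; NULL until first written */
};

/* Fast path: must not allocate, so a missing page means "too hard". */
static int table_put_fast(struct lazy_table *t, unsigned long idx, uint64_t val)
{
    unsigned long pg = idx / ENTRIES_PER_PAGE;

    if (pg >= MAX_PAGES)
        return RET_BAD_INDEX;
    if (!t->pages[pg])
        return RET_TOO_HARD;
    t->pages[pg][idx % ENTRIES_PER_PAGE] = val;
    return RET_OK;
}

/* Slow path: may allocate the backing page on demand. */
static int table_put_slow(struct lazy_table *t, unsigned long idx, uint64_t val)
{
    unsigned long pg = idx / ENTRIES_PER_PAGE;

    if (pg >= MAX_PAGES)
        return RET_BAD_INDEX;
    if (!t->pages[pg]) {
        t->pages[pg] = calloc(ENTRIES_PER_PAGE, sizeof(uint64_t));
        if (!t->pages[pg])
            return RET_NO_MEM;
    }
    t->pages[pg][idx % ENTRIES_PER_PAGE] = val;
    return RET_OK;
}

/* Assumption for this sketch: unallocated pages read back as zero. */
static uint64_t table_get(const struct lazy_table *t, unsigned long idx)
{
    unsigned long pg = idx / ENTRIES_PER_PAGE;

    if (pg >= MAX_PAGES || !t->pages[pg])
        return 0;
    return t->pages[pg][idx % ENTRIES_PER_PAGE];
}
```

A caller would try `table_put_fast()` first and fall back to `table_put_slow()` only on `RET_TOO_HARD`, so the common case stays allocation-free.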

[PATCH kernel v2] KVM: PPC: Allocate guest TCEs on demand too

2019-02-28 Thread Alexey Kardashevskiy
) and kvm_spapr_tce_fault(). This adds kvmppc_rm_ioba_validate() to do an additional test if the consequent kvmppc_tce_put() needs a page which has not been allocated; if this is the case, we bail out to virtual mode handlers. Signed-off-by: Alexey Kardashevskiy --- Changes: v3: * fixed alignmen

Re: [PATCH kernel v2] KVM: PPC: Allocate guest TCEs on demand too

2019-02-28 Thread Alexey Kardashevskiy
On 01/03/2019 12:38, Alexey Kardashevskiy wrote: > We already allocate hardware TCE tables in multiple levels and skip > intermediate levels when we can; now it is the turn of the KVM TCE tables. > Thankfully these are allocated already in 2 levels. > > This moves the tab

[PATCH kernel v2] KVM: PPC: Allocate guest TCEs on demand too

2019-02-28 Thread Alexey Kardashevskiy
) and kvm_spapr_tce_fault(). This adds kvmppc_rm_ioba_validate() to do an additional test if the consequent kvmppc_tce_put() needs a page which has not been allocated; if this is the case, we bail out to virtual mode handlers. Signed-off-by: Alexey Kardashevskiy --- Changes: v2: * added kvm mutex a

[PATCH kernel] KVM: PPC: Allocate guest TCEs on demand too

2019-02-27 Thread Alexey Kardashevskiy
) and kvm_spapr_tce_fault(). This adds kvmppc_rm_ioba_validate() to do an additional test if the consequent kvmppc_tce_put() needs a page which has not been allocated; if this is the case, we bail out to virtual mode handlers. Signed-off-by: Alexey Kardashevskiy --- For NVLink2 passthrough guests with 1

Re: [PATCH kernel] KVM: PPC: Improve KVM reference counting

2019-02-21 Thread Alexey Kardashevskiy
On 21/02/2019 17:26, Michael Ellerman wrote: > Alexey Kardashevskiy writes: > >> The anon fd's ops releases the KVM reference in the release hook. >> However we reference the KVM object after we create the fd so there is >> small window when the relea

[PATCH kernel] KVM: PPC: Improve KVM reference counting

2019-02-20 Thread Alexey Kardashevskiy
d-off-by: Alexey Kardashevskiy --- The original bug is described here: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cfa393811 But in this case kvm_put_kvm() is called straight away with no locks before/after/around. --- arch/powerpc/kvm/book3s_64_vio.c | 7 -

Re: [PATCH kernel] powerpc/powernv/sriov: Register IOMMU groups for VFs

2019-02-17 Thread Alexey Kardashevskiy
On 18/02/2019 16:58, Alexey Kardashevskiy wrote: > The compound IOMMU group rework moved iommu_register_group() together in > pnv_pci_ioda_setup_iommu_api() (which is a part of ppc_md.pcibios_fixup). > As a result, pnv_ioda_setup_bus_iommu_group() does not create groups > any m

[PATCH kernel] powerpc/powernv/sriov: Register IOMMU groups for VFs

2019-02-17 Thread Alexey Kardashevskiy
't stop in xmon if there is no group, although it is not expected to happen now. Fixes: 0bd971676e68 "powerpc/powernv/npu: Add compound IOMMU groups" Signed-off-by: Alexey Kardashevskiy --- Fixes https://bugzilla.linux.ibm.com/show_bug.cgi?id=175550 "Kernel Oops while creating

[PATCH kernel v2] powerpc/powernv/ioda: Fix locked_vm counting for memory used by IOMMU tables

2019-02-12 Thread Alexey Kardashevskiy
te indirect TCE levels on demand" Signed-off-by: Alexey Kardashevskiy --- Changes: v2: * this is reworked "[PATCH kernel] powerpc/powernv/ioda: Store correct amount of memory used for table" --- arch/powerpc/platforms/powernv/pci-ioda-tce.c | 1 - arch/powerpc/platforms/powernv/pci-i

Re: [PATCH 2/5] vfio/spapr_tce: use pinned_vm instead of locked_vm to account pinned pages

2019-02-12 Thread Alexey Kardashevskiy
On 13/02/2019 04:18, Daniel Jordan wrote: > On Tue, Feb 12, 2019 at 04:50:11PM +, Christopher Lameter wrote: >> On Tue, 12 Feb 2019, Alexey Kardashevskiy wrote: >> >>> Now it is 3 independent accesses (actually 4 but the last one is >>> diagnostic) with no l

Re: [PATCH 2/5] vfio/spapr_tce: use pinned_vm instead of locked_vm to account pinned pages

2019-02-12 Thread Alexey Kardashevskiy
On 13/02/2019 05:56, Alex Williamson wrote: > On Tue, 12 Feb 2019 17:56:18 +1100 > Alexey Kardashevskiy wrote: > >> On 12/02/2019 09:44, Daniel Jordan wrote: >>> Beginning with bc3e53f682d9 ("mm: distinguish between mlocked and pinned >>> pages"

Re: [PATCH kernel] vfio/spapr_tce: Skip unsetting already unset table

2019-02-12 Thread Alexey Kardashevskiy
On 13/02/2019 07:52, Alex Williamson wrote: > On Mon, 11 Feb 2019 18:49:17 +1100 > Alexey Kardashevskiy wrote: > >> VFIO TCE IOMMU v2 owns IOMMU tables so when detaching an IOMMU group from >> a container, we need to unset those from a group so we call unset_window() >>

Re: [PATCH kernel] powerpc/powernv/ioda: Store correct amount of memory used for table

2019-02-12 Thread Alexey Kardashevskiy
On 12/02/2019 18:33, Alexey Kardashevskiy wrote: > > > On 12/02/2019 11:20, David Gibson wrote: >> On Mon, Feb 11, 2019 at 06:48:01PM +1100, Alexey Kardashevskiy wrote: >>> We store 2 multilevel tables in iommu_table - one for the hardware and >>> one

Re: [PATCH kernel] powerpc/powernv/ioda: Store correct amount of memory used for table

2019-02-11 Thread Alexey Kardashevskiy
On 12/02/2019 11:20, David Gibson wrote: > On Mon, Feb 11, 2019 at 06:48:01PM +1100, Alexey Kardashevskiy wrote: >> We store 2 multilevel tables in iommu_table - one for the hardware and >> one with the corresponding userspace addresses. Before allocating >

Re: [PATCH 2/5] vfio/spapr_tce: use pinned_vm instead of locked_vm to account pinned pages

2019-02-11 Thread Alexey Kardashevskiy
On 12/02/2019 09:44, Daniel Jordan wrote: > Beginning with bc3e53f682d9 ("mm: distinguish between mlocked and pinned > pages"), locked and pinned pages are accounted separately. The SPAPR > TCE VFIO IOMMU driver accounts pinned pages to locked_vm; use pinned_vm > instead. > > pinned_vm recentl

[PATCH kernel] KVM: PPC: Release all hardware TCE tables attached to a group

2019-02-11 Thread Alexey Kardashevskiy
"KVM: PPC: VFIO: Add in-kernel acceleration for VFIO" Signed-off-by: Alexey Kardashevskiy --- I kinda hoped to blame RCU for misbehaviour but it was me all over again :) --- arch/powerpc/kvm/book3s_64_vio.c | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/powerpc/kvm/book3s_64_vi

[PATCH kernel] vfio/spapr_tce: Skip unsetting already unset table

2019-02-10 Thread Alexey Kardashevskiy
: [PE# fd] Removing DMA window #1 At the moment this is not a problem as the second invocation of unset_window() writes zeroes to the HW registers again and exits early as there is no table. Signed-off-by: Alexey Kardashevskiy --- When doing VFIO PCI hot unplug, first we remove the DMA window

[PATCH kernel] powerpc/powernv/ioda: Store correct amount of memory used for table

2019-02-10 Thread Alexey Kardashevskiy
irect levels to it_userspace" Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/platforms/powernv/pci-ioda-tce.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/platforms/powernv/pci-ioda-tce.c b/arch/powerpc/platforms/powernv/pci-ioda-tce.c index 697449

Re: [PATCH] powerpc/powernv: Escalate reset when IODA reset fails

2019-02-03 Thread Alexey Kardashevskiy
he device in some cases, so this patch just adds a test to force > a full reset if firmware reports an error when performing the IODA reset. > > Signed-off-by: Oliver O'Halloran I am pretty sure I already saw this :-/ ah, anyway Reviewed-by: Alexey Kardashevskiy > --- >

Re: [PATCH] powerpc/mm: Add _PAGE_SAO to _PAGE_CACHE_CTL mask

2019-01-31 Thread Alexey Kardashevskiy
On 31/01/2019 00:35, Michael Ellerman wrote: > Reza Arbab writes: > >> On Tue, Jan 29, 2019 at 08:37:28PM +0530, Aneesh Kumar K.V wrote: >>> Not sure what the fix is about. We set the related hash pte flags via >>> >>> if ((pteflags & _PAGE_CACHE_CTL) == _PAGE_TOLERANT) >>> rfl

Re: [PATCH] powerpc/mm: Add _PAGE_SAO to _PAGE_CACHE_CTL mask

2019-01-28 Thread Alexey Kardashevskiy
On 29/01/2019 04:31, Reza Arbab wrote: > In htab_convert_pte_flags(), _PAGE_CACHE_CTL is used to check for the > _PAGE_SAO flag: > > else if ((pteflags & _PAGE_CACHE_CTL) == _PAGE_SAO) > rflags |= (HPTE_R_W | HPTE_R_I | HPTE_R_M); > > But, it isn't defined to include that flag: >

[PATCH kernel] vfio-pci/nvlink2: Fix ancient gcc warnings

2019-01-22 Thread Alexey Kardashevskiy
we still want to compile the modern kernel with such an old gcc without warnings, this changes the capabilities initialization. The gcc bugzilla: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53119 Signed-off-by: Alexey Kardashevskiy --- drivers/vfio/pci/vfio_pci_nvlink2.c | 30 ++-

Re: [PATCH 2/3] kbuild: add real-prereqs shorthand for $(filter-out FORCE, $^)

2019-01-16 Thread Alexey Kardashevskiy
On 17/01/2019 15:10, Masahiro Yamada wrote: > In Kbuild, if_changed and friends must have FORCE as a prerequisite. > > Hence, $(filter-out FORCE,$^) or $(filter-out $(PHONY),$^) is a common > pattern to get the names of all the prerequisites except phony targets. > > Add real-prereqs as a shor

Re: [PATCH] kbuild: mark prepare0 as PHONY to fix external module build

2019-01-15 Thread Alexey Kardashevskiy
x single target build for external module") > Fixes: c3ff2a5193fa ("powerpc/32: add stack protector support") > Fixes: 189af4657186 ("ARM: smp: add support for per-task stack canaries") > Fixes: 0a1213fa7432 ("arm64: enable per-task stack canaries") >

Re: [PATCH] powerpc: PCI does not require PowerNV

2019-01-14 Thread Alexey Kardashevskiy
ut PowerNV. > > Signed-off-by: Jason A. Donenfeld > Fixes: 0e759bd75285 ("powerpc/powernv/npu: Move OPAL calls away from context > manipulation") > Cc: Alexey Kardashevskiy > Cc: Michael Ellerman Reviewed-by: Alexey Kardashevskiy > --- > arch/powerpc/pla

[PATCH kernel 2/2] powerpc/powernv/npu: Move platform shared code to sysdev

2019-01-13 Thread Alexey Kardashevskiy
'vmlinux' failed make[1]: *** [vmlinux] Error 1 Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/sysdev/Makefile | 1 + arch/powerpc/include/asm/npu.h | 88 +++ arch/powerpc/include/asm/pci.h | 8 +- arch/powerpc/include/asm/powernv.h | 24 -

[PATCH kernel 0/2] powerpc/powernv/npu: Move platform shared code to sysdev

2019-01-13 Thread Alexey Kardashevskiy
, pnv_pci_get_npu_dev, pnv_npu2_handle_fault). This is based on sha1 574823bf Linus Torvalds "Change mincore() to count "mapped" pages rather than "cached" pages". Please comment. Thanks. Alexey Kardashevskiy (2): powerpc/powernv/npu: Move compound PEs to power

[PATCH kernel 1/2] powerpc/powernv/npu: Move compound PEs to powernv

2019-01-13 Thread Alexey Kardashevskiy
o the pnv_ohb struct so we can still have the definition of it in npu-dma.c (the only place which uses it) and save some bytes when there is no NPU. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/platforms/powernv/pci.h | 4 arch/powerpc/platforms/powernv/npu-dma.c | 10 +++--- 2

[PATCH kernel] powerpc/powernv/npu: Remove obsolete comment about TCE_KILL_INVAL_ALL

2019-01-13 Thread Alexey Kardashevskiy
TCE_KILL_INVAL_ALL was moved long ago but the comment was forgotten, so finish the move and remove the comment. Fixes: 0bbcdb437da0c4a "powerpc/powernv/npu: TCE Kill helpers cleanup" Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/platforms/powernv/npu-dma.c | 4 1 file

[PATCH kernel] powerpc/powernv: Remove never used pnv_power9_force_smt4

2019-01-13 Thread Alexey Kardashevskiy
" Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/include/asm/powernv.h | 1 - 1 file changed, 1 deletion(-) diff --git a/arch/powerpc/include/asm/powernv.h b/arch/powerpc/include/asm/powernv.h index 2f3ff7a..362ea12 100644 --- a/arch/powerpc/include/asm/powernv.h +++ b/arch/powerpc/i

[PATCH kernel] KVM: PPC: Fix compile when CONFIG_PPC_RADIX_MMU is not defined

2019-01-13 Thread Alexey Kardashevskiy
This adds some stubs for hash only configs. Signed-off-by: Alexey Kardashevskiy --- .../include/asm/book3s/64/tlbflush-radix.h| 30 +++ 1 file changed, 24 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h b/arch/powerpc/include

Re: [RFC PATCH kernel] powerpc/stack_protector: Fix external modules building

2019-01-10 Thread Alexey Kardashevskiy
On 11/01/2019 14:08, Masahiro Yamada wrote: > On Thu, Jan 10, 2019 at 2:44 PM Alexey Kardashevskiy wrote: >> >> c3ff2a519 "powerpc/32: add stack protector support" adds stack protector >> support so now powerpc's "prepare" target depends on p

Re: [PATCH] powerpc/powernv/npu: Allocate enough memory in pnv_try_setup_npu_table_group()

2019-01-10 Thread Alexey Kardashevskiy
U groups") >> Signed-off-by: Dan Carpenter >> --- >> arch/powerpc/platforms/powernv/npu-dma.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) > > Thanks, I've applied this to my fixes-test tree. > > Alexey can you send me an ack? Ouch. Re

[RFC PATCH kernel] powerpc/stack_protector: Fix external modules building

2019-01-09 Thread Alexey Kardashevskiy
stack_protector_prepare'. Stop. The reason for that is that the main Linux Makefile defines "prepare0" only if KBUILD_EXTMOD=="". This hacks powerpc's Makefile to make external modules build again. Fixes: c3ff2a519 "powerpc/32: add stack protector support" Si

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-09 Thread Alexey Kardashevskiy
On 09/01/2019 18:24, Benjamin Herrenschmidt wrote: > On Wed, 2019-01-09 at 15:53 +1100, Alexey Kardashevskiy wrote: >> "A PCI completion timeout occurred for an outstanding PCI-E transaction" >> it is. >> >> This is how I bind the device to vfio: >> &g

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-09 Thread Alexey Kardashevskiy
On 09/01/2019 18:25, Benjamin Herrenschmidt wrote: > On Wed, 2019-01-09 at 17:32 +1100, Alexey Kardashevskiy wrote: >> I have just moved the "Mellanox Technologies MT27700 Family >> [ConnectX-4]" from garrison to firestone machine and there it does not >> produ

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-08 Thread Alexey Kardashevskiy
On 09/01/2019 16:30, David Gibson wrote: > On Wed, Jan 09, 2019 at 04:09:02PM +1100, Benjamin Herrenschmidt wrote: >> On Mon, 2019-01-07 at 21:01 -0700, Jason Gunthorpe wrote: >>> In a very cryptic way that requires manual parsing using non-public docs sadly but yes. From the look of i

Re: [PATCH] PCI: Add no-D3 quirk for Mellanox ConnectX-[45]

2019-01-08 Thread Alexey Kardashevskiy
On 06/01/2019 09:43, Benjamin Herrenschmidt wrote: > On Sat, 2019-01-05 at 10:51 -0700, Jason Gunthorpe wrote: >> >>> Interesting. I've investigated this further, though I don't have as >>> many new clues as I'd like. The problem occurs reliably, at least on >>> one particular type of machine

Re: [PATCH kernel v7 20/20] vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver

2018-12-20 Thread Alexey Kardashevskiy
On 21/12/2018 12:37, Alex Williamson wrote: > On Fri, 21 Dec 2018 12:23:16 +1100 > Alexey Kardashevskiy wrote: > >> On 21/12/2018 03:46, Alex Williamson wrote: >>> On Thu, 20 Dec 2018 19:23:50 +1100 >>> Alexey Kardashevskiy wrote: >>> >>>

Re: [PATCH kernel v7 20/20] vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver

2018-12-20 Thread Alexey Kardashevskiy
On 21/12/2018 03:46, Alex Williamson wrote: > On Thu, 20 Dec 2018 19:23:50 +1100 > Alexey Kardashevskiy wrote: > >> POWER9 Witherspoon machines come with 4 or 6 V100 GPUs which are not >> pluggable PCIe devices but still have PCIe links which are used >> for config

Re: [PATCH kernel v7 00/20] powerpc/powernv/npu, vfio: NVIDIA V100 + P9 passthrough

2018-12-20 Thread Alexey Kardashevskiy
On 20/12/2018 20:38, Michael Ellerman wrote: > Alexey Kardashevskiy writes: > >> My bad, I was not cc-ing everyone but now with v7 I am, sorry about that. > > I've already applied v6, I'll assume this is unchanged from that unless > you tell me otherwise.

[PATCH kernel v7 19/20] vfio_pci: Allow regions to add own capabilities

2018-12-20 Thread Alexey Kardashevskiy
VFIO regions already support region capabilities with a limited set of fields. However the subdriver might have to report to the userspace additional bits. This adds an add_capability() hook to vfio_pci_regops. Signed-off-by: Alexey Kardashevskiy Acked-by: Alex Williamson --- Changes: v3

[PATCH kernel v7 20/20] vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver

2018-12-20 Thread Alexey Kardashevskiy
o this relies on the platform to tell whether these GPUs have special abilities such as NVLinks. Signed-off-by: Alexey Kardashevskiy --- Changes: v6.1: * fixed outdated comment about VFIO_REGION_INFO_CAP_NVLINK2_LNKSPD v6: * reworked capabilities - tgt for nvlink and gpu and link-speed for nvlin

[PATCH kernel v7 18/20] vfio_pci: Allow mapping extra regions

2018-12-20 Thread Alexey Kardashevskiy
have 16GB RAM which is coherently mapped to the system address space, we are going to export these as an extra PCI region. We already support extra PCI regions and this adds support for mapping them to the userspace. Signed-off-by: Alexey Kardashevskiy Reviewed-by: David Gibson Acked-by: Alex

[PATCH kernel v7 09/20] powerpc/powernv/pseries: Rework device adding to IOMMU groups

2018-12-20 Thread Alexey Kardashevskiy
table between partitionable endpoints), this removes iommu_table_group_link from pseries. Signed-off-by: Alexey Kardashevskiy Reviewed-by: David Gibson --- arch/powerpc/include/asm/iommu.h | 12 ++--- arch/powerpc/kernel/iommu.c | 58 ++- arch/powerpc

[PATCH kernel v7 17/20] powerpc/powernv/npu: Fault user page into the hypervisor's pagetable

2018-12-20 Thread Alexey Kardashevskiy
scope tree does not get updated so ATS will still fail. This reads a byte from an effective address which causes HV storage interrupt and KVM updates the partition scope tree. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/platforms/powernv/npu-dma.c | 13 +++-- 1 file changed, 7

[PATCH kernel v7 16/20] powerpc/powernv/npu: Check mmio_atsd array bounds when populating

2018-12-20 Thread Alexey Kardashevskiy
A broken device tree might contain more than 8 values and introduce a hard-to-debug memory corruption bug. This adds the boundary check. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/platforms/powernv/npu-dma.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch
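
As a rough illustration of the kind of boundary check this patch describes, here is a user-space sketch that copies values into a fixed array of 8 slots and ignores the excess instead of writing past the end. The limit of 8 is taken from the snippet; the struct and function names (`atsd_regs`, `atsd_populate`) are hypothetical and do not come from npu-dma.c.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ATSD_REGS 8   /* the limit of 8 comes from the patch description */

/* Hypothetical container for the values read from the device tree. */
struct atsd_regs {
    uint64_t reg[MAX_ATSD_REGS];
    size_t count;
};

/*
 * Copy at most MAX_ATSD_REGS values from src into dst.
 * Returns the number of source values that were ignored, so the caller
 * can warn about a broken device tree instead of corrupting memory.
 */
static size_t atsd_populate(struct atsd_regs *dst, const uint64_t *src, size_t n)
{
    size_t i;

    dst->count = 0;
    for (i = 0; i < n && i < MAX_ATSD_REGS; i++)
        dst->reg[dst->count++] = src[i];
    return n - i;
}
```

A non-zero return value is the cue to log a warning; the array itself is never overrun regardless of how many values the input claims to have.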

[PATCH kernel v7 15/20] powerpc/powernv/npu: Add release_ownership hook

2018-12-20 Thread Alexey Kardashevskiy
helper takes MSR (only DR/HV/PR/SF bits are allowed) to program them into NPU2 for ATS checkout requests support. This exports pnv_npu2_unmap_lpar_dev() as following patches will use it from the VFIO driver. Signed-off-by: Alexey Kardashevskiy --- Changes: v5: * removed opal_purge_cache as it is

[PATCH kernel v7 14/20] powerpc/powernv/npu: Add compound IOMMU groups

2018-12-20 Thread Alexey Kardashevskiy
ndlers. This moves IOMMU group registration for NVLink-connected GPUs to npu-dma.c. For POWER8, this stores a new compound group pointer in the PE (so a GPU is still a master); for POWER9 the new group pointer is stored in an NPU (which is allocated per a PCI host controller). Signed-o

[PATCH kernel v7 05/20] powerpc/powernv/npu: Move OPAL calls away from context manipulation

2018-12-20 Thread Alexey Kardashevskiy
keeps pnv_npu2_map_lpar() in powernv as pseries is not allowed to call that. This exports pnv_npu2_map_lpar_dev() as following patches will use it from the VFIO driver. While at it, replace redundant list_for_each_entry_safe() with a simpler list_for_each_entry(). Signed-off-by: Alexey Kardashe

[PATCH kernel v7 13/20] powerpc/powernv/npu: Convert NPU IOMMU helpers to iommu_table_group_ops

2018-12-20 Thread Alexey Kardashevskiy
. This makes a first step and converts an NPU PE with a set of extern functions to a table group. This should cause no behavioral change. Note that pnv_npu_release_ownership() has never been implemented. Signed-off-by: Alexey Kardashevskiy Reviewed-by: David Gibson --- arch/powerpc/platforms/powernv

[PATCH kernel v7 04/20] powerpc/powernv: Move npu struct from pnv_phb to pci_controller

2018-12-20 Thread Alexey Kardashevskiy
different data structure. This makes npu a pointer and stores it one level higher in the pci_controller struct. Signed-off-by: Alexey Kardashevskiy --- Changes: v5: * removed !npu checks as this is out of scope of this patch * added WARN_ON_ONCE in WARN_ON_ONCE(pnv_npu2_init(phb)) v4: * changed subj

[PATCH kernel v7 12/20] powerpc/powernv/npu: Move single TVE handling to NPU PE

2018-12-20 Thread Alexey Kardashevskiy
PCI PE and cannot to NPU PE and if that fails, we could only set 32bit table to NPU PE and this configuration is not really supported or wanted. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/platforms/powernv/npu-dma.c | 8 +++ arch/powerpc/platforms/powernv/pci-i

[PATCH kernel v7 02/20] powerpc/mm/iommu/vfio_spapr_tce: Change mm_iommu_get to reference a region

2018-12-20 Thread Alexey Kardashevskiy
. This removes the check for exact match from mm_iommu_new() as we want it to fail on existing regions; mm_iommu_get() should be used instead. Signed-off-by: Alexey Kardashevskiy Reviewed-by: David Gibson --- Changes: v5: * fixed a bug with uninitialized @found in tce_iommu_unregister_pages

[PATCH kernel v7 11/20] powerpc/powernv: Reference iommu_table while it is linked to a group

2018-12-20 Thread Alexey Kardashevskiy
The iommu_table pointer stored in iommu_table_group may get stale by accident; this adds referencing and removes a redundant comment about this. Signed-off-by: Alexey Kardashevskiy Reviewed-by: David Gibson --- arch/powerpc/platforms/powernv/pci-ioda-tce.c | 3 ++- arch/powerpc/platforms
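
The idea of holding a reference for as long as a pointer is stored can be sketched in user space as follows. All names here (`table`, `group`, `group_link`, `group_unlink`) are hypothetical and the counter is a plain int without any atomicity, so this only shows the ownership rule, not the kernel's implementation.

```c
#include <stdlib.h>

/* Hypothetical, non-atomic reference-counted object. */
struct table {
    int refs;
};

/* Hypothetical holder of table pointers. */
struct group {
    struct table *tables[2];
};

/* The creator starts with one reference. */
static struct table *table_alloc(void)
{
    struct table *t = calloc(1, sizeof(*t));

    if (t)
        t->refs = 1;
    return t;
}

static struct table *table_get_ref(struct table *t)
{
    t->refs++;
    return t;
}

/* Drops one reference; returns 1 if the table was freed. */
static int table_put_ref(struct table *t)
{
    if (--t->refs == 0) {
        free(t);
        return 1;
    }
    return 0;
}

/* Storing the pointer takes a reference, so it cannot go stale. */
static void group_link(struct group *g, int slot, struct table *t)
{
    g->tables[slot] = table_get_ref(t);
}

/* Clearing the pointer drops that reference; returns 1 if freed. */
static int group_unlink(struct group *g, int slot)
{
    struct table *t = g->tables[slot];

    g->tables[slot] = NULL;
    return t ? table_put_ref(t) : 0;
}
```

With this rule the creator can drop its own reference at any time, and the table stays valid until the last holder unlinks it.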

[PATCH kernel v7 10/20] powerpc/iommu_api: Move IOMMU groups setup to a single place

2018-12-20 Thread Alexey Kardashevskiy
pci_controller_ops::setup_bridge (the normal way of adding PEs). Signed-off-by: Alexey Kardashevskiy Reviewed-by: David Gibson --- Changes: v5: * fixed compile with defined but not used pnv_ioda_setup_bus_iommu_group(); unfortunately defining a dummy version looks uglier than #ifdef --- arch

[PATCH kernel v7 01/20] powerpc/ioda/npu: Call skiboot's hot reset hook when disabling NPU2

2018-12-20 Thread Alexey Kardashevskiy
pci_disable_device() which switches NPU2 to a safe mode and prevents HMIs. Signed-off-by: Alexey Kardashevskiy Acked-by: Alistair Popple Reviewed-by: David Gibson --- Changes: v2: * updated the commit log --- arch/powerpc/platforms/powernv/pci-ioda.c | 10 ++ 1 file changed, 10

[PATCH kernel v7 00/20] powerpc/powernv/npu, vfio: NVIDIA V100 + P9 passthrough

2018-12-20 Thread Alexey Kardashevskiy
patches have changelogs. v7 fixes compile warning and updates a VFIO capability comment in 20/20. Please comment. Thanks. Alexey Kardashevskiy (20): powerpc/ioda/npu: Call skiboot's hot reset hook when disabling NPU2 powerpc/mm/iommu/vfio_spapr_tce: Change mm_iommu_get to refere

[PATCH kernel v7 08/20] powerpc/pseries: Remove IOMMU API support for non-LPAR systems

2018-12-20 Thread Alexey Kardashevskiy
The pci_dma_bus_setup_pSeries and pci_dma_dev_setup_pSeries hooks are registered for the pseries platform which does not have FW_FEATURE_LPAR; these would be pre-powernv platforms which we never supported PCI pass through for anyway so remove it. Signed-off-by: Alexey Kardashevskiy Reviewed-by

[PATCH kernel v7 07/20] powerpc/pseries/npu: Enable platform support

2018-12-20 Thread Alexey Kardashevskiy
We already changed the NPU API for GPUs to not call OPAL and the remaining bit is initializing NPU structures. This searches for POWER9 NVLinks attached to any device on a PHB and initializes an NPU structure if any are found. Signed-off-by: Alexey Kardashevskiy --- Changes: v5: * added WARN_ON_ONCE

[PATCH kernel v7 06/20] powerpc/pseries/iommu: Use memory@ nodes in max RAM address calculation

2018-12-20 Thread Alexey Kardashevskiy
DMA so the DMA window needs to cover those memory regions too; if the window cannot cover new memory regions, the memory onlining fails. This walks through the memory nodes to find the highest RAM address to let a huge DMA window cover that too in case this memory gets onlined later. Signed
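
The walk over memory nodes mentioned in this snippet amounts to taking the maximum end address over a set of ranges. A minimal user-space sketch, assuming each node is already available as a base/size pair (the `mem_node` struct and `highest_ram_end` name are hypothetical; the real code reads device tree nodes):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flattened view of one memory@ node. */
struct mem_node {
    uint64_t base;
    uint64_t size;
};

/*
 * Return the highest end address (base + size) over all nodes,
 * or 0 when there are none. The nodes need not be sorted.
 */
static uint64_t highest_ram_end(const struct mem_node *nodes, size_t n)
{
    uint64_t max_end = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        uint64_t end = nodes[i].base + nodes[i].size;

        if (end > max_end)
            max_end = end;
    }
    return max_end;
}
```

A DMA window sized to cover this value would also cover memory that is present in the node list but not yet onlined, which is the case the patch is concerned with.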

[PATCH kernel v7 03/20] powerpc/vfio/iommu/kvm: Do not pin device memory

2018-12-20 Thread Alexey Kardashevskiy
e of these new regions which we must avoid unpinning of. This adds @mm to tce_page_is_contained() and iommu_tce_xchg() to test if the memory is device memory to avoid pfn_to_page(). This adds a check for device memory in mm_iommu_ua_mark_dirty_rm() which does delayed pages dirtying. Signed-off-

Re: [PATCH V5 1/3] mm: Add get_user_pages_cma_migrate

2018-12-19 Thread Alexey Kardashevskiy
On 20/12/2018 16:52, Aneesh Kumar K.V wrote: > On 12/20/18 11:18 AM, Alexey Kardashevskiy wrote: >> >> >> On 20/12/2018 16:22, Aneesh Kumar K.V wrote: >>> On 12/20/18 9:49 AM, Alexey Kardashevskiy wrote: >>>> >>>> >>>> On

Re: [PATCH V5 1/3] mm: Add get_user_pages_cma_migrate

2018-12-19 Thread Alexey Kardashevskiy
On 20/12/2018 16:22, Aneesh Kumar K.V wrote: > On 12/20/18 9:49 AM, Alexey Kardashevskiy wrote: >> >> >> On 19/12/2018 14:40, Aneesh Kumar K.V wrote: >>> This helper does a get_user_pages_fast and if it find pages in the >>> CMA area >>> it will

Re: [PATCH V5 2/3] powerpc/mm/iommu: Allow migration of cma allocated pages during mm_iommu_get

2018-12-19 Thread Alexey Kardashevskiy
On 19/12/2018 14:40, Aneesh Kumar K.V wrote: > Current code doesn't do page migration if the page allocated is a compound > page. > With HugeTLB migration support, we can end up allocating hugetlb pages from > CMA region. Also THP pages can be allocated from CMA region. This patch > updates >

Re: [PATCH V5 1/3] mm: Add get_user_pages_cma_migrate

2018-12-19 Thread Alexey Kardashevskiy
On 19/12/2018 14:40, Aneesh Kumar K.V wrote: > This helper does a get_user_pages_fast and if it find pages in the CMA area > it will try to migrate them before taking page reference. This makes sure that > we don't keep non-movable pages (due to page reference count) in the CMA area. > Not able

Re: [PATCH V5 1/3] mm: Add get_user_pages_cma_migrate

2018-12-19 Thread Alexey Kardashevskiy
On 19/12/2018 14:40, Aneesh Kumar K.V wrote: > This helper does a get_user_pages_fast and if it find pages in the CMA area > it will try to migrate them before taking page reference. This makes sure that > we don't keep non-movable pages (due to page reference count) in the CMA area. > Not able

Re: [PATCH kernel v6 14/20] powerpc/powernv/npu: Add compound IOMMU groups

2018-12-19 Thread Alexey Kardashevskiy
On 19/12/2018 19:52, Alexey Kardashevskiy wrote: > At the moment the powernv platform registers an IOMMU group for each PE. > There is an exception though: an NVLink bridge which is attached to > the corresponding GPU's IOMMU group making it a master. > > Now we have POWE

Re: [PATCH kernel v5 14/20] powerpc/powernv/npu: Add compound IOMMU groups

2018-12-19 Thread Alexey Kardashevskiy
On 19/12/2018 21:00, Michael Ellerman wrote: > Alexey Kardashevskiy writes: >> On 19/12/2018 11:17, Michael Ellerman wrote: >>> Alexey Kardashevskiy writes: >>>> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c >>>> b/arch/powerpc/platforms/pow

[PATCH kernel v6.1 20/20 REPOST] vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver

2018-12-19 Thread Alexey Kardashevskiy
o this relies on the platform to tell whether these GPUs have special abilities such as NVLinks. Signed-off-by: Alexey Kardashevskiy --- Changes: v6.1: * fixed outdated comment about VFIO_REGION_INFO_CAP_NVLINK2_LNKSPD v6: * reworked capabilities - tgt for nvlink and gpu and link-speed for nvlin

[PATCH kernel v6 18/20] vfio_pci: Allow mapping extra regions

2018-12-19 Thread Alexey Kardashevskiy
have 16GB RAM which is coherently mapped to the system address space, we are going to export these as an extra PCI region. We already support extra PCI regions and this adds support for mapping them to the userspace. Signed-off-by: Alexey Kardashevskiy Reviewed-by: David Gibson Acked-by: Alex

[PATCH kernel v6 16/20] powerpc/powernv/npu: Check mmio_atsd array bounds when populating

2018-12-19 Thread Alexey Kardashevskiy
A broken device tree might contain more than 8 values and introduce a hard-to-debug memory corruption bug. This adds the boundary check. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/platforms/powernv/npu-dma.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch

[PATCH kernel v6 12/20] powerpc/powernv/npu: Move single TVE handling to NPU PE

2018-12-19 Thread Alexey Kardashevskiy
PCI PE and cannot to NPU PE and if that fails, we could only set 32bit table to NPU PE and this configuration is not really supported or wanted. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/platforms/powernv/npu-dma.c | 8 +++ arch/powerpc/platforms/powernv/pci-i

[PATCH kernel v6 10/20] powerpc/iommu_api: Move IOMMU groups setup to a single place

2018-12-19 Thread Alexey Kardashevskiy
pci_controller_ops::setup_bridge (the normal way of adding PEs). Signed-off-by: Alexey Kardashevskiy Reviewed-by: David Gibson --- Changes: v5: * fixed compile with defined but not used pnv_ioda_setup_bus_iommu_group(); unfortunately defining a dummy version looks uglier than #ifdef --- arch

[PATCH kernel v6 20/20] vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver

2018-12-19 Thread Alexey Kardashevskiy
o this relies on the platform to tell whether these GPUs have special abilities such as NVLinks. Signed-off-by: Alexey Kardashevskiy --- Changes: v6: * reworked capabilities - tgt for nvlink and gpu and link-speed for nvlink only v5: * do not memremap GPU RAM for emulation, map it only when it is need

[PATCH kernel v6 09/20] powerpc/powernv/pseries: Rework device adding to IOMMU groups

2018-12-19 Thread Alexey Kardashevskiy
table between partitionable endpoints), this removes iommu_table_group_link from pseries. Signed-off-by: Alexey Kardashevskiy Reviewed-by: David Gibson --- arch/powerpc/include/asm/iommu.h | 12 ++--- arch/powerpc/kernel/iommu.c | 58 ++- arch/powerpc

[PATCH kernel v6 19/20] vfio_pci: Allow regions to add own capabilities

2018-12-19 Thread Alexey Kardashevskiy
VFIO regions already support region capabilities with a limited set of fields. However the subdriver might have to report to the userspace additional bits. This adds an add_capability() hook to vfio_pci_regops. Signed-off-by: Alexey Kardashevskiy Acked-by: Alex Williamson --- Changes: v3

[PATCH kernel v6 08/20] powerpc/pseries: Remove IOMMU API support for non-LPAR systems

2018-12-19 Thread Alexey Kardashevskiy
The pci_dma_bus_setup_pSeries and pci_dma_dev_setup_pSeries hooks are registered for the pseries platform which does not have FW_FEATURE_LPAR; these would be pre-powernv platforms which we never supported PCI pass through for anyway so remove it. Signed-off-by: Alexey Kardashevskiy Reviewed-by

[PATCH kernel v6 17/20] powerpc/powernv/npu: Fault user page into the hypervisor's pagetable

2018-12-19 Thread Alexey Kardashevskiy
scope tree does not get updated so ATS will still fail. This reads a byte from an effective address which causes HV storage interrupt and KVM updates the partition scope tree. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/platforms/powernv/npu-dma.c | 13 +++-- 1 file changed, 7

[PATCH kernel v6 15/20] powerpc/powernv/npu: Add release_ownership hook

2018-12-19 Thread Alexey Kardashevskiy
helper takes MSR (only DR/HV/PR/SF bits are allowed) to program them into NPU2 for ATS checkout requests support. This exports pnv_npu2_unmap_lpar_dev() as following patches will use it from the VFIO driver. Signed-off-by: Alexey Kardashevskiy --- Changes: v5: * removed opal_purge_cache as it is

[PATCH kernel v6 04/20] powerpc/powernv: Move npu struct from pnv_phb to pci_controller

2018-12-19 Thread Alexey Kardashevskiy
different data structure. This makes npu a pointer and stores it one level higher in the pci_controller struct. Signed-off-by: Alexey Kardashevskiy --- Changes: v5: * removed !npu checks as this is out of scope of this patch * added WARN_ON_ONCE in WARN_ON_ONCE(pnv_npu2_init(phb)) v4: * changed subj

[PATCH kernel v6 14/20] powerpc/powernv/npu: Add compound IOMMU groups

2018-12-19 Thread Alexey Kardashevskiy
ndlers. This moves IOMMU group registration for NVLink-connected GPUs to npu-dma.c. For POWER8, this stores a new compound group pointer in the PE (so a GPU is still a master); for POWER9 the new group pointer is stored in an NPU (which is allocated per a PCI host controller). Signed-o

[PATCH kernel v6 13/20] powerpc/powernv/npu: Convert NPU IOMMU helpers to iommu_table_group_ops

2018-12-19 Thread Alexey Kardashevskiy
. This makes a first step and converts an NPU PE with a set of extern functions to a table group. This should cause no behavioral change. Note that pnv_npu_release_ownership() has never been implemented. Signed-off-by: Alexey Kardashevskiy Reviewed-by: David Gibson --- arch/powerpc/platforms/powernv

[PATCH kernel v6 02/20] powerpc/mm/iommu/vfio_spapr_tce: Change mm_iommu_get to reference a region

2018-12-19 Thread Alexey Kardashevskiy
. This removes the check for exact match from mm_iommu_new() as we want it to fail on existing regions; mm_iommu_get() should be used instead. Signed-off-by: Alexey Kardashevskiy Reviewed-by: David Gibson --- Changes: v5: * fixed a bug with uninitialized @found in tce_iommu_unregister_pages

[PATCH kernel v6 11/20] powerpc/powernv: Reference iommu_table while it is linked to a group

2018-12-19 Thread Alexey Kardashevskiy
The iommu_table pointer stored in iommu_table_group may get stale by accident; this adds referencing and removes a redundant comment about this. Signed-off-by: Alexey Kardashevskiy Reviewed-by: David Gibson --- arch/powerpc/platforms/powernv/pci-ioda-tce.c | 3 ++- arch/powerpc/platforms

[PATCH kernel v6 07/20] powerpc/pseries/npu: Enable platform support

2018-12-19 Thread Alexey Kardashevskiy
We already changed NPU API for GPUs to not call OPAL and the remaining bit is initializing NPU structures. This searches for POWER9 NVLinks attached to any device on a PHB and initializes an NPU structure if any found. Signed-off-by: Alexey Kardashevskiy --- Changes: v5: * added WARN_ON_ONCE

[PATCH kernel v6 06/20] powerpc/pseries/iommu: Use memory@ nodes in max RAM address calculation

2018-12-19 Thread Alexey Kardashevskiy
DMA so the DMA window needs to cover those memory regions too; if the window cannot cover new memory regions, the memory onlining fails. This walks through the memory nodes to find the highest RAM address to let a huge DMA window cover that too in case this memory gets onlined later. Signed

[PATCH kernel v6 05/20] powerpc/powernv/npu: Move OPAL calls away from context manipulation

2018-12-19 Thread Alexey Kardashevskiy
keeps pnv_npu2_map_lpar() in powernv as pseries is not allowed to call that. This exports pnv_npu2_map_lpar_dev() as following patches will use it from the VFIO driver. While at it, replace redundant list_for_each_entry_safe() with a simpler list_for_each_entry(). Signed-off-by: Alexey Kardashe

[PATCH kernel v6 03/20] powerpc/vfio/iommu/kvm: Do not pin device memory

2018-12-19 Thread Alexey Kardashevskiy
e of these new regions which we must avoid unpinning of. This adds @mm to tce_page_is_contained() and iommu_tce_xchg() to test if the memory is device memory to avoid pfn_to_page(). This adds a check for device memory in mm_iommu_ua_mark_dirty_rm() which does delayed pages dirtying. Signed-off-

[PATCH kernel v6 01/20] powerpc/ioda/npu: Call skiboot's hot reset hook when disabling NPU2

2018-12-19 Thread Alexey Kardashevskiy
pci_disable_device() which switches NPU2 to a safe mode and prevents HMIs. Signed-off-by: Alexey Kardashevskiy Acked-by: Alistair Popple Reviewed-by: David Gibson --- Changes: v2: * updated the commit log --- arch/powerpc/platforms/powernv/pci-ioda.c | 10 ++ 1 file changed, 10

[PATCH kernel v6 00/20] powerpc/powernv/npu, vfio: NVIDIA V100 + P9 passthrough

2018-12-19 Thread Alexey Kardashevskiy
VFIO capabilities in 20/20. Please comment. Thanks. Alexey Kardashevskiy (20): powerpc/ioda/npu: Call skiboot's hot reset hook when disabling NPU2 powerpc/mm/iommu/vfio_spapr_tce: Change mm_iommu_get to reference a region powerpc/vfio/iommu/kvm: Do not pin device memory po

Re: [PATCH kernel v5 14/20] powerpc/powernv/npu: Add compound IOMMU groups

2018-12-18 Thread Alexey Kardashevskiy
On 19/12/2018 11:17, Michael Ellerman wrote: > Alexey Kardashevskiy writes: >> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c >> b/arch/powerpc/platforms/powernv/npu-dma.c >> index dc629ee..3468eaa 100644 >> --- a/arch/powerpc/platforms/powernv/npu-dma.c >

Re: [PATCH kernel v5 10/20] powerpc/iommu_api: Move IOMMU groups setup to a single place

2018-12-18 Thread Alexey Kardashevskiy
On 19/12/2018 10:35, Michael Ellerman wrote: > Alexey Kardashevskiy writes: > >> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c >> b/arch/powerpc/platforms/powernv/pci-ioda.c >> index b86a6e0..1168b185 100644 >> --- a/arch/powerpc/platforms/powernv/pc

Re: [RFC PATCH kernel] vfio/spapr_tce: Get rid of possible infinite loop

2018-12-18 Thread Alexey Kardashevskiy
On 08/10/2018 21:18, Michael Ellerman wrote: > Serhii Popovych writes: >> Alexey Kardashevskiy wrote: >>> As a part of cleanup, the SPAPR TCE IOMMU subdriver releases preregistered >>> memory. If there is a bug in memory release, the loop in >>> tce_io

Re: [PATCH kernel v5 20/20] vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver

2018-12-18 Thread Alexey Kardashevskiy
On 19/12/2018 09:37, Alex Williamson wrote: > On Thu, 13 Dec 2018 17:17:34 +1100 > Alexey Kardashevskiy wrote: > >> POWER9 Witherspoon machines come with 4 or 6 V100 GPUs which are not >> pluggable PCIe devices but still have PCIe links which are used >> for config

[PATCH kernel v5 20/20] vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver

2018-12-12 Thread Alexey Kardashevskiy
o this relies on the platform to tell whether these GPUs have special abilities such as NVLinks. Signed-off-by: Alexey Kardashevskiy --- Changes: v5: * do not memremap GPU RAM for emulation, map it only when it is needed * allocate 1 ATSD register per NVLink bridge, if none left, then expose the reg

[PATCH kernel v5 19/20] vfio_pci: Allow regions to add own capabilities

2018-12-12 Thread Alexey Kardashevskiy
VFIO regions already support region capabilities with a limited set of fields. However the subdriver might have to report to the userspace additional bits. This adds an add_capability() hook to vfio_pci_regops. Signed-off-by: Alexey Kardashevskiy Acked-by: Alex Williamson --- Changes: v3

[PATCH kernel v5 14/20] powerpc/powernv/npu: Add compound IOMMU groups

2018-12-12 Thread Alexey Kardashevskiy
ndlers. This moves IOMMU group registration for NVLink-connected GPUs to npu-dma.c. For POWER8, this stores a new compound group pointer in the PE (so a GPU is still a master); for POWER9 the new group pointer is stored in an NPU (which is allocated per a PCI host controller). Signed-o

[PATCH kernel v5 12/20] powerpc/powernv/npu: Move single TVE handling to NPU PE

2018-12-12 Thread Alexey Kardashevskiy
PCI PE and cannot to NPU PE and if that fails, we could only set 32bit table to NPU PE and this configuration is not really supported or wanted. Signed-off-by: Alexey Kardashevskiy --- arch/powerpc/platforms/powernv/npu-dma.c | 8 +++ arch/powerpc/platforms/powernv/pci-i
