On Tue, Nov 24, 2015 at 09:35:26PM +0800, Lan Tianyu wrote:
> This patch is to add SRIOV VF migration support.
> Create new device type "vfio-sriov" and add faked PCI migration capability
> to the type device.
>
> The purpose of the new capability
> 1) sync migration status with VF driver in the
On Tue, Nov 24, 2015 at 09:38:18PM +0800, Lan Tianyu wrote:
> This patch is to add migration support for ixgbevf driver. Using
> faked PCI migration capability table communicates with Qemu to
> share migration status and mailbox irq vector index.
>
> Qemu will notify VF via sending MSIX msg to
> -Original Message-
> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
> Sent: Tuesday, November 24, 2015 10:38 PM
> To: Radim Krcmár ; Wu, Feng
> Cc: kvm@vger.kernel.org; linux-ker...@vger.kernel.org
> Subject: Re: [PATCH] KVM: x86: Add
> -Original Message-
> From: Radim Krčmář [mailto:rkrc...@redhat.com]
> Sent: Tuesday, November 24, 2015 10:32 PM
> To: Wu, Feng
> Cc: pbonz...@redhat.com; kvm@vger.kernel.org; linux-
> ker...@vger.kernel.org
> Subject: Re: [PATCH] KVM: x86: Add lowest-priority
On 2015-11-24 22:20, Alexander Duyck wrote:
> I'm still not a fan of this approach. I really feel like this is
> something that should be resolved by extending the existing PCI hot-plug
> rather than trying to instrument this per driver. Then you will get the
> goodness for multiple drivers and
On Tue, Nov 24, 2015 at 7:18 PM, Lan Tianyu wrote:
> On 2015-11-24 22:20, Alexander Duyck wrote:
>> I'm still not a fan of this approach. I really feel like this is
>> something that should be resolved by extending the existing PCI hot-plug
>> rather than trying to
On Tue, Nov 24, 2015 at 1:20 PM, Michael S. Tsirkin wrote:
> On Tue, Nov 24, 2015 at 09:38:18PM +0800, Lan Tianyu wrote:
>> This patch is to add migration support for ixgbevf driver. Using
>> faked PCI migration capability table communicates with Qemu to
>> share migration status
On 2015-11-25 05:20, Michael S. Tsirkin wrote:
> I have to say, I was much more interested in the idea
> of tracking dirty memory. I have some thoughts about
> that one - did you give up on it then?
No, our final target is to keep VF active before doing
migration and tracking dirty memory is
Hi all:
This series tries to add basic busy polling for vhost net. The idea is
simple: at the end of tx/rx processing, busy poll for newly added tx
descriptors and the rx receive socket for a while. The maximum amount of
time (in us) that can be spent on busy polling is specified via ioctl.
Test A were
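The bounded busy-polling idea described above can be sketched in userspace C. This is only an illustration of "spin until work shows up or a time budget in microseconds runs out"; it is not the vhost code, and now_us(), busy_poll() and countdown_ready() are hypothetical names that do not come from the series.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock reading in microseconds. */
static uint64_t now_us(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000u + (uint64_t)ts.tv_nsec / 1000u;
}

/*
 * Spin until has_work(ctx) reports work or timeout_us has elapsed.
 * The check runs at least once, even with a zero budget.
 * Returns true if work was seen before the budget ran out.
 */
static bool busy_poll(bool (*has_work)(void *), void *ctx, uint64_t timeout_us)
{
    uint64_t deadline = now_us() + timeout_us;

    do {
        if (has_work(ctx))
            return true;
    } while (now_us() < deadline);

    return false;
}

/* Hypothetical work source for demonstration: ready once *ctx counts down to 0. */
static bool countdown_ready(void *ctx)
{
    int *remaining = ctx;

    return --(*remaining) == 0;
}
```

In this sketch the time budget is an ordinary function argument; in the series it is described as being configured through an ioctl.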
This patch introduces a helper which can give a hint for whether or not
there's work queued in the work list.
Signed-off-by: Jason Wang
---
drivers/vhost/vhost.c | 7 +++
drivers/vhost/vhost.h | 1 +
2 files changed, 8 insertions(+)
diff --git a/drivers/vhost/vhost.c
This patch tries to poll for newly added tx buffers or the socket receive
queue for a while at the end of tx/rx processing. The maximum time
spent on polling is specified through a new kind of vring ioctl.
Signed-off-by: Jason Wang
---
drivers/vhost/net.c| 72
Signed-off-by: Jason Wang
---
drivers/vhost/vhost.c | 26 +-
drivers/vhost/vhost.h | 1 +
2 files changed, 18 insertions(+), 9 deletions(-)
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 163b365..b86c5aa 100644
---
Marc Zyngier writes:
> Implement the core of the world switch in C. Not everything is there
> yet, and there is nothing to re-enter the world switch either.
>
> But this already outlines the code structure well enough.
>
> Signed-off-by: Marc Zyngier
We were probing the physical distributor state for the active state of a
HW virtual IRQ, because we had seen evidence that the LR state was not
cleared when the guest deactivated a virtual interrupt.
However, this issue turned out to be a software bug in the GIC, which
was solved by:
From: Marc Zyngier
When running a 32bit guest under a 64bit hypervisor, the ARMv8
architecture defines a mapping of the 32bit registers in the 64bit
space. This includes banked registers that are being demultiplexed
over the 64bit ones.
On exceptions caused by an operation
We were incorrectly removing the active state from the physical
distributor on the timer interrupt when the timer output level was
deasserted. We shouldn't be doing this without considering the virtual
interrupt's active state, because the architecture requires that when an
LR has the HW bit set
From: Marc Zyngier
Cortex-A57 parts up to r1p2 can misreport Stage 2 translation faults
when a Stage 1 permission fault or device alignment fault should
have been reported.
This patch implements the workaround (which is to validate that the
Stage-1 translation actually
From: Ard Biesheuvel
The open coded tests for checking whether a PTE maps a page as
uncached use a flawed '(pte_val(xxx) & CONST) != CONST' pattern,
which is not guaranteed to work since the type of a mapping is
not a set of mutually exclusive bits
For HYP mappings,
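The pitfall described above can be illustrated with a small C sketch. When a mapping type is a multi-bit encoded field rather than a set of independent flags, testing the bits of a single encoding can match other encodings too. The DEMO_* constants and both helpers are hypothetical and are not the kernel's definitions; comparing the whole masked field is shown as one common alternative, not necessarily what the patch does.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_TYPE_MASK     0x1cu /* hypothetical 3-bit type field, bits 4:2 */
#define DEMO_TYPE_UNCACHED 0x04u /* hypothetical encoding 0b001 */
#define DEMO_TYPE_OTHER    0x0cu /* hypothetical encoding 0b011 */

/* Flawed: treats the type encoding as if it were an independent flag. */
static bool is_uncached_flawed(uint64_t pte)
{
    return (pte & DEMO_TYPE_UNCACHED) == DEMO_TYPE_UNCACHED;
}

/* Compares the whole type field against the encoding. */
static bool is_uncached(uint64_t pte)
{
    return (pte & DEMO_TYPE_MASK) == DEMO_TYPE_UNCACHED;
}
```

With these made-up encodings, DEMO_TYPE_OTHER shares a set bit with DEMO_TYPE_UNCACHED, so the flag-style test reports it as uncached while the masked comparison does not.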
From: Mark Rutland
If we call __kvm_hyp_panic while a guest context is active, we call
__restore_sysregs before acquiring the system register values for the
panic, in the process throwing away the PAR_EL1 value at the point of
the panic.
This patch modifies __kvm_hyp_panic
Hi Paolo,
Here's a set of fixes for KVM/ARM for v4.4-rc3 based on v4.4-rc2, because the
errata fixes don't apply on v4.4-rc1. Let me know if you can pull this anyhow.
Thanks,
-Christoffer
The following changes since commit 1ec218373b8ebda821aec00bb156a9c94fad9cd4:
Linux 4.4-rc2 (2015-11-22
From: Mark Rutland
Currently __kvm_hyp_panic uses %p for values which are not pointers,
such as the ESR value. This can confusingly lead to "(null)" being
printed for the value.
Use %x instead, and only use %p for host pointers.
Signed-off-by: Mark Rutland
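As a rough userspace illustration of the point above, a register value such as the ESR can be formatted with a hex conversion instead of a pointer conversion. This is a sketch only, not the hyp panic code; format_esr() is a hypothetical helper and the zero-padded width is an arbitrary choice.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 32-bit register value as zero-padded hex.
 * buf should hold at least 11 bytes ("0x" + 8 digits + NUL).
 * Returns the number of characters written, excluding the NUL.
 */
static int format_esr(char *buf, size_t len, uint32_t esr)
{
    return snprintf(buf, len, "0x%08" PRIx32, esr);
}
```

A value of zero is then rendered as 0x00000000 rather than being treated as a null pointer.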
We were setting the physical active state on the GIC distributor in a
preemptible section, which could cause us to set the active state on a
different physical CPU from the one we were actually going to run on, and
havoc ensues.
Since we are no longer descheduling/scheduling soft timers in the
On Tue, 24 Nov 2015 17:29:14 +
Alex Bennée wrote:
>
> Marc Zyngier writes:
>
> > Implement the core of the world switch in C. Not everything is there
> > yet, and there is nothing to re-enter the world switch either.
> >
> > But this already
On 24/11/2015 18:35, Christoffer Dall wrote:
> Hi Paolo,
>
> Here's a set of fixes for KVM/ARM for v4.4-rc3 based on v4.4-rc2, because the
> errata fixes don't apply on v4.4-rc1. Let me know if you can pull this
> anyhow.
Sure, pulled.
Paolo
> Thanks,
> -Christoffer
>
> The following
On Tue, Nov 24, 2015 at 02:36:20PM -0200, Eduardo Habkost wrote:
> KVM_X86_SET_MCE does not call kvm_vcpu_ioctl_x86_setup_mce(). It
> calls kvm_vcpu_ioctl_x86_set_mce(), which stores the
> IA32_MCi_{STATUS,ADDR,MISC} register contents at
> vcpu->arch.mce_banks.
Ah, correct. I've mistakenly
On Mon, Nov 23, 2015 at 05:43:14PM +0100, Borislav Petkov wrote:
> On Mon, Nov 23, 2015 at 01:11:27PM -0200, Eduardo Habkost wrote:
> > On Mon, Nov 23, 2015 at 11:22:37AM -0200, Eduardo Habkost wrote:
> > [...]
> > > In the case of this code, it looks like it's already broken
> > > because the
On Tue, 24 Nov 2015 16:44:00 +0100
Christoffer Dall wrote:
> We were probing the physical distributor state for the active state of a
> HW virtual IRQ, because we had seen evidence that the LR state was not
> cleared when the guest deactivated a virtual interrupt.
>
On Mon, Nov 16, 2015 at 10:28:16AM +, Marc Zyngier wrote:
> Here's a couple of fixes for KVM/arm64:
>
> - The first one addresses a misinterpretation of the architecture
> spec, leading to the mishandling of I/O accesses generated from an
> AArch32 guest using banked registers.
>
> - The
Hi Pavel,
[auto build test ERROR on kvm/linux-next]
[also build test ERROR on v4.4-rc2 next-20151124]
url:
https://github.com/0day-ci/linux/commits/Pavel-Fedin/KVM-arm64-Implement-API-for-vGICv3-live-migration/20151124-171812
base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git linux-next
On Tue, 24 Nov 2015 16:43:59 +0100
Christoffer Dall wrote:
> We were incorrectly removing the active state from the physical
> distributor on the timer interrupt when the timer output level was
> deasserted. We shouldn't be doing this without considering the virtual
mmio_data_read() and mmio_data_write(), originally used in this function,
are limited only to 32 bits. We are going to refactor this code and
eventually let it do 64-bit I/O for vGICv3. Therefore, our first step is
to get rid of this limitation.
We open up these inlines, which consist of
Access size is always 64 bits. Since CPU interface state actually affects
only a single vCPU, no vGIC locking is done in order to avoid code
duplication. We just make sure that the vCPU is not running.
Signed-off-by: Pavel Fedin
---
arch/arm64/include/uapi/asm/kvm.h | 14 ++-
In order to implement vGICv3 CPU interface access, we will need to
perform table lookup of system registers. We would need both
index_to_params() and find_reg() exported for that purpose, but instead
we export a single function which combines them both.
Signed-off-by: Pavel Fedin
The access is done similarly to vGICv2, using
KVM_DEV_ARM_VGIC_GRP_DIST_REGS and KVM_DEV_ARM_VGIC_GRP_REDIST_REGS
with the KVM_SET_DEVICE_ATTR and KVM_GET_DEVICE_ATTR ioctls.
Access size for vGICv3 is 64 bits; vgic_attr_regs_access() is fixed to
support this. The trick with vgic_v3_get_reg_size() is
This patchset adds necessary userspace API in order to support vGICv3 live
migration. GICv3 registers are accessed using device attribute ioctls,
similar to GICv2.
v5 => v6:
- Rebased on top of linux-next of 23.11.2015
- Use original API documentation patch, with minor changes only.
- Quit
From: Christoffer Dall
Factor out the GICv3-specific documentation into a separate
documentation file. Add description for how to access distributor,
redistributor, and CPU interface registers for GICv3 in this new file.
Acked-by: Peter Maydell
Replace Rt with a data pointer in struct sys_reg_params. This will allow
reusing the system register handling code in the implementation of the vGICv3
CPU interface access API. Additionally, get rid of the "massive hack"
in kvm_handle_cp_64().
Signed-off-by: Pavel Fedin
---
Separate all implementation-independent code in vgic_attr_regs_access()
and move it to vgic.c. This will allow reusing this code for the vGICv3
implementation.
vcpu lookup is left where it originally was, because vGICv3 API will
expect affinity ID instead of vCPU index, therefore it will be done
On 11/24/2015 05:38 AM, Lan Tianyu wrote:
This patchset proposes a solution for adding live migration
support for SRIOV NICs.
During migration, Qemu needs to let the VF driver in the VM know
when migration starts and ends. Qemu adds a faked PCI migration capability
to help sync status between two
2015-11-24 01:26+, Wu, Feng:
> "I don't think we do any vector hashing on our client parts. This may be why
> the customer is not able to detect this on Skylake client silicon.
> The vector hashing is micro-architectural and something we had done on server
> parts.
>
> If you look at the
On 24/11/2015 15:35, Radim Krcmár wrote:
> > Thanks for your guys' review. Yes, we can introduce a module option
> > for it. According to Radim's comments above, we need use the
> > same policy for PI and non-PI lowest-priority interrupts, so here is the
> > question: for vector hashing, it is
2015-11-24 15:31+0100, Radim Krčmář:
> 000 means that bits 7:4 of vector are selected, thus the vector hash is
> 0b1110 = 14, so the round-robin effectively does 14 % 4 (because we only
> have 4 destinations) and delivers to the 3rd possible APIC (= ID 6)?
Ah, 3rd APIC in the set has ID 4, of
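The arithmetic discussed above can be sketched in C. It illustrates only the hash-then-modulo step as described in the thread and is not the KVM implementation; vector_hash() and pick_dest() are hypothetical names, and the destination APIC IDs are supplied by the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Select bits 7:4 of the interrupt vector as the hash (the "000" case). */
static unsigned int vector_hash(uint8_t vector)
{
    return (vector >> 4) & 0xfu;
}

/*
 * Pick a destination by reducing the hash modulo the number of
 * candidate APICs. n must be non-zero.
 */
static uint32_t pick_dest(uint8_t vector, const uint32_t *apic_ids, size_t n)
{
    return apic_ids[vector_hash(vector) % n];
}
```

For a vector whose bits 7:4 are 0b1110 the hash is 14, and with four candidates 14 % 4 selects index 2, the third entry in the set.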
2015-11-24 01:26+, Wu, Feng:
>> From: Paolo Bonzini [mailto:pbonz...@redhat.com]
>> On 16/11/2015 20:03, Radim Krčmář wrote:
>> > 2015-11-09 10:46+0800, Feng Wu:
>> >> Use vector-hashing to handle lowest-priority interrupts for
>> >> posted-interrupts. As an example, modern Intel CPUs use this
On 11/24/2015 05:44 AM, Paolo Bonzini wrote:
On 23/11/2015 18:11, Estrada, Zachary J wrote:
I'm playing around with EPTs and kvm to track execution in the guest.
I've created a separate set of EPTs (and copied the last level entries
from the real tables, minus execute permissions) but I'm not
On 24/11/2015 15:51, Estrada, Zachary J wrote:
> 2) Got it. Let's say I want to work with a copy of the extended page
> tables instead of the original, what would be the best way to do so?
Why would you want that? It's difficult to give an answer without
understanding what you're doing.
Juan Quintela wrote:
> Hi
>
> Please, send any topic that you are interested in covering.
>
> At the end of Monday I will send an email with the agenda or the
> cancellation of the call, so hurry up.
>
> After discussions on the QEMU Summit, we are going to have always open a
On 23/11/2015 18:11, Estrada, Zachary J wrote:
> I'm playing around with EPTs and kvm to track execution in the guest.
> I've created a separate set of EPTs (and copied the last level entries
> from the real tables, minus execute permissions) but I'm not getting
> exits where I expect. I also
Hi
Please, send any topic that you are interested in covering.
At the end of Monday I will send an email with the agenda or the
cancellation of the call, so hurry up.
After discussions at the QEMU Summit, we are always going to have an open
KVM call where you can add topics.
Call details:
By
This patch adds migration support to the ixgbevf driver. It uses a
faked PCI migration capability table to communicate with Qemu and
share the migration status and mailbox irq vector index.
Qemu will notify the VF by sending an MSIX msg to trigger the mailbox
vector during migration and store migration status in
This patch adds a new ioctl cmd, VFIO_GET_PCI_CAP_INFO, to get the
PCI cap table size and to find free PCI config space regs according
to pos and size.
Qemu will add a faked PCI capability for migration and needs this
info.
Signed-off-by: Lan Tianyu
---
drivers/vfio/pci/vfio_pci.c
This patch extends the PCI CAP IDs with a migration cap and
adds reg macros. The CAP ID is a trial value and we may find a better one if the
solution is feasible.
*PCI_VF_MIGRATION_CAP
Lets the VF driver control whether the mailbox irq is triggered during migration.
*PCI_VF_MIGRATION_VMM_STATUS
Qemu stores
This patch adds SRIOV VF migration support.
It creates a new device type "vfio-sriov" and adds a faked PCI migration
capability to devices of that type.
The purpose of the new capability:
1) Sync migration status with the VF driver in the VM.
2) Get the mailbox irq vector to notify the VF driver during migration.
3)
Signed-off-by: Lan Tianyu
---
hw/vfio/pci.c | 6 --
1 file changed, 6 deletions(-)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index e7583b5..404a5cd 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3625,11 +3625,6 @@ static Property vfio_pci_dev_properties[] = {
Signed-off-by: Lan Tianyu
---
hw/vfio/pci.c | 137 +-
hw/vfio/pci.h | 158 ++
2 files changed, 159 insertions(+), 136 deletions(-)
create mode 100644 hw/vfio/pci.h
diff
This patchset proposes a solution for adding live migration
support for SRIOV NICs.
During migration, Qemu needs to let the VF driver in the VM know
when migration starts and ends. Qemu adds a faked PCI migration capability
to help sync status between the two sides during migration.
Qemu triggers VF's
Use new ioctl cmd VFIO_GET_PCI_CAP_INFO to get PCI cap table size.
This helps to get an accurate table size and facilitates finding free
PCI config space regs for the faked PCI capability. Current code assigns
PCI config space regs from the start of last PCI capability table to
pos 0xff to the last
Signed-off-by: Lan Tianyu
---
linux-headers/linux/vfio.h | 16
1 file changed, 16 insertions(+)
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 0508d0b..732b0bd 100644
--- a/linux-headers/linux/vfio.h
+++
These new functions allow direct mapping and unmapping of addresses on the
given IOMMU. They will be used for mapping MSI hardware.
Signed-off-by: Pavel Fedin
---
drivers/vfio/vfio_iommu_type1.c | 29 +
include/linux/vfio.h| 4 +++-
After migration, Qemu needs to trigger the mailbox irq to notify the VF driver
in the guest about the status change. Irq delivery resumes only after
CPU state has been restored. This patch adds a new callback that runs after
restoring CPU state and provides a way to trigger the mailbox irq later.
Signed-off-by:
On some architectures (e.g. ARM64) if the device is behind an IOMMU, and
is being mapped by VFIO, it is necessary to also add mappings for MSI
translation register for interrupts to work. This series implements the
necessary API to do this, and makes use of this API for GICv3 ITS on
ARM64.
v1 =>
This patchset proposes a solution for adding live migration
support for SRIOV NICs.
During migration, Qemu needs to let the VF driver in the VM know
when migration starts and ends. Qemu adds a faked PCI migration capability
to help sync status between the two sides during migration.
Qemu triggers VF's
This patch adds a callback which is called just before stopping the VCPU.
It lets VF migration trigger the mailbox irq as late as possible to
decrease service downtime.
Signed-off-by: Lan Tianyu
---
include/migration/vmstate.h | 3 +++
include/sysemu/sysemu.h | 1 +
Hi Pavel,
[auto build test ERROR on tip/irq/core]
[also build test ERROR on v4.4-rc2 next-20151124]
url:
https://github.com/0day-ci/linux/commits/Pavel-Fedin/Introduce-MSI-hardware-mapping-for-VFIO/20151124-155050
config: powerpc-allmodconfig (attached as .config)
reproduce:
wget
These new functions use the supplied IOMMU in order to map and unmap MSI
translation register(s).
Signed-off-by: Pavel Fedin
---
drivers/irqchip/irq-gic-v3-its.c | 31 +++
include/linux/irqchip/arm-gic-v3.h | 2 ++
include/linux/msi.h
Signed-off-by: Lan Tianyu
---
hw/vfio/pci.c | 6 +++---
hw/vfio/pci.h | 4
2 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index d0354a0..7c43fc1 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -613,7 +613,7 @@ static void
This patch adds an ioctl wrapper to find free PCI config space regs.
Signed-off-by: Lan Tianyu
---
hw/vfio/pci.c | 19 +++
hw/vfio/pci.h | 2 ++
2 files changed, 21 insertions(+)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 29845e3..d0354a0 100644
---
This patch extends the PCI CAP IDs with a migration cap and
adds reg macros. The CAP ID is a trial value and we may find a better one if the
solution is feasible.
*PCI_VF_MIGRATION_CAP
Lets the VF driver control whether the mailbox irq is triggered during migration.
*PCI_VF_MIGRATION_VMM_STATUS
Qemu stores
These operations are used in order to map and unmap MSI translation
registers for the device, allowing it to send MSIs to the host while being
mapped via IOMMU.
Usage of MSI controllers is tracked on a per-device basis using reference
counting. An MSI controller remains mapped as long as there's
This little series addresses some problems we've been observing with the
arch timer. First, we were fiddling with a PPI timer interrupt in a
preemptible section, which is bad for obvious reasons. Second, we
were clearing the physical active state when we shouldn't. Third, we
can
We were probing the physical distributor state for the active state of a
HW virtual IRQ, because we had seen evidence that the LR state was not
cleared when the guest deactivated a virtual interrupt.
However, this issue turned out to be a software bug in the GIC, which
was solved by:
We were setting the physical active state on the GIC distributor in a
preemptible section, which could cause us to set the active state on a
different physical CPU from the one we were actually going to run on, and
havoc ensues.
Since we are no longer descheduling/scheduling soft timers in the
We were incorrectly removing the active state from the physical
distributor on the timer interrupt when the timer output level was
deasserted. We shouldn't be doing this without considering the virtual
interrupt's active state, because the architecture requires that when an
LR has the HW bit set
On Tue, 24 Nov 2015 16:43:58 +0100
Christoffer Dall wrote:
> We were setting the physical active state on the GIC distributor in a
> preemptible section, which could cause us to set the active state on
> different physical CPU from the one we were actually going to
On 11/24/2015 09:13 AM, Paolo Bonzini wrote:
On 24/11/2015 15:51, Estrada, Zachary J wrote:
2) Got it. Let's say I want to work with a copy of the extended page
tables instead of the original, what would be the best way to do so?
Why would you want that? It's difficult to give an answer
On 24/11/2015 16:52, Estrada, Zachary J wrote:
>> I'm not sure if this is your problem, but perhaps you want to record in
>> the role whether the page comes from your version or the original? The
>> role is like the hash key, if the role is the same you get the same PTE.
>
> This is extremely
76 matches