On Mon, Mar 09, 2015 at 08:27:43PM +0100, Jan Kiszka wrote:
For a very long time (since 2b3d2a20), the path handling a vmmcall
instruction of the guest on an Intel host only applied the patch but no
longer handled the hypercall. The reverse case, vmcall on AMD hosts, is
fine. As both em_vmcall
On Mon, Mar 09, 2015 at 09:00:11PM +0100, Jan Kiszka wrote:
KVM tends to patch an emulated vmmcall on Intel. But that must not
happen for L2.
Signed-off-by: Jan Kiszka jan.kis...@siemens.com
Applied, thanks.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a
On Mon, Mar 09, 2015 at 08:56:43PM +0100, Jan Kiszka wrote:
While in L2, leave all #UD to L2 and do not try to emulate it. If L1 is
interested in doing this, it reports its interest via the exception
bitmap, and we never get into handle_exception of L0 anyway.
Signed-off-by: Jan Kiszka
On Wed, 2015-03-11 at 17:34 +1100, Gavin Shan wrote:
The patch defines PCI error types and functions in eeh.h and
exports function eeh_pe_inject_err(), which will be called by
VFIO driver to inject the specified PCI error to the indicated
PE for testing purposes.
Signed-off-by: Gavin Shan
On Wed, 2015-03-11 at 17:34 +1100, Gavin Shan wrote:
The patch adds one more EEH sub-command (VFIO_EEH_PE_INJECT_ERR)
to inject the specified EEH error, which is represented by
(struct vfio_eeh_pe_err), to the indicated PE for testing purposes.
Signed-off-by: Gavin Shan
On Thu, 2015-03-12 at 15:21 +1100, David Gibson wrote:
On Thu, Mar 12, 2015 at 02:16:42PM +1100, Gavin Shan wrote:
On Thu, Mar 12, 2015 at 11:57:21AM +1100, David Gibson wrote:
On Wed, Mar 11, 2015 at 05:34:11PM +1100, Gavin Shan wrote:
The patch adds one more EEH sub-command
This moves locked pages accounting to helpers.
Later they will be reused for Dynamic DMA windows (DDW).
This reworks debug messages to show the current value and the limit.
This stores the number of locked pages in the container so the iommu
table pointer won't be needed when unlocking. This does
This replaces iommu_take_ownership()/iommu_release_ownership() calls
with the callback calls and it is up to the platform code to call
iommu_take_ownership()/iommu_release_ownership() if needed.
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
arch/powerpc/include/asm/iommu.h| 4 +--
The pnv_pci_ioda_tce_invalidate() helper invalidates TCE cache. It is
supposed to be called on IODA1/2 and not called on p5ioc2. It receives
start and end host addresses of the TCE table. This approach makes it possible
to get pnv_pci_ioda_tce_invalidate() unintentionally called on p5ioc2.
Another
The iommu_free_table helper releases the memory it is using (the TCE table and
@it_map) and releases the iommu_table struct as well. We might not want
the very last step as we store iommu_table in parent structures.
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
At the moment writing new TCE value to the IOMMU table fails with EBUSY
if there is a valid entry already. However, the PAPR specification allows
the guest to write new TCE value without clearing it first.
Another problem this patch is addressing is the use of pool locks for
external IOMMU users such
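A standalone sketch of the xchg()-based update described above, using a toy table and C11 atomics in place of the kernel's primitives (the names and table are illustrative, not the kernel's):

```c
#include <stdatomic.h>

/* Toy TCE table; the real one lives in the iommu_table struct. */
static _Atomic unsigned long tce_table[8];

/* Replace a TCE entry unconditionally and return the previous value,
 * instead of failing with -EBUSY when the slot already holds a valid
 * entry. The caller uses the returned old value to release whatever
 * page it referenced, if any. */
static unsigned long tce_xchg(unsigned long idx, unsigned long newtce)
{
    return atomic_exchange(&tce_table[idx], newtce);
}
```

Because the swap is atomic, no pool lock is needed around the update itself, which is what makes the table usable by external IOMMU users without taking the kernel-internal locks.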
Normally a bitmap from the iommu_table is used to track which TCE entries
are in use. Since we are going to use iommu_table without its locks and
do xchg() instead, it becomes essential not to set bits which are not
implied by the direction flag.
This adds iommu_direction_to_tce_perm() (its
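A minimal model of the direction-to-permission mapping this introduces; the bit values are made up for illustration (the real ones are the powerpc TCE_PCI_READ/TCE_PCI_WRITE definitions):

```c
/* Hypothetical TCE permission bits; real values live in the powerpc
 * IOMMU headers. */
#define TCE_READ  0x1
#define TCE_WRITE 0x2

/* Mirrors enum dma_data_direction from the kernel headers. */
enum dma_data_direction {
    DMA_BIDIRECTIONAL = 0,
    DMA_TO_DEVICE = 1,
    DMA_FROM_DEVICE = 2,
    DMA_NONE = 3,
};

/* Sketch of iommu_direction_to_tce_perm(): only the bits implied by
 * the direction flag are set, nothing more. */
static unsigned long direction_to_tce_perm(enum dma_data_direction dir)
{
    switch (dir) {
    case DMA_BIDIRECTIONAL:
        return TCE_READ | TCE_WRITE;
    case DMA_TO_DEVICE:
        return TCE_READ;    /* device reads from memory */
    case DMA_FROM_DEVICE:
        return TCE_WRITE;   /* device writes to memory */
    default:
        return 0;
    }
}
```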
This extends iommu_table_group_ops by a set of callbacks to support
dynamic DMA windows management.
query() returns IOMMU capabilities such as default DMA window address and
supported number of DMA windows and TCE table levels.
create_table() creates a TCE table with specific parameters.
it
This is a part of moving TCE table allocation into an iommu_ops
callback to support multiple IOMMU groups per one VFIO container.
This enforces the window size to be a power of two.
This is a pretty mechanical patch.
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
At the moment the iommu_table struct has a set_bypass() which enables/
disables DMA bypass on IODA2 PHB. This is exposed to POWERPC IOMMU code
which calls this callback when external IOMMU users such as VFIO are
about to take over a PHB.
The set_bypass() callback is not really an iommu_table
This patch fixes the following sparse warning:
for file arch/x86/kvm/x86.c:
warning: Using plain integer as NULL pointer
Signed-off-by: Xiubo Li lixi...@cmss.chinamobile.com
---
arch/x86/kvm/x86.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/x86.c
Using a command like 'make C=1', the sparse tool will complain
about warnings like:
warning: symbol 'XXX' was not declared. Should it be static?
warning: Using plain integer as NULL pointer
...
And also, if the symbols are only used locally, shouldn't they be static?
Xiubo Li (3):
This patch fixes the following sparse warnings:
for file virt/kvm/kvm_main.c:
warning: symbol 'halt_poll_ns' was not declared. Should it be static?
Signed-off-by: Xiubo Li lixi...@cmss.chinamobile.com
---
virt/kvm/kvm_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git
This patch fixes the following sparse warnings:
for arch/x86/kvm/x86.c:
warning: symbol 'emulator_read_write' was not declared. Should it be static?
warning: symbol 'emulator_write_emulated' was not declared. Should it be static?
warning: symbol 'emulator_get_dr' was not declared. Should it be
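The fixes for both classes of sparse warning are mechanical; a minimal sketch (the names and values here are made up for illustration, not the actual kvm code):

```c
#include <stddef.h>

/* "symbol 'halt_poll_ns' was not declared. Should it be static?"
 * A variable used only in this file gets internal linkage: */
static unsigned int halt_poll_ns = 500000;   /* made-up value */

/* "Using plain integer as NULL pointer"
 * Initialize pointers with NULL, not a bare 0: */
static const char *name = NULL;              /* not: = 0 */

static unsigned int get_halt_poll_ns(void)
{
    return halt_poll_ns;
}
```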
Peter Maydell peter.mayd...@linaro.org writes:
On 12 March 2015 at 15:51, Peter Maydell peter.mayd...@linaro.org wrote:
On 4 March 2015 at 14:35, Alex Bennée alex.ben...@linaro.org wrote:
While observing KVM traces I can see additional IRQ calls on pretty much
every MMIO access which is just
This is a pretty mechanical patch to make next patches simpler.
The new tce_iommu_unuse_page() helper does put_page() now but it might skip
that after the memory registering patch is applied.
As we are here, this removes unnecessary checks for a value returned
by pfn_to_page() as it cannot possibly
This moves iommu_table creation to the beginning. This is a mechanical
patch.
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
arch/powerpc/platforms/powernv/pci-ioda.c | 30 --
1 file changed, 16 insertions(+), 14 deletions(-)
diff --git
The existing implementation accounts the whole DMA window in
the locked_vm counter which is going to be even worse with multiple
containers and huge DMA windows.
This introduces 2 ioctls to register/unregister DMA memory which
receive user space address and size of a memory region which
needs to
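A toy model of the up-front accounting this moves to registration time; the limit and error value are made up (the kernel checks against the RLIMIT_MEMLOCK-derived limit and returns -ENOMEM):

```c
/* Account all pages of a registered region once, at registration
 * time, instead of on every DMA map/unmap request. */
static unsigned long locked_vm;               /* pages currently accounted */
static const unsigned long lock_limit = 1024; /* made-up limit, in pages */

static int account_locked(unsigned long npages)
{
    if (locked_vm + npages > lock_limit)
        return -1;   /* the kernel would return -ENOMEM here */
    locked_vm += npages;
    return 0;
}

static void unaccount_locked(unsigned long npages)
{
    locked_vm = npages > locked_vm ? 0 : locked_vm - npages;
}
```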
This adds missing locks in iommu_take_ownership()/
iommu_release_ownership().
This marks all pages busy in iommu_table::it_map in order to catch
errors if there is an attempt to use this table while ownership over it
is taken.
This only clears TCE content if there is no page marked busy in
On Fri, Mar 13, 2015 at 10:24:21AM +, Andre Przywara wrote:
Hej Christoffer,
On 02/03/15 17:29, Christoffer Dall wrote:
On Fri, Feb 27, 2015 at 07:41:45PM +0800, weiyj...@163.com wrote:
From: Wei Yongjun yongjun_...@trendmicro.com.cn
Add the missing unlock before return from
Peter Maydell peter.mayd...@linaro.org writes:
On 4 March 2015 at 14:35, Alex Bennée alex.ben...@linaro.org wrote:
This adds the saving and restore of the current Multi-Processing state
of the machine. While the KVM_GET/SET_MP_STATE API exposes a number of
potential states for x86 we only
This moves page pinning (get_user_pages_fast()/put_page()) code out of
the platform IOMMU code and puts it into the VFIO IOMMU driver where it
belongs, as the platform code does not deal with page pinning.
This makes iommu_take_ownership()/iommu_release_ownership() deal with
the IOMMU table bitmap
This makes use of the it_page_size from the iommu_table struct
as page size can differ.
This replaces missing IOMMU_PAGE_SHIFT macro in commented debug code
as recently introduced IOMMU_PAGE_XXX macros do not include
IOMMU_PAGE_SHIFT.
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
This clears the TCE table when a container is being closed, as it is
good to leave the table clean before passing ownership
back to the host kernel.
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
drivers/vfio/vfio_iommu_spapr_tce.c | 14 +++---
1 file changed, 11
This is to make extended ownership and multiple groups support patches
simpler for review.
This is a mechanical patch.
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
drivers/vfio/vfio_iommu_spapr_tce.c | 38 ++---
1 file changed, 23 insertions(+), 15
Modern IBM POWERPC systems support multiple (currently two) TCE tables
per IOMMU group (a.k.a. PE). This adds an iommu_table_group container
for TCE tables. Right now just one table is supported.
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
arch/powerpc/include/asm/iommu.h|
Before this, the IOMMU user (VFIO) would take control over the IOMMU table
belonging to a specific IOMMU group. This approach did not allow sharing
tables between IOMMU groups attached to the same container.
This introduces a new IOMMU ownership flavour when the user can not
just control the existing
TCE tables might get too big in case of 4K IOMMU pages and DDW enabled
on huge guests (hundreds of GB of RAM) so the kernel might be unable to
allocate a contiguous chunk of physical memory to store the TCE table.
To address this, POWER8 CPU (actually, IODA2) supports multi-level TCE tables,
up to
The existing IOMMU requires VFIO_IOMMU_ENABLE call to enable actual use
of the container (i.e. call DMA map/unmap) and this is where we check
the rlimit for locked pages. It assumes that only as much memory
as a default DMA window can be mapped. Every DMA map/unmap request will
do
This is a part of moving DMA window programming to an iommu_ops
callback.
This is a mechanical patch.
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
arch/powerpc/platforms/powernv/pci-ioda.c | 85 ---
1 file changed, 56 insertions(+), 29 deletions(-)
diff
This changes a few functions to receive an iommu_table_group pointer
rather than PE as they are going to be a part of upcoming
iommu_table_group_ops callback set.
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
arch/powerpc/platforms/powernv/pci-ioda.c | 13 -
1 file changed, 8
This replaces multiple calls of kzalloc_node() with a new
iommu_table_alloc() helper. Right now it calls kzalloc_node() but
later it will be modified to allocate an iommu_table_group struct with
a single iommu_table in it.
Later the helper will allocate an iommu_table_group struct which embeds
the
At the moment DMA map/unmap requests are handled irrespective of
the container's state. This allows the user space to pin memory which
it might not be allowed to pin.
This adds checks to MAP/UNMAP that the container is enabled, otherwise
-EPERM is returned.
Signed-off-by: Alexey Kardashevskiy
At the moment only one group per container is supported.
POWER8 CPUs have a more flexible design and allow having 2 TCE tables per
IOMMU group so we can relax this limitation and support multiple groups
per container.
This adds TCE table descriptors to a container and uses iommu_table_group_ops
to
This checks that the TCE table page size is not bigger than the size of
the page we just pinned and whose physical address we are going to put into the table.
Otherwise the hardware gets unwanted access to physical memory between
the end of the actual page and the end of the aligned up TCE page.
Since
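The check itself reduces to a page-shift comparison; a standalone model (the function name is illustrative; the kernel helper also handles compound pages):

```c
#include <stdbool.h>

/* A TCE (IOMMU) page must not be bigger than the host page backing
 * it, or the device could reach memory between the end of the pinned
 * page and the end of the aligned-up TCE page. */
static bool tce_page_is_contained(unsigned int host_page_shift,
                                  unsigned int tce_page_shift)
{
    return host_page_shift >= tce_page_shift;
}
```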
This enables sPAPR defined feature called Dynamic DMA windows (DDW).
Each Partitionable Endpoint (IOMMU group) has an address range on a PCI bus
where devices are allowed to do DMA. These ranges are called DMA windows.
By default, there is a single DMA window, 1 or 2GB big, mapped at zero
on a
Hej Christoffer,
On 02/03/15 17:29, Christoffer Dall wrote:
On Fri, Feb 27, 2015 at 07:41:45PM +0800, weiyj...@163.com wrote:
From: Wei Yongjun yongjun_...@trendmicro.com.cn
Add the missing unlock before return from function kvm_vgic_create()
in the error handling case.
Signed-off-by: Wei
This adds an iommu_table_ops struct and puts a pointer to it into
the iommu_table struct. This moves tce_build/tce_free/tce_get/tce_flush
callbacks from ppc_md to the new struct where they really belong.
This adds the requirement for @it_ops to be initialized before calling
iommu_init_table() to
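The shape of the change can be sketched as follows; the signatures are simplified, not the exact kernel prototypes (stub_get exists only to make the sketch self-contained):

```c
struct iommu_table;

/* Per-table callbacks, formerly ppc_md.tce_build and friends. */
struct iommu_table_ops {
    int  (*set)(struct iommu_table *tbl, long index, long npages,
                unsigned long uaddr, int direction);             /* tce_build */
    void (*clear)(struct iommu_table *tbl, long index, long npages); /* tce_free */
    unsigned long (*get)(struct iommu_table *tbl, long index);       /* tce_get */
    void (*flush)(struct iommu_table *tbl);                          /* tce_flush */
};

struct iommu_table {
    struct iommu_table_ops *it_ops; /* must be set before iommu_init_table() */
    /* ... other fields elided ... */
};

/* Trivial stub implementation, for illustration only. */
static unsigned long stub_get(struct iommu_table *tbl, long index)
{
    (void)tbl;
    return (unsigned long)index;
}
```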
This adds create/remove window ioctls to create and remove DMA windows.
sPAPR defines a Dynamic DMA windows capability which allows
para-virtualized guests to create additional DMA windows on a PCI bus.
Existing Linux kernels use this new window to map the entire guest
memory and switch to the
On 03/13/2015 08:39 AM, Radim Krčmář wrote:
...
The warning message is very clever:
- it contains the magical may qualifier and being protected only by
RH=1 creates weird-looking code structure, but it is technically right
1) lowest-priority delivery may be set in msi.data, which avoids
2015-03-12 21:08-0600, James Sullivan:
---
Changes since v2:
* Added one time warning message when RH=1
* Documented conflict between RH=1 and delivery mode
* Tidied code to check RH=1/DM=1 (remove bool phys, use if/else)
Changes since v3:
* Fixed logical error in RH=1/DM=1
On Wed, Mar 04, 2015 at 02:31:56PM +0800, Wincy Van wrote:
In commit 3af18d9c5fe9 (KVM: nVMX: Prepare for using hardware MSR bitmap),
we are setting MSR_BITMAP in prepare_vmcs02 if we should use hardware. This
is not enough since the field will be modified by following vmx_set_efer.
Fix this
On Fri, Mar 06, 2015 at 02:44:35PM -0600, Joel Schopp wrote:
From: David Kaplan david.kap...@amd.com
Another patch in my war on emulate_on_interception() use as a svm exit
handler.
These were pulled out of a larger patch at the suggestion of Radim Krcmar, see
2015-03-13 08:47-0600, James Sullivan:
On 03/13/2015 08:39 AM, Radim Krčmář wrote:
...
The warning message is very clever:
- it contains the magical may qualifier and being protected only by
RH=1 creates weird-looking code structure, but it is technically right
1) lowest-priority
Perfect, thanks for the feedback. I'll get v5 out shortly.
On 03/13/2015 09:08 AM, Radim Krčmář wrote:
2015-03-13 08:47-0600, James Sullivan:
On 03/13/2015 08:39 AM, Radim Krčmář wrote:
...
The warning message is very clever:
- it contains the magical may qualifier and being protected only
This patch adds a check for RH=1 in kvm_set_msi_irq. Currently the
DM bit is the only thing used to decide irq-dest_mode (logical when DM
set, physical when unset). Documentation indicates that the DM bit will
be 'ignored' when the RH bit is unset, and physical destination mode is
used in this
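The decision described above can be modeled in isolation; the bit positions follow the MSI address register layout (RH = bit 3, DM = bit 2), and the RH=1/DM=0 case is treated as physical here as an assumption:

```c
#include <stdbool.h>

#define MSI_ADDR_RH (1u << 3)   /* redirection hint */
#define MSI_ADDR_DM (1u << 2)   /* destination mode */

/* DM is ignored when RH is unset, so logical destination mode applies
 * only when both bits are set. RH=1/DM=0 is a documented conflict;
 * falling back to physical mode here is an assumption of this sketch. */
static bool msi_is_logical_dest(unsigned int msi_addr)
{
    if (!(msi_addr & MSI_ADDR_RH))
        return false;               /* RH=0: physical, regardless of DM */
    return msi_addr & MSI_ADDR_DM;  /* RH=1: honour DM */
}
```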
On 13.03.2015 at 10:39, Xiubo Li wrote:
This patch fixes the following sparse warnings:
for file virt/kvm/kvm_main.c:
warning: symbol 'halt_poll_ns' was not declared. Should it be static?
Signed-off-by: Xiubo Li lixi...@cmss.chinamobile.com
---
virt/kvm/kvm_main.c | 2 +-
1 file
So, it could be the i40e driver then? Because IIUC, VFs use a separate
driver. Just to rule out the possibility that there might be some driver
fixes that
could help with this, it might be a good idea to try a 3.19 or later upstream
kernel.
I tried with the latest DPDK release too
2015-03-13 09:14-0600, James Sullivan:
---
Changes since v2:
* Added one time warning message when RH=1
* Documented conflict between RH=1 and delivery mode
* Tidied code to check RH=1/DM=1 (remove bool phys, use if/else)
Changes since v3:
* Fixed logical error in RH=1/DM=1
Currently we use a lot of VGIC specific code to do the MMIO
dispatching.
Use the previous reworks to add kvm_io_bus style MMIO handlers.
Those are not yet called by the MMIO abort handler; also the actual
VGIC emulator functions do not make use of them yet, but will be enabled
with the following
virt/kvm was never really a good include directory for anything else
than locally included headers.
With the move of iodev.h there is no longer a need to add this
directory to the compiler's include path, so remove it from the x86 kvm
Makefile.
Signed-off-by: Andre Przywara andre.przyw...@arm.com
---
The vgic_find_range() function in vgic.c takes a struct kvm_exit_mmio
argument, but actually only uses the length field in there. Since we
need to get rid of that structure in that part of the code anyway,
let's rework the function (and its callers) to pass the length
argument to the function
iodev.h contains definitions for the kvm_io_bus framework. This is
needed both by the generic KVM code in virt/kvm as well as by
architecture specific code under arch/. Putting the header file in
virt/kvm and using local includes in the architecture part seems at
least dodgy to me, so let's move
In kvm_destroy_vm() we call kvm_io_bus_destroy() pretty early,
especially before calling kvm_arch_destroy_vm(). To avoid
unregistering devices from the already destroyed bus, let's mark
the bus with NULL to let other users know it has been destroyed
already.
This avoids a crash on a VM shutdown
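A userspace model of the ordering fix described above, with plain pointers standing in for the kvm_io_bus structures (all names are illustrative):

```c
#include <stdlib.h>
#include <stddef.h>

#define NR_BUSES 4
static void *buses[NR_BUSES];   /* stand-ins for struct kvm_io_bus * */

/* Destroy every bus early and mark each slot NULL so later teardown
 * code can tell the bus is already gone. */
static void io_bus_destroy_all(void)
{
    for (int i = 0; i < NR_BUSES; i++) {
        free(buses[i]);
        buses[i] = NULL;
    }
}

/* A later unregister call checks for the NULL marker instead of
 * touching freed memory. */
static int io_bus_unregister_dev(int idx)
{
    if (buses[idx] == NULL)
        return -1;   /* bus already destroyed: nothing to do */
    /* ... would remove the device from the live bus here ... */
    return 0;
}
```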
From: Nikolay Nikolaev n.nikol...@virtualopensystems.com
This is needed in e.g. ARM vGIC emulation, where the MMIO handling
depends on the VCPU that does the access.
Signed-off-by: Nikolay Nikolaev n.nikol...@virtualopensystems.com
Signed-off-by: Andre Przywara andre.przyw...@arm.com
Acked-by:
From: Nikolay Nikolaev n.nikol...@virtualopensystems.com
On IO memory abort, try to handle the MMIO access through the KVM
registered read/write callbacks. This is done by invoking the relevant
kvm_io_bus_* API.
[Andre: Since we converted the VGIC already, we can get rid of the
VGIC specific
virt/kvm was never really a good include directory for anything else
than locally included headers.
With the move of iodev.h there is no longer a need to add this
directory to the compiler's include path, so remove it from the arm and
arm64 kvm Makefile.
Signed-off-by: Andre Przywara
With all of the virtual GIC emulation code now being registered with
the kvm_io_bus, we can remove all of the old MMIO handling code and
its dispatching functionality.
Signed-off-by: Andre Przywara andre.przyw...@arm.com
---
include/kvm/arm_vgic.h |2 --
virt/kvm/arm/vgic-v2-emul.c |
Using the framework provided by the recent vgic.c changes we register
a kvm_io_bus device when initializing the virtual GICv2.
Signed-off-by: Andre Przywara andre.przyw...@arm.com
---
include/kvm/arm_vgic.h |1 +
virt/kvm/arm/vgic-v2-emul.c | 13 +
virt/kvm/arm/vgic.c
Using the framework provided by the recent vgic.c changes, we
register a kvm_io_bus device on mapping the virtual GICv3 resources.
The distributor mapping is pretty straightforward, but the
redistributors need some more love, since they need to be tagged with
the respective redistributor (read:
This series converts the VGIC MMIO handling routines to the generic
kvm_io_bus framework. The framework is needed for the ioeventfd
functionality, some people on the list wanted to see the VGIC
converted over to use it, too.
Looking at the diffstat it doesn't look too useful after all, but
it
The name kvm_mmio_range is a bit bold, given that it only covers
the VGIC's MMIO ranges. To avoid confusion with kvm_io_range, rename
it to vgic_io_range.
Signed-off-by: Andre Przywara andre.przyw...@arm.com
---
virt/kvm/arm/vgic-v2-emul.c |6 +++---
virt/kvm/arm/vgic-v3-emul.c |8
GCC 5.0.0 enables raw strings by default and they have higher priority
than macros, thus R[...] is interpreted incorrectly:
lib/x86/isr.c:112:30: error: invalid character ')' in raw string delimiter
lib/x86/isr.c:112:8: error: stray ‘R’ in program
lib/x86/isr.c:112:26: error: expected ‘:’
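A minimal reproduction of the clash and the usual fix; the macro and strings here are made up (the real errors were in lib/x86/isr.c):

```c
#define R "%rax"

/* Without spaces, GCC 5 tokenizes R"..." as a raw string literal
 * prefix instead of expanding the macro:
 *   const char *bad = "mov "R", %rbx";   // stray 'R' / raw string errors
 * Separating the tokens with spaces restores macro expansion and
 * ordinary string literal concatenation: */
static const char *good = "mov " R ", %rbx";
```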
Some fairly minor updates from the last series I sent out:
- Re-based on v4.0-rc3
- Use KVM_MP_STATE_STOPPED instead of KVM_MP_STATE_HALTED
- Some minor textual tidy ups on commit msgs and comments
- Move the re-factored vgic_queue_irq_to_lr to before the active changes
The branch I've
From: Christoffer Dall christoffer.d...@linaro.org
When a VCPU is no longer running, we currently check to see if it has a
timer scheduled in the future, and if it does, we schedule a host
hrtimer to notify us in case the timer expires while the VCPU is still
not running. When the hrtimer fires,
This helps re-factor away some of the repetitive code and makes the code
flow more nicely.
Signed-off-by: Alex Bennée alex.ben...@linaro.org
---
v3
- Move to before the un-queue active patch
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 0cc6ab6..6682d58 100644
---
To cleanly restore an SMP VM we need to ensure that the current pause
state of each vcpu is correctly recorded. Things could get confused if
the CPU starts running after migration restore completes when it was
paused before its state was captured.
We use the existing KVM_GET/SET_MP_STATE ioctl to
From: Christoffer Dall christoffer.d...@linaro.org
Migrating active interrupts causes the active state to be lost
completely. This implements some additional bitmaps to track the active
state on the distributor and export this to user space.
Signed-off-by: Christoffer Dall
From: Christoffer Dall christoffer.d...@linaro.org
There is an interesting bug in the vgic code, which manifests itself
when the KVM run loop has a signal pending or needs a vmid generation
rollover after having disabled interrupts but before actually switching
to the guest.
In this case, we