Alex,
This reserves space in get/set_one_reg ioctl for the extra guest state
needed for POWER8. It doesn't implement these at all, it just reserves
them so that the ABI is defined now.
A few things to note here:
- POWER8 has 6 PMCs and an additional 2 SPMCs for the supervisor. Here
I'm sto
On Thu, Aug 29, 2013 at 03:55:20PM -0700, Aaron Fabbri wrote:
> Has anyone considered a paravirt approach? That is:
>
> Guest kernel: Write a new IOMMU API back end which does KVM hypercalls.
> Exposes VFIO to guest user processes (nested VMs) as usual.
>
> Host kernel: KVM does things like c
None of its callers uses its return value, so let it return void.
Signed-off-by: Jason Wang
---
drivers/vhost/net.c |5 ++---
1 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 969a859..280ee66 100644
--- a/drivers/vhost/net.c
+++ b/d
Hi all:
This series tries to unify and simplify the vhost code, especially for
zerocopy. With this series, a 5% - 10% improvement in per-CPU throughput was
seen during the netperf guest sending test.
Please review.
Changes from V1:
- Fix the zerocopy enabling check by changing the check of upend_idx != d
We tend to batch the used adding and signaling in vhost_zerocopy_callback()
which may result in more than 100 used buffers being updated in
vhost_zerocopy_signal_used() in some cases. So switch to
vhost_add_used_and_signal_n() to avoid multiple calls to
vhost_add_used_and_signal(). Which means muc
Let vhost_add_used() use vhost_add_used_n() to reduce the code duplication.
Signed-off-by: Jason Wang
---
drivers/vhost/vhost.c | 54 ++--
1 files changed, 12 insertions(+), 42 deletions(-)
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost
Currently, even if the packet length is smaller than VHOST_GOODCOPY_LEN, if
upend_idx != done_idx we still set zcopy_used to true and roll back this
choice later. This can be avoided by deciding zerocopy use once, checking all
the conditions at one time up front.
Signed-off-by: Jason Wang
---
drivers/
We used to poll the vhost queue before marking DMA as done. This is racy: if
the vhost thread is woken up before DMA is marked done, the signal can be
missed. Fix this by always polling the vhost thread before DMA is marked done.
Signed-off-by: Jason Wang
---
drivers/vhost/net.c |9 +
As Michael points out, we used to limit the max pending DMAs to get better
cache utilization. But it was not done correctly, since the check was only done
when there were no new buffers submitted from the guest. The guest can easily
exceed the limitation by keeping sending packets.
So this patch moves the check into
Xiao's "KVM: MMU: flush tlb if the spte can be locklessly modified"
allows us to release mmu_lock before flushing TLBs.
Signed-off-by: Takuya Yoshikawa
Cc: Xiao Guangrong
---
Xiao can change the remaining mmu_lock to RCU's read-side lock:
The grace period will be reasonably limited.
arch/x86
Now that mmu_lock is held only inside kvm_mmu_write_protect_pt_masked(),
we can use __put_user() to copy each 64/32-bit word of dirty bits to user-space.
This eliminates the need to copy the whole bitmap to an extra buffer and
the resulting code is much more cache friendly than before.
Signed-off-by: Taku
I think this patch set answers Gleb's comment.
Takuya
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 08/25/2013 07:53 PM, Michael S. Tsirkin wrote:
> On Fri, Aug 23, 2013 at 04:55:49PM +0800, Jason Wang wrote:
>> On 08/20/2013 10:48 AM, Jason Wang wrote:
>>> On 08/16/2013 06:02 PM, Michael S. Tsirkin wrote:
> On Fri, Aug 16, 2013 at 01:16:30PM +0800, Jason Wang wrote:
>>> We used to lim
Hi Alex,
Second patch (kvm: ppc: booke: check range page invalidation progress on page
setup) of this patch series fixes a critical issue and we would like that to be
part of 2.12.
The first patch is not that important, but it is pretty simple.
Thanks
-Bharat
> -Original Message-
> From: Bhushan
Sorry. Resending in plain text. (Gmail).
-- Forwarded message --
Has anyone considered a paravirt approach? That is:
Guest kernel: Write a new IOMMU API back end which does KVM
hypercalls. Exposes VFIO to guest user processes (nested VMs) as
usual.
Host kernel: KVM does thin
On Thu, Aug 29, 2013 at 11:08:23AM +0100, Marc Zyngier wrote:
> All the code in handle_mmio_cfg_reg() assumes the offset has
> been shifted right to accomodate for the 2:1 bit compression,
> but this is only done when getting the register addess.
address
>
> Shift the offset early so the code wo
Hi Qin,
On Mon, Aug 26, 2013 at 10:32 PM, Qin Chuanyu wrote:
> Hi all
>
> I am participating in a project which try to port vhost_net on Xen.
Neat!
> By change the memory copy and notify mechanism, currently virtio-net with
> vhost_net could run on Xen with good performance.
I think the key in
On 29.08.2013, at 07:17, Paul Mackerras wrote:
> On Thu, Aug 29, 2013 at 12:56:40AM +0200, Alexander Graf wrote:
>>
>> On 06.08.2013, at 06:18, Paul Mackerras wrote:
>>
>>> #ifdef CONFIG_PPC_BOOK3S_64
>>> - /* default to book3s_64 (970fx) */
>>> + /*
>>> +* Default to the same as the ho
On 29.08.2013, at 07:04, Paul Mackerras wrote:
> On Thu, Aug 29, 2013 at 12:00:53AM +0200, Alexander Graf wrote:
>>
>> On 06.08.2013, at 06:16, Paul Mackerras wrote:
>>
>>> kvm_start_lightweight:
>>> + /* Copy registers into shadow vcpu so we can access them in real mode */
>>> + GET_SHADOW
On 29.08.2013, at 07:23, Paul Mackerras wrote:
> On Thu, Aug 29, 2013 at 01:24:04AM +0200, Alexander Graf wrote:
>>
>> On 06.08.2013, at 06:19, Paul Mackerras wrote:
>>
>>> +#ifdef CONFIG_PPC_64K_PAGES
>>> + /*
>>> +* Mark this as a 64k segment if the host is using
>>> +* 64k pages, t
On 08/29/2013 07:33 PM, Xiao Guangrong wrote:
> On 08/29/2013 05:31 PM, Gleb Natapov wrote:
>> On Thu, Aug 29, 2013 at 02:50:51PM +0800, Xiao Guangrong wrote:
>>> After more thinking, I still think rcu_assign_pointer() is unneeded when a
>>> entry
>>> is removed. The remove-API does not care the o
On 08/29/2013 05:31 PM, Gleb Natapov wrote:
> On Thu, Aug 29, 2013 at 02:50:51PM +0800, Xiao Guangrong wrote:
>> After more thinking, I still think rcu_assign_pointer() is unneeded when a
>> entry
>> is removed. The remove-API does not care the order between unlink the entry
>> and
>> the changes
On 08/29/2013 05:51 PM, Gleb Natapov wrote:
> On Thu, Aug 29, 2013 at 05:31:42PM +0800, Xiao Guangrong wrote:
>>> As Documentation/RCU/whatisRCU.txt says:
>>>
>>> As with rcu_assign_pointer(), an important function of
>>> rcu_dereference() is to document which pointers are protected
From: Paul Mackerras
Unlike the other general-purpose SPRs, SPRG3 can be read by usermode
code, and is used in recent kernels to store the CPU and NUMA node
numbers so that they can be read by VDSO functions. Thus we need to
load the guest's SPRG3 value into the real SPRG3 register when entering
From: Thadeu Lima de Souza Cascardo
err was overwritten by a previous function call, and checked to be 0. If
the following page allocation fails, 0 is going to be returned instead
of -ENOMEM.
Signed-off-by: Thadeu Lima de Souza Cascardo
Signed-off-by: Alexander Graf
---
arch/powerpc/kvm/book3
From: "Aneesh Kumar K.V"
Older versions of the Power architecture use the Real Mode Offset register and
the Real Mode Limit Selector for mapping the guest Real Mode Area. The guest
RMA should be physically contiguous since we use the range when address
translation is not enabled.
This patch switches RMA allocation
From: Paul Mackerras
This corrects the usage of the tlbie (TLB invalidate entry) instruction
in HV KVM. The tlbie instruction changed between PPC970 and POWER7.
On the PPC970, the bit to select large vs. small page is in the instruction,
not in the RB register value. This changes the code to us
From: Scott Wood
kvm_guest_enter() was already called by kvmppc_prepare_to_enter().
Don't call it again.
Signed-off-by: Scott Wood
Signed-off-by: Alexander Graf
---
arch/powerpc/kvm/booke.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.
From: "Aneesh Kumar K.V"
The Powerpc architecture uses a hash-based page table mechanism for mapping
virtual addresses to physical addresses. The architecture requires this hash
page table to be physically contiguous. With KVM on Powerpc we currently use an
early reservation mechanism for allocating guest
From: Chen Gang
'rmls' is 'unsigned long', but lpcr_rmls() returns a negative number when
failure occurs, so it needs a type cast for the comparison.
'lpid' is 'unsigned long', but kvmppc_alloc_lpid() returns a negative number
when failure occurs, so it needs a type cast for the comparison.
Signed-off-by: Chen Gang
From: Paul Mackerras
Commit 8e44ddc3f3 ("powerpc/kvm/book3s: Add support for H_IPOLL and
H_XIRR_X in XICS emulation") added a call to get_tb() but didn't
include the header that defines it, and on some configs this means
book3s_xics.c fails to compile:
arch/powerpc/kvm/book3s_xics.c: In function
From: "Aneesh Kumar K.V"
We want to use CMA for allocating hash page table and real mode area for
PPC64. Hence move the DMA-contiguous-related changes into a separate config
option so that ppc64 can enable CMA without requiring DMA contiguous.
Acked-by: Michal Nazarewicz
Acked-by: Paul Mackerras
Signed-o
We don't emulate breakpoints yet, so just ignore reads and writes
to / from DABR.
This fixes booting of more recent Linux guest kernels for me.
Reported-by: Nello Martuscielli
Tested-by: Nello Martuscielli
Signed-off-by: Alexander Graf
---
arch/powerpc/kvm/book3s_emulate.c | 2 ++
1 file chan
From: Paul Mackerras
Currently the code assumes that once we load up guest FP/VSX or VMX
state into the CPU, it stays valid in the CPU registers until we
explicitly flush it to the thread_struct. However, on POWER7,
copy_page() and memcpy() can use VMX. These functions do flush the
VMX state to
From: Paul Mackerras
It turns out that if we exit the guest due to a hcall instruction (sc 1),
and the loading of the instruction in the guest exit path fails for any
reason, the call to kvmppc_ld() in kvmppc_get_last_inst() fetches the
instruction after the hcall instruction rather than the hcall
From: Paul Mackerras
This reworks kvmppc_mmu_book3s_64_xlate() to make it check the large
page bit in the hashed page table entries (HPTEs) it looks at, and
to simplify and streamline the code. The checking of the first dword
of each HPTE is now done with a single mask and compare operation,
and
From: "Aneesh Kumar K.V"
Both RMA and hash page table request will be a multiple of 256K. We can use
a chunk size of 256K to track the free/used 256K chunk in the bitmap. This
should help to reduce the bitmap size.
Signed-off-by: Aneesh Kumar K.V
Acked-by: Paul Mackerras
Signed-off-by: Alexand
From: Scott Wood
Currently this is only being done on 64-bit. Rather than just move it
out of the 64-bit ifdef, move it to kvm_lazy_ee_enable() so that it is
consistent with lazy ee state, and so that we don't track more host
code as interrupts-enabled than necessary.
Rename kvm_lazy_ee_enable(
From: Paul Mackerras
The table of offsets to real-mode hcall handlers in book3s_hv_rmhandlers.S
can contain negative values, if some of the handlers end up before the
table in the vmlinux binary. Thus we need to use a sign-extending load
to read the values in the table rather than a zero-extending
From: "Aneesh Kumar K.V"
Otherwise we would clear the pvr value
Signed-off-by: Aneesh Kumar K.V
Signed-off-by: Alexander Graf
---
arch/powerpc/kvm/book3s_hv.c | 7 +++
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
Hi Paolo / Gleb,
This is my current patch queue for ppc. Please pull.
Changes include:
- Book3S HV: CMA based memory allocator for linear memory
- A few bug fixes
Alex
The following changes since commit cc2df20c7c4ce594c3e17e9cc260c330646012c8:
KVM: x86: Update symbolic exit codes (20
On Thu, 2013-08-29 at 07:10 +0200, Andreas Färber wrote:
> Am 29.08.2013 04:09, schrieb Chen Fan:
> > After ACPI get a signal to eject a vcpu, then it will notify
> > the vcpu thread of needing to exit, before the vcpu exiting,
> > will release the vcpu related objects.
> >
> > Signed-off-by: Chen
Gleb, Paolo,
Please pull the below tag for a few VGIC fixes to be merged in 3.12.
Thanks,
M.
The following changes since commit d8dfad3876e438b759da3c833d62fb8b2267:
Linux 3.11-rc7 (2013-08-25 17:43:22 -0700)
are available in the git repository at:
git://git.kernel.org/pub/sc
From: Christoffer Dall
For bytemaps each IRQ field is 1 byte wide, so we pack 4 irq fields in
one word and since there are 32 private (per cpu) irqs, we have 8
private u32 fields on the vgic_bytemap struct. We shift the offset from
the base of the register group right by 2, giving us the word in
From: Christoffer Dall
The Versatile Express TC2 board, which we use as our main emulated
platform in QEMU, defines 160+32 == 192 interrupts, so limiting the
number of interrupts to 128 is not quite going to cut it for real board
emulation.
Note that this didn't use to be a problem because QEMU
vgic_get_target_reg is quite complicated, for no good reason.
Actually, it is fairly easy to write it in a much more efficient
way by using the target CPU array instead of the bitmap.
Signed-off-by: Marc Zyngier
---
virt/kvm/arm/vgic.c | 12 +++-
1 file changed, 3 insertions(+), 9 deleti
All the code in handle_mmio_cfg_reg() assumes the offset has
been shifted right to accommodate the 2:1 bit compression,
but this is only done when getting the register address.
Shift the offset early so the code works mostly unchanged.
Reported-by: Zhaobo (Bob, ERC)
Signed-off-by: Marc Zyngier
On Thu, Aug 29, 2013 at 05:31:42PM +0800, Xiao Guangrong wrote:
> > As Documentation/RCU/whatisRCU.txt says:
> >
> > As with rcu_assign_pointer(), an important function of
> > rcu_dereference() is to document which pointers are protected by
> > RCU, in particular, flagging
On Thu, Aug 29, 2013 at 02:50:51PM +0800, Xiao Guangrong wrote:
> After more thinking, I still think rcu_assign_pointer() is unneeded when a
> entry
> is removed. The remove-API does not care the order between unlink the entry
> and
> the changes to its fields. It is the caller's responsibility:
On 08/29/2013 05:08 PM, Gleb Natapov wrote:
> On Thu, Aug 29, 2013 at 02:50:51PM +0800, Xiao Guangrong wrote:
> BTW I do not see
> rcu_assign_pointer()/rcu_dereference() in your patches which hints on
IIUC, We can not directly use rcu_assign_pointer(), that is something like:
On 08/29/2013 05:10 PM, Gleb Natapov wrote:
> On Tue, Jul 30, 2013 at 09:02:08PM +0800, Xiao Guangrong wrote:
>> It is easy if the handler is in the vcpu context, in that case we can use
>> walk_shadow_page_lockless_begin() and walk_shadow_page_lockless_end() that
>> disable interrupt to stop shado
On Tue, Jul 30, 2013 at 09:02:08PM +0800, Xiao Guangrong wrote:
> It is easy if the handler is in the vcpu context, in that case we can use
> walk_shadow_page_lockless_begin() and walk_shadow_page_lockless_end() that
> disable interrupt to stop shadow page be freed. But we are on the ioctl
> conte
On Thu, Aug 29, 2013 at 02:50:51PM +0800, Xiao Guangrong wrote:
> >>> BTW I do not see
> >>> rcu_assign_pointer()/rcu_dereference() in your patches which hints on
> >>
> >> IIUC, We can not directly use rcu_assign_pointer(), that is something like:
> >> p = v to assign a pointer to a pointer. But i
On Tue, Jul 30, 2013 at 09:01:59PM +0800, Xiao Guangrong wrote:
> @vcpu in page_fault_can_be_fast() is not used so remove it
>
> Signed-off-by: Xiao Guangrong
Applied this one. Thanks.
> ---
> arch/x86/kvm/mmu.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/
On Sat, Aug 03, 2013 at 02:09:43PM +0900, Takuya Yoshikawa wrote:
> On Tue, 30 Jul 2013 21:01:58 +0800
> Xiao Guangrong wrote:
>
> > Background
> > ==
> > Currently, when mark memslot dirty logged or get dirty page, we need to
> > write-protect large guest memory, it is the heavy work, es