Re: [Qemu-devel] [PATCH 19/23] userfaultfd: activate syscall

2015-09-07 Thread Bharata B Rao
On Tue, Sep 08, 2015 at 04:08:06PM +1000, Michael Ellerman wrote:
> On Wed, 2015-08-12 at 10:53 +0530, Bharata B Rao wrote:
> > On Tue, Aug 11, 2015 at 03:48:26PM +0200, Andrea Arcangeli wrote:
> > > Hello Bharata,
> > > 
> > > On Tue, Aug 11, 2015 at 03:37:29PM +0530, Bharata B Rao wrote:
> > > > May be it is a bit late to bring this up, but I needed the following fix
> > > > to userfault21 branch of your git tree to compile on powerpc.
> > > 
> > > Not late, just in time. I increased the number of syscalls in earlier
> > > versions, it must have gotten lost during a rejecting rebase, sorry.
> > > 
> > > I applied it to my tree and it can be applied to -mm and linux-next,
> > > thanks!
> > > 
> > > The syscall for arm32 are also ready and on their way to the arm tree,
> > > the testsuite worked fine there. ppc also should work fine if you
> > > could confirm it'd be interesting, just beware that I got a typo in
> > > the testcase:
> > 
> > The testsuite passes on powerpc.
> > 
> > 
> > running userfaultfd
> > 
> > nr_pages: 2040, nr_pages_per_cpu: 170
> > bounces: 31, mode: rnd racing ver poll, userfaults: 80 43 23 23 15 16 12 1 
> > 2 96 13 128
> > bounces: 30, mode: racing ver poll, userfaults: 35 54 62 49 47 48 2 8 0 78 
> > 1 0
> > bounces: 29, mode: rnd ver poll, userfaults: 114 153 70 106 78 57 143 92 
> > 114 96 1 0
> > bounces: 28, mode: ver poll, userfaults: 96 81 5 45 83 19 98 28 1 145 23 2
> > bounces: 27, mode: rnd racing poll, userfaults: 54 65 60 54 45 49 1 2 1 2 
> > 71 20
> > bounces: 26, mode: racing poll, userfaults: 90 83 35 29 37 35 30 42 3 4 49 6
> > bounces: 25, mode: rnd poll, userfaults: 52 50 178 112 51 41 23 42 18 99 59 > > 0
> > bounces: 24, mode: poll, userfaults: 136 101 83 260 84 29 16 88 1 6 160 57
> > bounces: 23, mode: rnd racing ver, userfaults: 141 197 158 183 39 49 3 52 8 
> > 3 6 0
> > bounces: 22, mode: racing ver, userfaults: 242 266 244 180 162 32 87 43 31 
> > 40 34 0
> > bounces: 21, mode: rnd ver, userfaults: 636 158 175 24 253 104 48 8 0 0 0 0
> > bounces: 20, mode: ver, userfaults: 531 204 225 117 129 107 11 143 76 31 1 0
> > bounces: 19, mode: rnd racing, userfaults: 303 169 225 145 59 219 37 0 0 0 
> > 0 0
> > bounces: 18, mode: racing, userfaults: 374 372 37 144 126 90 25 12 15 17 0 0
> > bounces: 17, mode: rnd, userfaults: 313 412 134 108 80 99 7 56 85 0 0 0
> > bounces: 16, mode:, userfaults: 431 58 87 167 120 113 98 60 14 8 48 0
> > bounces: 15, mode: rnd racing ver poll, userfaults: 41 40 25 28 37 24 0 0 0 
> > 0 180 75
> > bounces: 14, mode: racing ver poll, userfaults: 43 53 30 28 25 15 19 0 0 0 
> > 0 30
> > bounces: 13, mode: rnd ver poll, userfaults: 136 91 114 91 92 79 114 77 75 
> > 68 1 2
> > bounces: 12, mode: ver poll, userfaults: 92 120 114 76 153 75 132 157 83 81 
> > 10 1
> > bounces: 11, mode: rnd racing poll, userfaults: 50 72 69 52 53 48 46 59 57 
> > 51 37 1
> > bounces: 10, mode: racing poll, userfaults: 33 49 38 68 35 63 57 49 49 47 
> > 25 10
> > bounces: 9, mode: rnd poll, userfaults: 167 150 67 123 39 75 1 2 9 125 1 1
> > bounces: 8, mode: poll, userfaults: 147 102 20 87 5 27 118 14 104 40 21 28
> > bounces: 7, mode: rnd racing ver, userfaults: 305 254 208 74 59 96 36 14 11 
> > 7 4 5
> > bounces: 6, mode: racing ver, userfaults: 290 114 191 94 162 114 34 6 6 32 
> > 23 2
> > bounces: 5, mode: rnd ver, userfaults: 370 381 22 273 21 106 17 55 0 0 0 0
> > bounces: 4, mode: ver, userfaults: 328 279 179 191 74 86 95 15 13 10 0 0
> > bounces: 3, mode: rnd racing, userfaults: 222 215 164 70 5 20 179 0 34 3 0 0
> > bounces: 2, mode: racing, userfaults: 316 385 112 160 225 5 30 49 42 2 4 0
> > bounces: 1, mode: rnd, userfaults: 273 139 253 176 163 71 85 2 0 0 0 0
> > bounces: 0, mode:, userfaults: 165 212 633 13 24 66 24 27 15 0 10 1
> > [PASS]
> 
> Hmm, not for me. See below.
> 
> What setup were you testing on Bharata?

I was on commit a94572f5799dd of userfault21 branch in Andrea's tree
git://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git

#uname -a
Linux 4.1.0-rc8+ #1 SMP Tue Aug 11 11:33:50 IST 2015 ppc64le ppc64le ppc64le 
GNU/Linux

In fact I had successfully done postcopy migration of sPAPR guest with
this setup.

> 
> Mine is:
> 
> $ uname -a
> Linux lebuntu 4.2.0-09705-g3a166acc1432 #2 SMP Tue Sep 8 15:18:00 AEST 2015 
> ppc64le ppc64le ppc64le GNU/Linux
> 
> Which is 7d9071a09502 plus a couple of powerpc patches.
> 
> $ zgrep USERFAULTFD /proc/config.gz
> CONFIG_USERFAULTFD=y
> 
> $ sudo ./userfaultfd 128 32
> nr_pages: 2048, nr_pages_per_cpu: 128
> bounces: 31, mode: rnd racing ver poll, error mutex 2 2
> error mutex 2 10

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH 19/23] userfaultfd: activate syscall

2015-09-07 Thread Michael Ellerman
On Wed, 2015-08-12 at 10:53 +0530, Bharata B Rao wrote:
> On Tue, Aug 11, 2015 at 03:48:26PM +0200, Andrea Arcangeli wrote:
> > Hello Bharata,
> > 
> > On Tue, Aug 11, 2015 at 03:37:29PM +0530, Bharata B Rao wrote:
> > > May be it is a bit late to bring this up, but I needed the following fix
> > > to userfault21 branch of your git tree to compile on powerpc.
> > 
> > Not late, just in time. I increased the number of syscalls in earlier
> > versions, it must have gotten lost during a rejecting rebase, sorry.
> > 
> > I applied it to my tree and it can be applied to -mm and linux-next,
> > thanks!
> > 
> > The syscall for arm32 are also ready and on their way to the arm tree,
> > the testsuite worked fine there. ppc also should work fine if you
> > could confirm it'd be interesting, just beware that I got a typo in
> > the testcase:
> 
> The testsuite passes on powerpc.
> 
> 
> running userfaultfd
> 
> nr_pages: 2040, nr_pages_per_cpu: 170
> bounces: 31, mode: rnd racing ver poll, userfaults: 80 43 23 23 15 16 12 1 2 
> 96 13 128
> bounces: 30, mode: racing ver poll, userfaults: 35 54 62 49 47 48 2 8 0 78 1 0
> bounces: 29, mode: rnd ver poll, userfaults: 114 153 70 106 78 57 143 92 114 
> 96 1 0
> bounces: 28, mode: ver poll, userfaults: 96 81 5 45 83 19 98 28 1 145 23 2
> bounces: 27, mode: rnd racing poll, userfaults: 54 65 60 54 45 49 1 2 1 2 71 
> 20
> bounces: 26, mode: racing poll, userfaults: 90 83 35 29 37 35 30 42 3 4 49 6
> bounces: 25, mode: rnd poll, userfaults: 52 50 178 112 51 41 23 42 18 99 59 0
> bounces: 24, mode: poll, userfaults: 136 101 83 260 84 29 16 88 1 6 160 57
> bounces: 23, mode: rnd racing ver, userfaults: 141 197 158 183 39 49 3 52 8 3 
> 6 0
> bounces: 22, mode: racing ver, userfaults: 242 266 244 180 162 32 87 43 31 40 
> 34 0
> bounces: 21, mode: rnd ver, userfaults: 636 158 175 24 253 104 48 8 0 0 0 0
> bounces: 20, mode: ver, userfaults: 531 204 225 117 129 107 11 143 76 31 1 0
> bounces: 19, mode: rnd racing, userfaults: 303 169 225 145 59 219 37 0 0 0 0 0
> bounces: 18, mode: racing, userfaults: 374 372 37 144 126 90 25 12 15 17 0 0
> bounces: 17, mode: rnd, userfaults: 313 412 134 108 80 99 7 56 85 0 0 0
> bounces: 16, mode:, userfaults: 431 58 87 167 120 113 98 60 14 8 48 0
> bounces: 15, mode: rnd racing ver poll, userfaults: 41 40 25 28 37 24 0 0 0 0 
> 180 75
> bounces: 14, mode: racing ver poll, userfaults: 43 53 30 28 25 15 19 0 0 0 0 
> 30
> bounces: 13, mode: rnd ver poll, userfaults: 136 91 114 91 92 79 114 77 75 68 
> 1 2
> bounces: 12, mode: ver poll, userfaults: 92 120 114 76 153 75 132 157 83 81 
> 10 1
> bounces: 11, mode: rnd racing poll, userfaults: 50 72 69 52 53 48 46 59 57 51 
> 37 1
> bounces: 10, mode: racing poll, userfaults: 33 49 38 68 35 63 57 49 49 47 25 
> 10
> bounces: 9, mode: rnd poll, userfaults: 167 150 67 123 39 75 1 2 9 125 1 1
> bounces: 8, mode: poll, userfaults: 147 102 20 87 5 27 118 14 104 40 21 28
> bounces: 7, mode: rnd racing ver, userfaults: 305 254 208 74 59 96 36 14 11 7 
> 4 5
> bounces: 6, mode: racing ver, userfaults: 290 114 191 94 162 114 34 6 6 32 23 
> 2
> bounces: 5, mode: rnd ver, userfaults: 370 381 22 273 21 106 17 55 0 0 0 0
> bounces: 4, mode: ver, userfaults: 328 279 179 191 74 86 95 15 13 10 0 0
> bounces: 3, mode: rnd racing, userfaults: 222 215 164 70 5 20 179 0 34 3 0 0
> bounces: 2, mode: racing, userfaults: 316 385 112 160 225 5 30 49 42 2 4 0
> bounces: 1, mode: rnd, userfaults: 273 139 253 176 163 71 85 2 0 0 0 0
> bounces: 0, mode:, userfaults: 165 212 633 13 24 66 24 27 15 0 10 1
> [PASS]

Hmm, not for me. See below.

What setup were you testing on Bharata?

Mine is:

$ uname -a
Linux lebuntu 4.2.0-09705-g3a166acc1432 #2 SMP Tue Sep 8 15:18:00 AEST 2015 
ppc64le ppc64le ppc64le GNU/Linux

Which is 7d9071a09502 plus a couple of powerpc patches.

$ zgrep USERFAULTFD /proc/config.gz
CONFIG_USERFAULTFD=y

$ sudo ./userfaultfd 128 32
nr_pages: 2048, nr_pages_per_cpu: 128
bounces: 31, mode: rnd racing ver poll, error mutex 2 2
error mutex 2 10
error mutex 2 15
error mutex 2 21
error mutex 2 22
error mutex 2 27
error mutex 2 36
error mutex 2 39
error mutex 2 40
error mutex 2 41
error mutex 2 43
error mutex 2 75
error mutex 2 79
error mutex 2 83
error mutex 2 100
error mutex 2 108
error mutex 2 110
error mutex 2 114
error mutex 2 119
error mutex 2 120
error mutex 2 135
error mutex 2 137
error mutex 2 141
error mutex 2 142
error mutex 2 144
error mutex 2 145
error mutex 2 150
error mutex 2 151
error mutex 2 159
error mutex 2 161
error mutex 2 169
error mutex 2 172
error mutex 2 174
error mutex 2 175
error mutex 2 176
error mutex 2 178
error mutex 2 188
error mutex 2 194
error mutex 2 208
error mutex 2 210
error mutex 2 212
error mutex 2 220
error mutex 2 223
error mutex 2 224
error mutex 2 226
error mutex 2 236
error mutex 2 249
error mutex 2 252
error mutex 2 255
error mutex 2 256
error mutex 2 267
error mutex 2 277
error mutex 2 284
error mutex 2 295
error mutex 2 302
e

Re: [Qemu-ppc] KVM memory slots limit on powerpc

2015-09-07 Thread Thomas Huth
On 07/09/15 16:31, Igor Mammedov wrote:
> On Fri, 4 Sep 2015 12:04:41 +0200
> Alexander Graf  wrote:
> 
>>
>>
>> On 04.09.15 11:59, Christian Borntraeger wrote:
>>> Am 04.09.2015 um 11:35 schrieb Thomas Huth:

  Hi all,

 now that we get memory hotplugging for the spapr machine on qemu-ppc,
 too, it seems like we easily can hit the amount of KVM-internal memory
 slots now ("#define KVM_USER_MEM_SLOTS 32" in
 arch/powerpc/include/asm/kvm_host.h). For example, start
 qemu-system-ppc64 with a couple of "-device secondary-vga" and "-m
 4G,slots=32,maxmem=40G" and then try to hot-plug all 32 DIMMs ... and
 you'll see that it aborts way earlier already.

 The x86 code already increased the amount of KVM_USER_MEM_SLOTS to 509
 already (+3 internal slots = 512) ... maybe we should now increase the
 amount of slots on powerpc, too? Since we don't use internal slots on
 POWER, would 512 be a good value? Or would less be sufficient, too?
>>>
>>> When you are at it, the s390 value should also be increased I guess.
>>
>> That constant defines the array size for the memslot array in struct kvm
>> which in turn again gets allocated by kzalloc, so it's pinned kernel
>> memory that is physically contiguous. Doing big allocations can turn
>> into problems during runtime.
>>
>> So maybe there is another way? Can we extend the memslot array size
>> dynamically somehow? Allocate it separately? How much memory does the
>> memslot array use up with 512 entries?
> 
> KVM switched memslots allocation to kvm_kvzalloc(), so it would fallback to 
> vmalloc
>  commit 744961341d472db6272ed9b42319a90f5a2aa7c4
>  kvm: avoid page allocation failure in kvm_set_memory_region()

Good hint, thanks for pointing that out! ... so increasing the array
size should not cause too much trouble :-)

 Thomas


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Bug 104091] [bisected] Starting a VM causes the host to halt and create Machine Check Exceptions

2015-09-07 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=104091

Michael Long  changed:

   What|Removed |Added

 Regression|No  |Yes

--- Comment #3 from Michael Long  ---
mcelog: Family 6 Model 3f CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 0
CPU 0 BANK 17 
MISC 4f0083884501086 ADDR fa000200 
TIME 1441663568 Tue Sep  8 00:06:08 2015
MCG status:
MCi status:
Uncorrected error
Error enabled
MCi_MISC register valid
MCi_ADDR register valid
Processor context corrupt
MCA: corrected filtering (some unreported errors in same region)
Generic CACHE Level-2 Generic Error
STATUS be23110a MCGSTATUS 0
MCGCAP 7000c16 APICID 0 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 63

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm: irqchip: fix memory leak

2015-09-07 Thread Paolo Bonzini


On 02/09/2015 09:03, Sudip Mukherjee wrote:
> We were taking the exit path after checking ue->flags and return value
> of setup_routing_entry(), but 'e' was not freed incase of a failure.
> 
> Signed-off-by: Sudip Mukherjee 
> ---
>  virt/kvm/irqchip.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
> index 21c1424..c63e54f 100644
> --- a/virt/kvm/irqchip.c
> +++ b/virt/kvm/irqchip.c
> @@ -213,11 +213,15 @@ int kvm_set_irq_routing(struct kvm *kvm,
>   goto out;
>  
>   r = -EINVAL;
> - if (ue->flags)
> + if (ue->flags) {
> + kfree(e);
>   goto out;
> + }
>   r = setup_routing_entry(new, e, ue);
> - if (r)
> + if (r) {
> + kfree(e);
>   goto out;
> + }
>   ++ue;
>   }
>  
> 

Queued, thanks.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 6/8] arm/arm64: KVM: Add forwarded physical interrupts documentation

2015-09-07 Thread Marc Zyngier
On 07/09/15 17:45, Eric Auger wrote:
> Hi Christoffer,
> On 09/04/2015 09:40 PM, Christoffer Dall wrote:
>> Forwarded physical interrupts on arm/arm64 is a tricky concept and the
>> way we deal with them is not apparently easy to understand by reading
>> various specs.
>>
>> Therefore, add a proper documentation file explaining the flow and
>> rationale of the behavior of the vgic.
>>
>> Some of this text was contributed by Marc Zyngier and edited by me.
>> Omissions and errors are all mine.
>>
>> Signed-off-by: Christoffer Dall 
>> ---
>>  Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt | 181 
>> +
>>  1 file changed, 181 insertions(+)
>>  create mode 100644 Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
>>
>> diff --git a/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt 
>> b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
>> new file mode 100644
>> index 000..24b6f28
>> --- /dev/null
>> +++ b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
>> @@ -0,0 +1,181 @@
>> +KVM/ARM VGIC Forwarded Physical Interrupts
>> +==

[...]

>> +1.  KVM runs the VCPU
>> +2.  The guest programs the time to fire in T+100
>> +4.  At T+100 the timer fires and a physical IRQ causes the VM to exit
>> +5.  With interrupts disabled on the CPU, KVM looks at the timer state
>> +and injects a forwarded physical interrupt because it concludes the
>> +timer has expired.
> I don't get how we can trap without the virtual timer PPI handler being
> entered on host side. Please can you elaborate on this?

On VM exit, we disable the virtual timer (see the code in
hyp.S::save_timer_state where we clear the enable bit). We still perform
the exit, but the cause for exit is now gone, and the handler will never
fire.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 6/8] arm/arm64: KVM: Add forwarded physical interrupts documentation

2015-09-07 Thread Eric Auger
Hi Christoffer,
On 09/04/2015 09:40 PM, Christoffer Dall wrote:
> Forwarded physical interrupts on arm/arm64 is a tricky concept and the
> way we deal with them is not apparently easy to understand by reading
> various specs.
> 
> Therefore, add a proper documentation file explaining the flow and
> rationale of the behavior of the vgic.
> 
> Some of this text was contributed by Marc Zyngier and edited by me.
> Omissions and errors are all mine.
> 
> Signed-off-by: Christoffer Dall 
> ---
>  Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt | 181 
> +
>  1 file changed, 181 insertions(+)
>  create mode 100644 Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
> 
> diff --git a/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt 
> b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
> new file mode 100644
> index 000..24b6f28
> --- /dev/null
> +++ b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
> @@ -0,0 +1,181 @@
> +KVM/ARM VGIC Forwarded Physical Interrupts
> +==
> +
> +The KVM/ARM code implements software support for the ARM Generic
> +Interrupt Controller's (GIC's) hardware support for virtualization by
> +allowing software to inject virtual interrupts to a VM, which the guest
> +OS sees as regular interrupts.  The code is famously known as the VGIC.
> +
> +Some of these virtual interrupts, however, correspond to physical
> +interrupts from real physical devices.  One example could be the
> +architected timer, which itself supports virtualization, and therefore
> +lets a guest OS program the hardware device directly to raise an
> +interrupt at some point in time.  When such an interrupt is raised, the
> +host OS initially handles the interrupt and must somehow signal this
> +event as a virtual interrupt to the guest.  Another example could be a
> +passthrough device, where the physical interrupts are initially handled
> +by the host, but the device driver for the device lives in the guest OS
> +and KVM must therefore somehow inject a virtual interrupt on behalf of
> +the physical one to the guest OS.
> +
> +These virtual interrupts corresponding to a physical interrupt on the
> +host are called forwarded physical interrupts, but are also sometimes
> +referred to as 'virtualized physical interrupts' and 'mapped interrupts'.
> +
> +Forwarded physical interrupts are handled slightly differently compared
> +to virtual interrupts generated purely by a software emulated device.
> +
> +
> +The HW bit
> +--
> +Virtual interrupts are signalled to the guest by programming the List
> +Registers (LRs) on the GIC before running a VCPU.  The LR is programmed
> +with the virtual IRQ number and the state of the interrupt (Pending,
> +Active, or Pending+Active).  When the guest ACKs and EOIs a virtual
> +interrupt, the LR state moves from Pending to Active, and finally to
> +inactive.
> +
> +The LRs include an extra bit, called the HW bit.  When this bit is set,
> +KVM must also program an additional field in the LR, the physical IRQ
> +number, to link the virtual with the physical IRQ.
> +
> +When the HW bit is set, KVM must EITHER set the Pending OR the Active
> +bit, never both at the same time.
> +
> +Setting the HW bit causes the hardware to deactivate the physical
> +interrupt on the physical distributor when the guest deactivates the
> +corresponding virtual interrupt.
> +
> +
> +Forwarded Physical Interrupts Life Cycle
> +
> +
> +The state of forwarded physical interrupts is managed in the following way:
> +
> +  - The physical interrupt is acked by the host, and becomes active on
> +the physical distributor (*).
> +  - KVM sets the LR.Pending bit, because this is the only way the GICV
> +interface is going to present it to the guest.
> +  - LR.Pending will stay set as long as the guest has not acked the 
> interrupt.
> +  - LR.Pending transitions to LR.Active on the guest read of the IAR, as
> +expected.
> +  - On guest EOI, the *physical distributor* active bit gets cleared,
> +but the LR.Active is left untouched (set).
> +  - KVM clears the LR when on VM exits when the physical distributor
s/when//?
> +active state has been cleared.
> +
> +(*): The host handling is slightly more complicated.  For some devices
> +(shared), KVM directly sets the active state on the physical distributor
> +before entering the guest, and for some devices (non-shared) the host
> +configures the GIC such that it does not deactivate the interrupt on
> +host EOIs, but only performs a priority drop allowing the GIC to receive
> +other interrupts and leaves the interrupt in the active state on the
> +physical distributor.
EOIMode == 1 is set globally and impacts all forwarded SPI/PPIs, shared
or not shared I think. reading the above lines I have the impression
this is a per-device programming.

My understanding is for the timer it is needed to manually set the
physical distributor state because 1) the

Re: [PATCH] virtio-blk: use VIRTIO_BLK_F_WCE and VIRTIO_BLK_F_CONFIG_WCE in virtio1

2015-09-07 Thread Paolo Bonzini


On 22/08/2015 00:53, Paolo Bonzini wrote:
> VIRTIO_BLK_F_CONFIG_WCE is important in order to achieve good performance
> (up to 2x, though more realistically +30-40%) in latency-bound workloads.
> However, it was removed by mistake together with VIRTIO_BLK_F_FLUSH.
> 
> It will be restored in the next revision of the virtio 1.0 standard, so
> do the same in Linux.
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  drivers/block/virtio_blk.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index d4d05f064d39..ea2c17c66dfb 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -478,8 +478,7 @@ static int virtblk_get_cache_mode(struct virtio_device 
> *vdev)
>  struct virtio_blk_config, wce,
>  &writeback);
>   if (err)
> - writeback = virtio_has_feature(vdev, VIRTIO_BLK_F_WCE) ||
> - virtio_has_feature(vdev, VIRTIO_F_VERSION_1);
> + writeback = virtio_has_feature(vdev, VIRTIO_BLK_F_WCE);
>  
>   return writeback;
>  }
> @@ -840,7 +839,7 @@ static unsigned int features_legacy[] = {
>  static unsigned int features[] = {
>   VIRTIO_BLK_F_SEG_MAX, VIRTIO_BLK_F_SIZE_MAX, VIRTIO_BLK_F_GEOMETRY,
>   VIRTIO_BLK_F_RO, VIRTIO_BLK_F_BLK_SIZE,
> - VIRTIO_BLK_F_TOPOLOGY,
> + VIRTIO_BLK_F_WCE, VIRTIO_BLK_F_TOPOLOGY, VIRTIO_BLK_F_CONFIG_WCE,
>   VIRTIO_BLK_F_MQ,
>  };
>  
> 

Ping?

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] arm/arm64: KVM: vgic: Move active state handling to flush_hwstate

2015-09-07 Thread Marc Zyngier
On 07/09/15 15:44, Eric Auger wrote:
> Hi,
> On 09/04/2015 04:24 PM, Christoffer Dall wrote:
>> We currently set the physical active state only when we *inject* a new
>> pending virtual interrupt, but this is actually not correct, because we
>> could have been preempted and run something else on the system that
>> resets the active state to clear.  This causes us to run the VM with the
>> timer set to fire, but without setting the physical active state.
>>
>> The solution is to always check the LR configurations, and we if have a
>> mapped interrupt in the LR in either the pending or active state
>> (virtual), then set the physical active state.
>>
>> Acked-by: Marc Zyngier 
>> Signed-off-by: Christoffer Dall 
>> ---
>>  virt/kvm/arm/vgic.c | 42 ++
>>  1 file changed, 26 insertions(+), 16 deletions(-)
>>
[...]
>> +
>> +for (lr = 0; lr < vgic->nr_lr; lr++) {
>> +struct vgic_lr vlr;
>> +
>> +if (!test_bit(lr, vgic_cpu->lr_used))
>> +continue;
>> +
>> +vlr = vgic_get_lr(vcpu, lr);
>> +
>> +/*
>> + * If we have a mapping, and the virtual interrupt is
>> + * presented to the guest (as pending or active), then we must
>> + * set the state to active in the physical world. See
>> + * Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt.
> if upstreamed in 4.3 whereas the other series is not there,
> vgic-mapped-irqs.txt won't be available.

Good point, I'll update the queued patch.

>> + */
>> +if (vlr.state & LR_HW) {
>> +struct irq_phys_map *map;
>> +map = vgic_irq_map_search(vcpu, vlr.irq);
>> +
>> +ret = irq_set_irqchip_state(map->irq,
>> +IRQCHIP_STATE_ACTIVE,
>> +true);
> I understand the need for manually setting the phys dist state in case
> of timer however for non shared IRQs, GIC does the job directly. But I
> guess it does not harm.

For non-shared interrupts, I'd expect an additional test on map->shared
to avoid hitting the distributor once more.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] arm/arm64: KVM: vgic: Move active state handling to flush_hwstate

2015-09-07 Thread Eric Auger
On 09/07/2015 04:44 PM, Eric Auger wrote:
> Hi,
> On 09/04/2015 04:24 PM, Christoffer Dall wrote:
>> We currently set the physical active state only when we *inject* a new
>> pending virtual interrupt, but this is actually not correct, because we
>> could have been preempted and run something else on the system that
>> resets the active state to clear.  This causes us to run the VM with the
>> timer set to fire, but without setting the physical active state.
>>
>> The solution is to always check the LR configurations, and we if have a
>> mapped interrupt in the LR in either the pending or active state
>> (virtual), then set the physical active state.
>>
>> Acked-by: Marc Zyngier 
>> Signed-off-by: Christoffer Dall 
>> ---
>>  virt/kvm/arm/vgic.c | 42 ++
>>  1 file changed, 26 insertions(+), 16 deletions(-)
>>
>> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
>> index 9eb489a..6bd1c9b 100644
>> --- a/virt/kvm/arm/vgic.c
>> +++ b/virt/kvm/arm/vgic.c
>> @@ -1144,26 +1144,11 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu 
>> *vcpu, int irq,
>>  struct irq_phys_map *map;
>>  map = vgic_irq_map_search(vcpu, irq);
>>  
>> -/*
>> - * If we have a mapping, and the virtual interrupt is
>> - * being injected, then we must set the state to
>> - * active in the physical world. Otherwise the
>> - * physical interrupt will fire and the guest will
>> - * exit before processing the virtual interrupt.
>> - */
>>  if (map) {
>> -int ret;
>> -
>> -BUG_ON(!map->active);
> I have a question about this map->active. I did not find any user of
> kvm_vgic_set_phys_irq_active anymore. The flag is directly updated in
> vgic_sync_hwirq through the irq_get_irqchip_state query. In the same
> function, before there is a "BUG_ON(!map || !map->active);"
> 
> Can't we have this BUG_ON firing?
hum ignore that comment. Didn't see the call to
kvm_vgic_set_phys_irq_active in arch_timer.c

Eric
> 
> 
>>  vlr.hwirq = map->phys_irq;
>>  vlr.state |= LR_HW;
>>  vlr.state &= ~LR_EOI_INT;
>>  
>> -ret = irq_set_irqchip_state(map->irq,
>> -IRQCHIP_STATE_ACTIVE,
>> -true);
>> -WARN_ON(ret);
>> -
>>  /*
>>   * Make sure we're not going to sample this
>>   * again, as a HW-backed interrupt cannot be
>> @@ -1255,7 +1240,7 @@ static void __kvm_vgic_flush_hwstate(struct kvm_vcpu 
>> *vcpu)
>>  struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>  struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>>  unsigned long *pa_percpu, *pa_shared;
>> -int i, vcpu_id;
>> +int i, vcpu_id, lr, ret;
>>  int overflow = 0;
>>  int nr_shared = vgic_nr_shared_irqs(dist);
>>  
>> @@ -1310,6 +1295,31 @@ epilog:
>>   */
>>  clear_bit(vcpu_id, dist->irq_pending_on_cpu);
>>  }
>> +
>> +for (lr = 0; lr < vgic->nr_lr; lr++) {
>> +struct vgic_lr vlr;
>> +
>> +if (!test_bit(lr, vgic_cpu->lr_used))
>> +continue;
>> +
>> +vlr = vgic_get_lr(vcpu, lr);
>> +
>> +/*
>> + * If we have a mapping, and the virtual interrupt is
>> + * presented to the guest (as pending or active), then we must
>> + * set the state to active in the physical world. See
>> + * Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt.
> if upstreamed in 4.3 whereas the other series is not there,
> vgic-mapped-irqs.txt won't be available.
>> + */
>> +if (vlr.state & LR_HW) {
>> +struct irq_phys_map *map;
>> +map = vgic_irq_map_search(vcpu, vlr.irq);
>> +
>> +ret = irq_set_irqchip_state(map->irq,
>> +IRQCHIP_STATE_ACTIVE,
>> +true);
> I understand the need for manually setting the phys dist state in case
> of timer however for non shared IRQs, GIC does the job directly. But I
> guess it does not harm.
> 
> Eric
>> +WARN_ON(ret);
>> +}
>> +}
>>  }
>>  
>>  static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
>>
> 

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 3/8] arm/arm64: KVM: vgic: Factor out level irq processing on guest exit

2015-09-07 Thread Eric Auger


On 09/04/2015 09:40 PM, Christoffer Dall wrote:
> Currently vgic_process_maintenance() processes dealing with a completed
> level-triggered interrupt directly, but we are soon going to reuse this
> logic for level-triggered mapped interrupts with the HW bit set, so
> move this logic into a separate static function.
> 
> Probably the most scary part of this commit is convincing yourself that
> the current flow is safe compared to the old one.  In the following I
> try to list the changes and why they are harmless:
> 
>   Move vgic_irq_clear_queued after kvm_notify_acked_irq:
> Harmless because the effect of clearing the queued flag wrt.
> kvm_set_irq is only that vgic_update_irq_pending does not set the
> pending bit on the emulated CPU interface or in the pending_on_cpu
> bitmask,
well actually the notifier calls vgic_update_irq_pending with level ==0
so it does not reach the can_sample.
 but we set this in __kvm_vgic_sync_hwstate later on if the
> level is stil high.
still

Reviewed-by: Eric Auger 

Eric
> 
>   Move vgic_set_lr before kvm_notify_acked_irq:
> Also, harmless because the LR are cpu-local operations and
> kvm_notify_acked only affects the dist
> 
>   Move vgic_dist_irq_clear_soft_pend after kvm_notify_acked_irq:
> Also harmless because it's just a bit which is cleared and altering
> the line state does not affect this bit.
> 
> Reviewed-by: Marc Zyngier 
> Signed-off-by: Christoffer Dall 
> ---
>  virt/kvm/arm/vgic.c | 88 
> ++---
>  1 file changed, 50 insertions(+), 38 deletions(-)
> 
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 6bd1c9b..fe0e5db 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -1322,12 +1322,56 @@ epilog:
>   }
>  }
>  
> +static int process_level_irq(struct kvm_vcpu *vcpu, int lr, struct vgic_lr 
> vlr)
> +{
> + int level_pending = 0;
> +
> + vlr.state = 0;
> + vlr.hwirq = 0;
> + vgic_set_lr(vcpu, lr, vlr);
> +
> + /*
> +  * If the IRQ was EOIed (called from vgic_process_maintenance) or it
> +  * went from active to non-active (called from vgic_sync_hwirq) it was
> +  * also ACKed and we we therefore assume we can clear the soft pending
> +  * state (should it had been set) for this interrupt.
> +  *
> +  * Note: if the IRQ soft pending state was set after the IRQ was
> +  * acked, it actually shouldn't be cleared, but we have no way of
> +  * knowing that unless we start trapping ACKs when the soft-pending
> +  * state is set.
> +  */
> + vgic_dist_irq_clear_soft_pend(vcpu, vlr.irq);
> +
> + /*
> +  * Tell the gic to start sampling the line of this interrupt again.
> +  */
> + vgic_irq_clear_queued(vcpu, vlr.irq);
> +
> + /* Any additional pending interrupt? */
> + if (vgic_dist_irq_get_level(vcpu, vlr.irq)) {
> + vgic_cpu_irq_set(vcpu, vlr.irq);
> + level_pending = 1;
> + } else {
> + vgic_dist_irq_clear_pending(vcpu, vlr.irq);
> + vgic_cpu_irq_clear(vcpu, vlr.irq);
> + }
> +
> + /*
> +  * Despite being EOIed, the LR may not have
> +  * been marked as empty.
> +  */
> + vgic_sync_lr_elrsr(vcpu, lr, vlr);
> +
> + return level_pending;
> +}
> +
>  static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
>  {
>   u32 status = vgic_get_interrupt_status(vcpu);
>   struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> - bool level_pending = false;
>   struct kvm *kvm = vcpu->kvm;
> + int level_pending = 0;
>  
>   kvm_debug("STATUS = %08x\n", status);
>  
> @@ -1342,54 +1386,22 @@ static bool vgic_process_maintenance(struct kvm_vcpu 
> *vcpu)
>  
>   for_each_set_bit(lr, eisr_ptr, vgic->nr_lr) {
>   struct vgic_lr vlr = vgic_get_lr(vcpu, lr);
> - WARN_ON(vgic_irq_is_edge(vcpu, vlr.irq));
>  
> - spin_lock(&dist->lock);
> - vgic_irq_clear_queued(vcpu, vlr.irq);
> + WARN_ON(vgic_irq_is_edge(vcpu, vlr.irq));
>   WARN_ON(vlr.state & LR_STATE_MASK);
> - vlr.state = 0;
> - vgic_set_lr(vcpu, lr, vlr);
>  
> - /*
> -  * If the IRQ was EOIed it was also ACKed and we we
> -  * therefore assume we can clear the soft pending
> -  * state (should it had been set) for this interrupt.
> -  *
> -  * Note: if the IRQ soft pending state was set after
> -  * the IRQ was acked, it actually shouldn't be
> -  * cleared, but we have no way of knowing that unless
> -  * we start trapping ACKs when the soft-pending state
> -  * is set.
> -  */
> - vgic_dist_irq_clear_soft_pend(vcpu, v

Re: [PATCH v2 2/8] arm/arm64: KVM: arch_timer: Only schedule soft timer on vcpu_block

2015-09-07 Thread Eric Auger
Hi Christoffer,
On 09/04/2015 09:40 PM, Christoffer Dall wrote:
> We currently schedule a soft timer every time we exit the guest if the
> timer did not expire while running the guest.  This is really not
> necessary, because the only work we do in the timer work function is to
> kick the vcpu.
> 
> Kicking the vcpu does two things:
> (1) If the vpcu thread is on a waitqueue, make it runnable and remove it
> from the waitqueue.
> (2) If the vcpu is running on a different physical CPU from the one
> doing the kick, it sends a reschedule IPI.
> 
> The second case cannot happen, because the soft timer is only ever
> scheduled when the vcpu is not running.  The first case is only relevant
> when the vcpu thread is on a waitqueue, which is only the case when the
> vcpu thread has called kvm_vcpu_block().
> 
> Therefore, we only need to make sure a timer is scheduled for
> kvm_vcpu_block(), which we do by encapsulating all calls to
> kvm_vcpu_block() with kvm_timer_{un}schedule calls.
> 
> Additionally, we only schedule a soft timer if the timer is enabled and
> unmasked, since it is useless otherwise.
> 
> Note that theoretically userspace can use the SET_ONE_REG interface to
> change registers that should cause the timer to fire, even if the vcpu
> is blocked without a scheduled timer, but this case was not supported
> before this patch and we leave it for future work for now.
> 
> Signed-off-by: Christoffer Dall 
> ---
>  arch/arm/include/asm/kvm_host.h   |  3 --
>  arch/arm/kvm/arm.c| 10 +
>  arch/arm64/include/asm/kvm_host.h |  3 --
>  include/kvm/arm_arch_timer.h  |  2 +
>  virt/kvm/arm/arch_timer.c | 91 
> ++-
>  5 files changed, 72 insertions(+), 37 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 86fcf6e..dcba0fa 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -236,7 +236,4 @@ static inline void kvm_arm_setup_debug(struct kvm_vcpu 
> *vcpu) {}
>  static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) {}
>  
> -static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
> -static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
> -
>  #endif /* __ARM_KVM_HOST_H__ */
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index ce404a5..bdf8871 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -271,6 +271,16 @@ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
>   return kvm_timer_should_fire(vcpu);
>  }
>  
> +void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
> +{
> + kvm_timer_schedule(vcpu);
> +}
> +
> +void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
> +{
> + kvm_timer_unschedule(vcpu);
> +}
> +
>  int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
>  {
>   /* Force users to call KVM_ARM_VCPU_INIT */
> diff --git a/arch/arm64/include/asm/kvm_host.h 
> b/arch/arm64/include/asm/kvm_host.h
> index dd143f5..415938d 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -257,7 +257,4 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
>  void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
>  void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu);
>  
> -static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
> -static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
> -
>  #endif /* __ARM64_KVM_HOST_H__ */
> diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
> index e1e4d7c..ef14cc1 100644
> --- a/include/kvm/arm_arch_timer.h
> +++ b/include/kvm/arm_arch_timer.h
> @@ -71,5 +71,7 @@ u64 kvm_arm_timer_get_reg(struct kvm_vcpu *, u64 regid);
>  int kvm_arm_timer_set_reg(struct kvm_vcpu *, u64 regid, u64 value);
>  
>  bool kvm_timer_should_fire(struct kvm_vcpu *vcpu);
> +void kvm_timer_schedule(struct kvm_vcpu *vcpu);
> +void kvm_timer_unschedule(struct kvm_vcpu *vcpu);
>  
>  #endif
> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> index 48c6e1a..7991537 100644
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -111,14 +111,21 @@ static enum hrtimer_restart kvm_timer_expire(struct 
> hrtimer *hrt)
>   return HRTIMER_NORESTART;
>  }
>  
> +static bool kvm_timer_irq_can_fire(struct kvm_vcpu *vcpu)
> +{
> + struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
> +
> + return !(timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) &&
> + (timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE) &&
> + !kvm_vgic_get_phys_irq_active(timer->map);
kvm_vgic_get_phys_irq_active(timer->map) checks a logical state and not
the actual HW state. What is the exact aim of that check? in case the
PPI already is active, ie. timer hit, no use to schedule anything?

> +}
> +
>  bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
>  {
>   struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
>   

Re: [Qemu-devel] [kvm-unit-tests PATCH 2/2] arm/arm64 config: Fix arch_clean rule

2015-09-07 Thread Alexander Spyridakis
On 7 September 2015 at 16:37, Andrew Jones  wrote:
> +SHELL := /bin/bash

This indeed fixed the issue.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[kvm-unit-tests PATCH] makefiles: use bash

2015-09-07 Thread Andrew Jones
Use bash in the makefiles, like we do in the scripts.

Signed-off-by: Andrew Jones 
---
 Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Makefile b/Makefile
index 0d5933474cd8c..3e60b4f8e4a57 100644
--- a/Makefile
+++ b/Makefile
@@ -1,4 +1,6 @@
 
+SHELL := /bin/bash
+
 ifeq ($(wildcard config.mak),)
 $(error run ./configure first. See ./configure -h)
 endif
-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] arm/arm64: KVM: vgic: Move active state handling to flush_hwstate

2015-09-07 Thread Eric Auger
Hi,
On 09/04/2015 04:24 PM, Christoffer Dall wrote:
> We currently set the physical active state only when we *inject* a new
> pending virtual interrupt, but this is actually not correct, because we
> could have been preempted and run something else on the system that
> resets the active state to clear.  This causes us to run the VM with the
> timer set to fire, but without setting the physical active state.
> 
> The solution is to always check the LR configurations, and we if have a
> mapped interrupt in the LR in either the pending or active state
> (virtual), then set the physical active state.
> 
> Acked-by: Marc Zyngier 
> Signed-off-by: Christoffer Dall 
> ---
>  virt/kvm/arm/vgic.c | 42 ++
>  1 file changed, 26 insertions(+), 16 deletions(-)
> 
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 9eb489a..6bd1c9b 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -1144,26 +1144,11 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu 
> *vcpu, int irq,
>   struct irq_phys_map *map;
>   map = vgic_irq_map_search(vcpu, irq);
>  
> - /*
> -  * If we have a mapping, and the virtual interrupt is
> -  * being injected, then we must set the state to
> -  * active in the physical world. Otherwise the
> -  * physical interrupt will fire and the guest will
> -  * exit before processing the virtual interrupt.
> -  */
>   if (map) {
> - int ret;
> -
> - BUG_ON(!map->active);
I have a question about this map->active. I did not find any user of
kvm_vgic_set_phys_irq_active anymore. The flag is directly updated in
vgic_sync_hwirq through the irq_get_irqchip_state query. In the same
function, before there is a "BUG_ON(!map || !map->active);"

Can't we have this BUG_ON firing?


>   vlr.hwirq = map->phys_irq;
>   vlr.state |= LR_HW;
>   vlr.state &= ~LR_EOI_INT;
>  
> - ret = irq_set_irqchip_state(map->irq,
> - IRQCHIP_STATE_ACTIVE,
> - true);
> - WARN_ON(ret);
> -
>   /*
>* Make sure we're not going to sample this
>* again, as a HW-backed interrupt cannot be
> @@ -1255,7 +1240,7 @@ static void __kvm_vgic_flush_hwstate(struct kvm_vcpu 
> *vcpu)
>   struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>   struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>   unsigned long *pa_percpu, *pa_shared;
> - int i, vcpu_id;
> + int i, vcpu_id, lr, ret;
>   int overflow = 0;
>   int nr_shared = vgic_nr_shared_irqs(dist);
>  
> @@ -1310,6 +1295,31 @@ epilog:
>*/
>   clear_bit(vcpu_id, dist->irq_pending_on_cpu);
>   }
> +
> + for (lr = 0; lr < vgic->nr_lr; lr++) {
> + struct vgic_lr vlr;
> +
> + if (!test_bit(lr, vgic_cpu->lr_used))
> + continue;
> +
> + vlr = vgic_get_lr(vcpu, lr);
> +
> + /*
> +  * If we have a mapping, and the virtual interrupt is
> +  * presented to the guest (as pending or active), then we must
> +  * set the state to active in the physical world. See
> +  * Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt.
if upstreamed in 4.3 whereas the other series is not there,
vgic-mapped-irqs.txt won't be available.
> +  */
> + if (vlr.state & LR_HW) {
> + struct irq_phys_map *map;
> + map = vgic_irq_map_search(vcpu, vlr.irq);
> +
> + ret = irq_set_irqchip_state(map->irq,
> + IRQCHIP_STATE_ACTIVE,
> + true);
I understand the need for manually setting the phys dist state in case
of timer however for non shared IRQs, GIC does the job directly. But I
guess it does not harm.

Eric
> + WARN_ON(ret);
> + }
> + }
>  }
>  
>  static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
> 

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [kvm-unit-tests PATCH 2/2] arm/arm64 config: Fix arch_clean rule

2015-09-07 Thread Andrew Jones
On Mon, Sep 07, 2015 at 03:35:19PM +0200, Alexander Spyridakis wrote:
> On 4 September 2015 at 16:05, Andrew Jones  wrote:
> > This doesn't reproduce for me. I did the following, and it worked
> > fine.
> >
> > make distclean
> > ./configure --arch=arm --cross-prefix=arm-linux-gnu-
> > make
> > ./configure --arch=arm64 --cross-prefix=aarch64-linux-gnu-
> > make clean && make
> 
> Ok I think I found the issue:
> 
> config-arm-common.mak:
> >arm_clean: libfdt_clean asm_offsets_clean
> >$(RM) $(TEST_DIR)/*.{o,flat,elf} $(libeabi) $(eabiobjs) \
> >  $(TEST_DIR)/.*.d lib/arm/.*.d
> 
> config-x86-common.mak:
> >arch_clean:
> >$(RM) $(TEST_DIR)/*.o $(TEST_DIR)/*.flat $(TEST_DIR)/*.elf \
> >$(TEST_DIR)/.*.d lib/x86/.*.d
> 
> I think the arm case tries to be too clever and on many systems it
> fails (tested on debian:jessie,sid and ubuntu:14.04,15.04). Basically
> the expression for the arm case fails to resolve, while using the
> simpler x86 way works as expected.
> 
> So is the following change acceptable in config-arm-common.mak?
> > arm_clean: libfdt_clean asm_offsets_clean
> >-   $(RM) $(TEST_DIR)/*.{o,flat,elf} $(libeabi) $(eabiobjs) \
> >- $(TEST_DIR)/.*.d lib/arm/.*.d
> >+   $(RM) $(TEST_DIR)/*.o $(TEST_DIR)/*.flat $(TEST_DIR)/*.elf \
> >+   $(libeabi) $(eabiobjs) $(TEST_DIR)/.*.d lib/arm/.*.d

Ah, it's a dash vs. bash thing. Either we need to change all bashisms
in the makefiles, or, since kvm-unit-tests already depends on bash for
its scripts, then we might as well just tell make to use it too.  This
patch will fix it

diff --git a/Makefile b/Makefile
index 0d5933474cd8c..3e60b4f8e4a57 100644
--- a/Makefile
+++ b/Makefile
@@ -1,4 +1,6 @@
 
+SHELL := /bin/bash
+
 ifeq ($(wildcard config.mak),)
 $(error run ./configure first. See ./configure -h)
 endif

I'll probably submit the patch in a second.

Thanks for hunting down the problem!

drew
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-ppc] KVM memory slots limit on powerpc

2015-09-07 Thread Igor Mammedov
On Fri, 4 Sep 2015 12:04:41 +0200
Alexander Graf  wrote:

> 
> 
> On 04.09.15 11:59, Christian Borntraeger wrote:
> > Am 04.09.2015 um 11:35 schrieb Thomas Huth:
> >>
> >>  Hi all,
> >>
> >> now that we get memory hotplugging for the spapr machine on qemu-ppc,
> >> too, it seems like we easily can hit the amount of KVM-internal memory
> >> slots now ("#define KVM_USER_MEM_SLOTS 32" in
> >> arch/powerpc/include/asm/kvm_host.h). For example, start
> >> qemu-system-ppc64 with a couple of "-device secondary-vga" and "-m
> >> 4G,slots=32,maxmem=40G" and then try to hot-plug all 32 DIMMs ... and
> >> you'll see that it aborts way earlier already.
> >>
> >> The x86 code already increased the amount of KVM_USER_MEM_SLOTS to 509
> >> already (+3 internal slots = 512) ... maybe we should now increase the
> >> amount of slots on powerpc, too? Since we don't use internal slots on
> >> POWER, would 512 be a good value? Or would less be sufficient, too?
> > 
> > When you are at it, the s390 value should also be increased I guess.
> 
> That constant defines the array size for the memslot array in struct kvm
> which in turn again gets allocated by kzalloc, so it's pinned kernel
> memory that is physically contiguous. Doing big allocations can turn
> into problems during runtime.
> 
> So maybe there is another way? Can we extend the memslot array size
> dynamically somehow? Allocate it separately? How much memory does the
> memslot array use up with 512 entries?

KVM switched memslots allocation to kvm_kvzalloc(), so it would fallback to 
vmalloc
 commit 744961341d472db6272ed9b42319a90f5a2aa7c4
 kvm: avoid page allocation failure in kvm_set_memory_region()

> 
> Alex
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH v2 08/18] nvdimm: init backend memory mapping and config data area

2015-09-07 Thread Igor Mammedov
On Fri, 14 Aug 2015 22:52:01 +0800
Xiao Guangrong  wrote:

> The parameter @file is used as backed memory for NVDIMM which is
> divided into two parts if @dataconfig is true:
> - first parts is (0, size - 128K], which is used as PMEM (Persistent
>   Memory)
> - 128K at the end of the file, which is used as Config Data Area, it's
>   used to store Label namespace data
> 
> The @file supports both regular file and block device, of course we
> can assign any these two kinds of files for test and emulation, however,
> in the real word for performance reason, we usually used these files as
> NVDIMM backed file:
> - the regular file in the filesystem with DAX enabled created on NVDIMM
>   device on host
> - the raw PMEM device on host, e,g /dev/pmem0

A lot of code in this series could reuse what QEMU already
uses for implementing pc-dimm devices.

here is common concepts that could be reused.
  - on physical system both DIMM and NVDIMM devices use
the same slots. We could share QEMU's '-m slots' option between
both devices. An alternative to not sharing would be to introduce
'-machine nvdimm_slots' option.
And yes, we need to know number of NVDIMMs to describe
them all in ACPI table (taking in amount future hotplug
include in this possible NVDIMM devices)
I'd go the same way as on real hardware on make them share the same slots.
  - they share the same physical address space and limits
on how much memory system can handle. So I'd suggest sharing existing
'-m maxmem' option and reuse hotplug_memory address space.

Essentially what I'm suggesting is to inherit NVDIMM's implementation
from pc-dimm reusing all of its code/backends and
just override parts that do memory mapping into guest's address space to
accommodate NVDIMM's requirements.

> 
> Signed-off-by: Xiao Guangrong 
> ---
>  hw/mem/nvdimm/pc-nvdimm.c  | 109 
> -
>  include/hw/mem/pc-nvdimm.h |   7 +++
>  2 files changed, 115 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/mem/nvdimm/pc-nvdimm.c b/hw/mem/nvdimm/pc-nvdimm.c
> index 7a270a8..97710d1 100644
> --- a/hw/mem/nvdimm/pc-nvdimm.c
> +++ b/hw/mem/nvdimm/pc-nvdimm.c
> @@ -22,12 +22,20 @@
>   * License along with this library; if not, see 
> 
>   */
>  
> +#include 
> +#include 
> +#include 
> +
> +#include "exec/address-spaces.h"
>  #include "hw/mem/pc-nvdimm.h"
>  
> -#define PAGE_SIZE  (1UL << 12)
> +#define PAGE_SIZE   (1UL << 12)
> +
> +#define MIN_CONFIG_DATA_SIZE(128 << 10)
>  
>  static struct nvdimms_info {
>  ram_addr_t current_addr;
> +int device_index;
>  } nvdimms_info;
>  
>  /* the address range [offset, ~0ULL) is reserved for NVDIMM. */
> @@ -37,6 +45,26 @@ void pc_nvdimm_reserve_range(ram_addr_t offset)
>  nvdimms_info.current_addr = offset;
>  }
>  
> +static ram_addr_t reserved_range_push(uint64_t size)
> +{
> +uint64_t current;
> +
> +current = ROUND_UP(nvdimms_info.current_addr, PAGE_SIZE);
> +
> +/* do not have enough space? */
> +if (current + size < current) {
> +return 0;
> +}
> +
> +nvdimms_info.current_addr = current + size;
> +return current;
> +}
You can't use all memory above hotplug_memory area since
we have to tell guest where 64-bit PCI window starts,
and currently it should start at reserved-memory-end
(but it isn't due to a bug: I've just posted fix to qemu-devel
 "[PATCH 0/2] pc: fix 64-bit PCI window clashing with memory hotplug region"
)

> +
> +static uint32_t new_device_index(void)
> +{
> +return nvdimms_info.device_index++;
> +}
> +
>  static char *get_file(Object *obj, Error **errp)
>  {
>  PCNVDIMMDevice *nvdimm = PC_NVDIMM(obj);
> @@ -48,6 +76,11 @@ static void set_file(Object *obj, const char *str, Error 
> **errp)
>  {
>  PCNVDIMMDevice *nvdimm = PC_NVDIMM(obj);
>  
> +if (memory_region_size(&nvdimm->mr)) {
> +error_setg(errp, "cannot change property value");
> +return;
> +}
> +
>  if (nvdimm->file) {
>  g_free(nvdimm->file);
>  }
> @@ -76,13 +109,87 @@ static void pc_nvdimm_init(Object *obj)
>   set_configdata, NULL);
>  }
>  
> +static uint64_t get_file_size(int fd)
> +{
> +struct stat stat_buf;
> +uint64_t size;
> +
> +if (fstat(fd, &stat_buf) < 0) {
> +return 0;
> +}
> +
> +if (S_ISREG(stat_buf.st_mode)) {
> +return stat_buf.st_size;
> +}
> +
> +if (S_ISBLK(stat_buf.st_mode) && !ioctl(fd, BLKGETSIZE64, &size)) {
> +return size;
> +}
> +
> +return 0;
> +}
All this file stuff I'd leave to already existing backends like
memory-backend-file or even memory-backend-ram which already do
above and more allowing to configure persistent and volatile
NVDIMMs without changing NVDIMM fronted code.

> +
>  static void pc_nvdimm_realize(DeviceState *dev, Error **errp)
>  {
>  PCNVDIMMDevice *nvdimm = PC_NVDIMM(dev);
> +char name[512];
> +voi

Re: [Qemu-devel] [PATCH v2 06/18] pc: implement NVDIMM device abstract

2015-09-07 Thread Igor Mammedov
On Sun, 6 Sep 2015 14:07:21 +0800
Xiao Guangrong  wrote:

> 
> 
> On 09/02/2015 07:31 PM, Igor Mammedov wrote:
> > On Wed, 2 Sep 2015 18:36:43 +0800
> > Xiao Guangrong  wrote:
> >
> >>
> >>
> >> On 09/02/2015 05:58 PM, Igor Mammedov wrote:
> >>> On Fri, 14 Aug 2015 22:51:59 +0800
> >>> Xiao Guangrong  wrote:
> >>>
>  Introduce "pc-nvdimm" device and it has two parameters:
> >>> Why do you use prefix "pc-", I suppose we potentially
> >>> could use this device not only with x86 targets but with
> >>> other targets as well.
> >>> I'd just drop 'pc' prefix through out patchset.
> >>
> >> Yeah, the prefix is stolen from pc-dimm, will drop this
> >> prefix as your suggestion.
> >>
> >>>
>  - @file, which is the backed memory file for NVDIMM device
> >>> Could you try to split device into backend/frontend parts,
> >>> like it's done with pc-dimm. As I understand it's preferred
> >>> way to implement this kind of devices.
> >>> Then you could reuse memory backends that we already have
> >>> including file backend.
> >>
> >> I considered it too and Stefan, Paolo got the some idea in
> >> V1's review, however:
> >>
> >> | However, file-based memory used by NVDIMM is special, it divides the file
> >> | to two parts, one part is used as PMEM and another part is used to store
> >> | NVDIMM's configure data.
> >> |
> >> | Maybe we can introduce "end-reserved" property to reserve specified size
> >> | at the end of the file. Or create a new class type based on
> >> | memory-backend-file (named nvdimm-backend-file) class to hide this magic
> >> | thing?
> > I'd go with separate backend/frontend idea.
> >
> > Question is if this config area is part backend or frontend?
> 
> Configdata area is used to store nvdimm device's configuration, normally, it's
> namespace info.
> 
> Currently, we chosen configdata located at the end of nvdimm's backend-memory
> as it's easy to configure / use and configdata is naturally non-volatile and 
> it
> is like the layout on physical device.
> 
> However, using two separated backed-memory is okay, for example:
> -object memory-backend-file,id=mem0,file=/storage/foo
> -object memory-backend-file,id=mem1,file=/storage/bar
> -device nvdimm,memdev=mem0,configdata=mem1
> then configdata is written to a single backend.
> 
> Which one is better for you? :)
> 
> > If we pass-through NVDIMM device do we need to set configdata=true
> > and QEMU would skip building config structures and use structures
> > that are already present on passed-through device in that place?
> >
> 
> The file specified by @file is something like a normal disk, like /dev/sda/,
> host process can use whole space on it. If we want to directly pass it to 
> guest,
> we can specify 'configdata=false'. If we allow guest to 'partition' (create
> namespace on) it then we use 'configdata=true' to reserve some space to store
> its partition info (namesapce info).
As far as I understand currently linux provides to userspace only one interface
which is block device i.e. /dev/sdX and on top of it userspace can put
PM/DAX aware filesystem and use files from it. In either cases kernel
just provides access to separate namespaces and not to a whole NVDIMM which
includes 'labels area'. Hence /dev/sdX is not passed-though NVDIMM,
so we could consider it as just a file/storage that could be used by userspace.

Lets assume that NVDIMM should always have 'labels area'.
In that case I'd always reserve space for it and
 * format it (build a new one) if backend doesn't have a
   valid labels area dropping configdata parameter along the way
 * or if backing-file already has valid labels area I'd just use it.

If you need to make labels area readonly you can introduce 
'NVDIMM.readonly_labels'
option and just use labels backend's without allowing changes writeback.
IT would be better to make it another series on top of basic NVDIMM 
implementation
if there is an actual usecase for it.

PS:
Also when you write commit messages, comment and name variables try to use 
terms from
relevant spec and mention specs where you describe data structures from them.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [kvm-unit-tests PATCH 2/2] arm/arm64 config: Fix arch_clean rule

2015-09-07 Thread Alexander Spyridakis
On 4 September 2015 at 16:05, Andrew Jones  wrote:
> This doesn't reproduce for me. I did the following, and it worked
> fine.
>
> make distclean
> ./configure --arch=arm --cross-prefix=arm-linux-gnu-
> make
> ./configure --arch=arm64 --cross-prefix=aarch64-linux-gnu-
> make clean && make

Ok I think I found the issue:

config-arm-common.mak:
>arm_clean: libfdt_clean asm_offsets_clean
>$(RM) $(TEST_DIR)/*.{o,flat,elf} $(libeabi) $(eabiobjs) \
>  $(TEST_DIR)/.*.d lib/arm/.*.d

config-x86-common.mak:
>arch_clean:
>$(RM) $(TEST_DIR)/*.o $(TEST_DIR)/*.flat $(TEST_DIR)/*.elf \
>$(TEST_DIR)/.*.d lib/x86/.*.d

I think the arm case tries to be too clever and on many systems it
fails (tested on debian:jessie,sid and ubuntu:14.04,15.04). Basically
the expression for the arm case fails to resolve, while using the
simpler x86 way works as expected.

So is the following change acceptable in config-arm-common.mak?
> arm_clean: libfdt_clean asm_offsets_clean
>-   $(RM) $(TEST_DIR)/*.{o,flat,elf} $(libeabi) $(eabiobjs) \
>- $(TEST_DIR)/.*.d lib/arm/.*.d
>+   $(RM) $(TEST_DIR)/*.o $(TEST_DIR)/*.flat $(TEST_DIR)/*.elf \
>+   $(libeabi) $(eabiobjs) $(TEST_DIR)/.*.d lib/arm/.*.d

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/5] KVM: optimize userspace exits with a new ioctl

2015-09-07 Thread Paolo Bonzini


On 02/09/2015 12:31, Christian Borntraeger wrote:
> As far as I can see this should also work for s390 (when we implement 
> REQ_EXIT handling)
> 
> To double check my understanding: these improvements come with a changed
> userspace that does not use KVM_SET_SIGNAL_MASK and therefore this
> conditional is false
>   if (vcpu->sigset_active)
>   sigprocmask(SIG_SETMASK, &vcpu->sigset, &sigsaved);
> Correct?
> 
> That also means we improve exits that go to userspace but lightweight exits
> that stay inside the kernel are not affected. 

Yes, both observations are correct.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 答复: I'm now looking into kvm-unit-tests and encounted with some problems.

2015-09-07 Thread Paolo Bonzini


On 02/09/2015 05:33, Guoyanjuan wrote:
> Hi,  I found my code is old and I git the latest code from
> https://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git,
> some problems are solved but one.
> 
> when I run emulate unittest, it failed.
> 
> command:
> qemu-kvm --enable-kvm -device pc-testdev -device
> isa-debug-exit,iobase=0xf4,iosize=0x4 -serial stdio -device
> pci-testdev -kernel x86/emulator.flat -vnc none
> 
> logs:
> FAIL: mov null, %ss

Lucas Meneghel Rodrigues also reproduced this, it seems to be processor
dependent.  I haven't debugged it yet because it doesn't reproduce on
the two systems I've tested on (Ivy Bridge i7 and Haswell Xeon E5).

Can you please also try loading the kvm-intel module with the
"unrestricted_guest=0" parameter, and see if it also reproduce?

It might be a processor bug too.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/5] KVM: nVMX: VMWRITE emulation: remove unnecessary check for compatibility mode

2015-09-07 Thread Paolo Bonzini


On 20/08/2015 21:38, Eugene Korenevsky wrote:
> VMWRITE instruction is not valid in compatibility mode. This is
> checked by nested_vmx_check_permission() function which throws #UD if CS.L=0.
> The additional check in is_64_bit_mode() for CS.L=0 is useless.

This is true, and it matches what handle_vmread does, on the other hand
is_long_mode is generally used to test page table type.

I think it's clearer if you change handle_vmread to use is_64_bit_mode
instead.

Paolo

> We should check only EFER.LMA=1 which is done by is_long_mode().
> 
> Signed-off-by: Eugene Korenevsky 
> ---
>  arch/x86/kvm/vmx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index f39e24f..12bdaae 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -7034,7 +7034,7 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu)
>   field_value = kvm_register_readl(vcpu,
>   (((vmx_instruction_info) >> 3) & 0xf));
>   else {
> - mem_op_size = is_64_bit_mode(vcpu) ? 8 : 4;
> + mem_op_size = is_long_mode(vcpu) ? 8 : 4;
>   if (get_vmx_mem_address(vcpu, exit_qualification,
>   vmx_instruction_info, false, mem_op_size, &gva))
>   return 1;
> -- 2.1.4
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/5] KVM: nVMX: fix limit checking: memory operand size varies for different VMX instructions

2015-09-07 Thread Paolo Bonzini


On 20/08/2015 21:37, Eugene Korenevsky wrote:
> When checking limits for VMX opcodes in protected mode, different sizes of
> memory operands must be taken into account.
> For VMREAD and VMWRITE instructions, memory operand size is 32 or 64 bits
> depending on CPU mode. For VMON, VMCLEAR, VMPTRST, VMPTRLD instructions,
> memory operand size is 64 bits. For INVEPT instruction, memory operand size
> is 128 bits.
> 
> Signed-off-by: Eugene Korenevsky 

Looks good, but please provide unit tests in kvm-unit-tests too.

Paolo

> ---
>  arch/x86/kvm/vmx.c | 21 +
>  1 file changed, 13 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 4a4d677..f39e24f 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -6399,7 +6399,8 @@ static int vmx_protmode_seg_check(struct kvm_vcpu *vcpu,
>   */
>  static int get_vmx_mem_address(struct kvm_vcpu *vcpu,
>unsigned long exit_qualification,
> -  u32 vmx_instruction_info, bool wr, gva_t *ret)
> +  u32 vmx_instruction_info,
> +  bool wr, int mem_op_size, gva_t *ret)
>  {
>   gva_t off;
>   struct kvm_segment s;
> @@ -6466,7 +6467,7 @@ static int nested_vmx_check_vmptr(struct kvm_vcpu 
> *vcpu, int exit_reason,
>   int maxphyaddr = cpuid_maxphyaddr(vcpu);
>  
>   if (get_vmx_mem_address(vcpu, vmcs_readl(EXIT_QUALIFICATION),
> - vmcs_read32(VMX_INSTRUCTION_INFO), false, &gva))
> + vmcs_read32(VMX_INSTRUCTION_INFO), false, 8, &gva))
>   return 1;
>  
>   if (kvm_read_guest_virt(&vcpu->arch.emulate_ctxt, gva, &vmptr,
> @@ -6971,6 +6972,7 @@ static int handle_vmread(struct kvm_vcpu *vcpu)
>   unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
>   u32 vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO);
>   gva_t gva = 0;
> + int mem_op_size;
>  
>   if (!nested_vmx_check_permission(vcpu) ||
>   !nested_vmx_check_vmcs12(vcpu))
> @@ -6993,12 +6995,13 @@ static int handle_vmread(struct kvm_vcpu *vcpu)
>   kvm_register_writel(vcpu, (((vmx_instruction_info) >> 3) & 0xf),
>   field_value);
>   } else {
> + mem_op_size = is_long_mode(vcpu) ? 8 : 4;
>   if (get_vmx_mem_address(vcpu, exit_qualification,
> - vmx_instruction_info, true, &gva))
> + vmx_instruction_info, true, mem_op_size, &gva))
>   return 1;
>   /* _system ok, as nested_vmx_check_permission verified cpl=0 */
>   kvm_write_guest_virt_system(&vcpu->arch.emulate_ctxt, gva,
> -  &field_value, (is_long_mode(vcpu) ? 8 : 4), NULL);
> +  &field_value, mem_op_size, NULL);
>   }
>  
>   nested_vmx_succeed(vcpu);
> @@ -7021,6 +7024,7 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu)
>*/
>   u64 field_value = 0;
>   struct x86_exception e;
> + int mem_op_size;
>  
>   if (!nested_vmx_check_permission(vcpu) ||
>   !nested_vmx_check_vmcs12(vcpu))
> @@ -7030,11 +7034,12 @@ static int handle_vmwrite(struct kvm_vcpu *vcpu)
>   field_value = kvm_register_readl(vcpu,
>   (((vmx_instruction_info) >> 3) & 0xf));
>   else {
> + mem_op_size = is_64_bit_mode(vcpu) ? 8 : 4;
>   if (get_vmx_mem_address(vcpu, exit_qualification,
> - vmx_instruction_info, false, &gva))
> + vmx_instruction_info, false, mem_op_size, &gva))
>   return 1;
>   if (kvm_read_guest_virt(&vcpu->arch.emulate_ctxt, gva,
> -&field_value, (is_64_bit_mode(vcpu) ? 8 : 4), &e)) {
> +&field_value, mem_op_size, &e)) {
>   kvm_inject_page_fault(vcpu, &e);
>   return 1;
>   }
> @@ -7123,7 +7128,7 @@ static int handle_vmptrst(struct kvm_vcpu *vcpu)
>   return 1;
>  
>   if (get_vmx_mem_address(vcpu, exit_qualification,
> - vmx_instruction_info, true, &vmcs_gva))
> + vmx_instruction_info, true, 8, &vmcs_gva))
>   return 1;
>   /* ok to use *_system, as nested_vmx_check_permission verified cpl=0 */
>   if (kvm_write_guest_virt_system(&vcpu->arch.emulate_ctxt, vmcs_gva,
> @@ -7179,7 +7184,7 @@ static int handle_invept(struct kvm_vcpu *vcpu)
>* operand is read even if it isn't needed (e.g., for type==global)
>*/
>   if (get_vmx_mem_address(vcpu, vmcs_readl(EXIT_QUALIFICATION),
> - vmx_instruction_info, false, &gva))
> + vmx_instruction_info, false, 16, &gva))
>   return 1;
>   if (kvm_read_guest_virt(&vcpu->arch.emulate_ctxt, gva, &operand,
>  

Re: [PATCH 3/5] KVM: nVMX: add limit check for expand-down segments

2015-09-07 Thread Paolo Bonzini


On 20/08/2015 21:37, Eugene Korenevsky wrote:
> Add limit checking for expand-down data segments. For such segments, the
> effective limit specifies the last address that is not allowed to be accessed
> within the segment. I.e. offset <= limit means means limit exceeding.
> 
> Signed-off-by: Eugene Korenevsky 
> ---
>  arch/x86/kvm/vmx.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index faa05a4..4a4d677 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -6377,7 +6377,10 @@ static int vmx_protmode_seg_check(struct kvm_vcpu 
> *vcpu,
>   /* #GP(0)/#SS(0) if the segment is unusable. */
>   exn = (s->unusable != 0);
>   /* #GP(0)/#SS(0) if the memory operand is outside the segment limit. */
> - exn = exn || (off + mem_op_size - 1 > s->limit);
> + if (!(s->type & 8) && (s->type & 4)) /* expand-down segment? */
> + exn = exn || (off <= s->limit);
> + else
> + exn = exn || (off + mem_op_size - 1 > s->limit);

For expand-down segments you also have to check against the size of the
segment (48 bits for 64-bit segments and positive addresses, 32 bits for
32-bit segments, 16 bits for 16-bit segments; see __linearize in emulate.c).

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/5] KVM: nVMX: refactor segment checks, make the code more clean and straightforward

2015-09-07 Thread Paolo Bonzini


On 20/08/2015 21:36, Eugene Korenevsky wrote:
> Prepare for subsequent changes. Extract calls for segment checking in 
> protected
> and 64-bit mode. This should be done to avoid overbloating of
> get_vmx_mem_address() function, even if kvm_queue_exception_e() is called
> twice.

The idea behind the patch is okay, but all VMX root-mode instructions
require CR0.PE = 1, so vmx_protmode_seg_check must always be run.
Please adjust nested_vmx_check_permission to check CR0.PE as well.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/9] KVM: VMX: simplify invpcid handling in vmx_cpuid_update()

2015-09-07 Thread Paolo Bonzini


On 21/08/2015 06:50, Xiao Guangrong wrote:
> + if (vmx_invpcid_supported() && (!best ||

Please start the "(" subexpression on a new line.

Paolo

> + !(best->ebx & bit(X86_FEATURE_INVPCID)) ||
> + !guest_cpuid_has_pcid(vcpu))) {
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/9] KVM: VMX: introduce set_clear_2nd_exec_ctrl()

2015-09-07 Thread Paolo Bonzini


On 21/08/2015 06:50, Xiao Guangrong wrote:
>  
> +static void set_clear_2nd_exec_ctrl(u32 ctrls, bool set)
> +{
> + u32 exec_ctrl = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
> +
> + if (set)
> + exec_ctrl |= ctrls;
> + else
> + exec_ctrl &= ~ctrls;
> +
> + vmcs_write32(SECONDARY_VM_EXEC_CONTROL, exec_ctrl);
> +}

The second argument is always true.  Do you have any plans for it?

Should we instead add functions like vmcs_or32 and vmcs_clear32?

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 6/8] arm/arm64: KVM: Add forwarded physical interrupts documentation

2015-09-07 Thread Andre Przywara
Hi,

firstly: this text is really great, thanks for coming up with that.
See below for some information I got from tracing the host which I
cannot make sense of


On 04/09/15 20:40, Christoffer Dall wrote:
> Forwarded physical interrupts on arm/arm64 is a tricky concept and the
> way we deal with them is not apparently easy to understand by reading
> various specs.
> 
> Therefore, add a proper documentation file explaining the flow and
> rationale of the behavior of the vgic.
> 
> Some of this text was contributed by Marc Zyngier and edited by me.
> Omissions and errors are all mine.
> 
> Signed-off-by: Christoffer Dall 
> ---
>  Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt | 181 
> +
>  1 file changed, 181 insertions(+)
>  create mode 100644 Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
> 
> diff --git a/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt 
> b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
> new file mode 100644
> index 000..24b6f28
> --- /dev/null
> +++ b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
> @@ -0,0 +1,181 @@
> +KVM/ARM VGIC Forwarded Physical Interrupts
> +==
> +
> +The KVM/ARM code implements software support for the ARM Generic
> +Interrupt Controller's (GIC's) hardware support for virtualization by
> +allowing software to inject virtual interrupts to a VM, which the guest
> +OS sees as regular interrupts.  The code is famously known as the VGIC.
> +
> +Some of these virtual interrupts, however, correspond to physical
> +interrupts from real physical devices.  One example could be the
> +architected timer, which itself supports virtualization, and therefore
> +lets a guest OS program the hardware device directly to raise an
> +interrupt at some point in time.  When such an interrupt is raised, the
> +host OS initially handles the interrupt and must somehow signal this
> +event as a virtual interrupt to the guest.  Another example could be a
> +passthrough device, where the physical interrupts are initially handled
> +by the host, but the device driver for the device lives in the guest OS
> +and KVM must therefore somehow inject a virtual interrupt on behalf of
> +the physical one to the guest OS.
> +
> +These virtual interrupts corresponding to a physical interrupt on the
> +host are called forwarded physical interrupts, but are also sometimes
> +referred to as 'virtualized physical interrupts' and 'mapped interrupts'.
> +
> +Forwarded physical interrupts are handled slightly differently compared
> +to virtual interrupts generated purely by a software emulated device.
> +
> +
> +The HW bit
> +--
> +Virtual interrupts are signalled to the guest by programming the List
> +Registers (LRs) on the GIC before running a VCPU.  The LR is programmed
> +with the virtual IRQ number and the state of the interrupt (Pending,
> +Active, or Pending+Active).  When the guest ACKs and EOIs a virtual
> +interrupt, the LR state moves from Pending to Active, and finally to
> +inactive.
> +
> +The LRs include an extra bit, called the HW bit.  When this bit is set,
> +KVM must also program an additional field in the LR, the physical IRQ
> +number, to link the virtual with the physical IRQ.
> +
> +When the HW bit is set, KVM must EITHER set the Pending OR the Active
> +bit, never both at the same time.
> +
> +Setting the HW bit causes the hardware to deactivate the physical
> +interrupt on the physical distributor when the guest deactivates the
> +corresponding virtual interrupt.
> +
> +
> +Forwarded Physical Interrupts Life Cycle
> +
> +
> +The state of forwarded physical interrupts is managed in the following way:
> +
> +  - The physical interrupt is acked by the host, and becomes active on
> +the physical distributor (*).
> +  - KVM sets the LR.Pending bit, because this is the only way the GICV
> +interface is going to present it to the guest.
> +  - LR.Pending will stay set as long as the guest has not acked the 
> interrupt.
> +  - LR.Pending transitions to LR.Active on the guest read of the IAR, as
> +expected.
> +  - On guest EOI, the *physical distributor* active bit gets cleared,
> +but the LR.Active is left untouched (set).

I tried hard in the last week, but couldn't confirm this. Tracing shows
the following pattern over and over (case 1):
(This is the kvm/kvm.git:queue branch from last week, so including the
mapped timer IRQ code. Tests were done on Juno and Midway)

...
229.340171: kvm_exit: TRAP: HSR_EC: 0x0001 (WFx), PC: 0xffc98a64
229.340324: kvm_exit: IRQ: HSR_EC: 0x0001 (WFx), PC: 0xffc0001c63a0
229.340428: kvm_exit: TRAP: HSR_EC: 0x0024 (DABT_LOW), PC:
0xffc0004089d8
229.340430: kvm_vgic_sync_hwstate: LR0 vIRQ: 27, HWIRQ: 27, LR.state: 8,
ELRSR: 1, dist active: 0, log. active: 1


My hunch is that the following happens (please correct me if needed!):
First there is an unrelated trap (line 1), then later the gues

Re: [PATCH 3/9] KVM: x86: add pcommit support

2015-09-07 Thread Paolo Bonzini


On 21/08/2015 06:50, Xiao Guangrong wrote:
> Pass PCOMMIT CPU feature to guest to enable PCOMMIT instruction
> 
> Currently we do not catch pcommit instruction for L1 guest and
> allow L1 to catch this instruction for L2
> 
> The specification locates at:
> https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
> 
> Signed-off-by: Xiao Guangrong 
> ---
>  arch/x86/include/asm/vmx.h  |  2 +-
>  arch/x86/include/uapi/asm/vmx.h |  4 +++-
>  arch/x86/kvm/cpuid.c|  2 +-
>  arch/x86/kvm/cpuid.h|  8 
>  arch/x86/kvm/vmx.c  | 29 -
>  5 files changed, 37 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
> index 9299ae5..e2ad08a 100644
> --- a/arch/x86/include/asm/vmx.h
> +++ b/arch/x86/include/asm/vmx.h
> @@ -72,7 +72,7 @@
>  #define SECONDARY_EXEC_SHADOW_VMCS  0x4000
>  #define SECONDARY_EXEC_ENABLE_PML   0x0002
>  #define SECONDARY_EXEC_XSAVES0x0010
> -
> +#define SECONDARY_EXEC_PCOMMIT   0x0020
>  
>  #define PIN_BASED_EXT_INTR_MASK 0x0001
>  #define PIN_BASED_NMI_EXITING   0x0008
> diff --git a/arch/x86/include/uapi/asm/vmx.h b/arch/x86/include/uapi/asm/vmx.h
> index 37fee27..5b15d94 100644
> --- a/arch/x86/include/uapi/asm/vmx.h
> +++ b/arch/x86/include/uapi/asm/vmx.h
> @@ -78,6 +78,7 @@
>  #define EXIT_REASON_PML_FULL62
>  #define EXIT_REASON_XSAVES  63
>  #define EXIT_REASON_XRSTORS 64
> +#define EXIT_REASON_PCOMMIT 65
>  
>  #define VMX_EXIT_REASONS \
>   { EXIT_REASON_EXCEPTION_NMI, "EXCEPTION_NMI" }, \
> @@ -126,7 +127,8 @@
>   { EXIT_REASON_INVVPID,   "INVVPID" }, \
>   { EXIT_REASON_INVPCID,   "INVPCID" }, \
>   { EXIT_REASON_XSAVES,"XSAVES" }, \
> - { EXIT_REASON_XRSTORS,   "XRSTORS" }
> + { EXIT_REASON_XRSTORS,   "XRSTORS" }, \
> + { EXIT_REASON_PCOMMIT,   "PCOMMIT" }
>  
>  #define VMX_ABORT_SAVE_GUEST_MSR_FAIL1
>  #define VMX_ABORT_LOAD_HOST_MSR_FAIL 4
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 962fc7d..faeb0b3 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -348,7 +348,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 
> *entry, u32 function,
>   F(FSGSBASE) | F(BMI1) | F(HLE) | F(AVX2) | F(SMEP) |
>   F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | f_mpx | F(RDSEED) |
>   F(ADX) | F(SMAP) | F(AVX512F) | F(AVX512PF) | F(AVX512ER) |
> - F(AVX512CD) | F(CLFLUSHOPT) | F(CLWB);
> + F(AVX512CD) | F(CLFLUSHOPT) | F(CLWB) | F(PCOMMIT);
>  
>   /* cpuid 0xD.1.eax */
>   const u32 kvm_supported_word10_x86_features =
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index dd05b9c..aed7bfe 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -133,4 +133,12 @@ static inline bool guest_cpuid_has_mpx(struct kvm_vcpu 
> *vcpu)
>   best = kvm_find_cpuid_entry(vcpu, 7, 0);
>   return best && (best->ebx & bit(X86_FEATURE_MPX));
>  }
> +
> +static inline bool guest_cpuid_has_pcommit(struct kvm_vcpu *vcpu)
> +{
> + struct kvm_cpuid_entry2 *best;
> +
> + best = kvm_find_cpuid_entry(vcpu, 7, 0);
> + return best && (best->ebx & bit(X86_FEATURE_PCOMMIT));
> +}
>  #endif
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 4cf25b9..b526c61 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -2474,7 +2474,8 @@ static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx 
> *vmx)
>   SECONDARY_EXEC_APIC_REGISTER_VIRT |
>   SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
>   SECONDARY_EXEC_WBINVD_EXITING |
> - SECONDARY_EXEC_XSAVES;
> + SECONDARY_EXEC_XSAVES |
> + SECONDARY_EXEC_PCOMMIT;
>  
>   if (enable_ept) {
>   /* nested EPT: emulate EPT also to L1 */
> @@ -3015,7 +3016,8 @@ static __init int setup_vmcs_config(struct vmcs_config 
> *vmcs_conf)
>   SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
>   SECONDARY_EXEC_SHADOW_VMCS |
>   SECONDARY_EXEC_XSAVES |
> - SECONDARY_EXEC_ENABLE_PML;
> + SECONDARY_EXEC_ENABLE_PML |
> + SECONDARY_EXEC_PCOMMIT;
>   if (adjust_vmx_controls(min2, opt2,
>   MSR_IA32_VMX_PROCBASED_CTLS2,
>   &_cpu_based_2nd_exec_control) < 0)
> @@ -4570,6 +4572,9 @@ static u32 vmx_secondary_exec_control(struct vcpu_vmx 
> *vmx)
>   /* PML is enabled/disabled in creating/destorying vcpu */
>   exec_control &= ~SECONDARY_EXEC_ENABLE_PML;
>  
> + /* Currently, we allow L1 guest to directly run pcommit ins

Re: [PATCH] vfio: Enable VFIO device for powerpc

2015-09-07 Thread Paolo Bonzini


On 26/08/2015 20:54, Paul Mackerras wrote:
> On Wed, Aug 26, 2015 at 11:34:26AM +0200, Alexander Graf wrote:
>>
>>
>> On 13.08.15 03:15, David Gibson wrote:
>>> ec53500f "kvm: Add VFIO device" added a special KVM pseudo-device which is
>>> used to handle any necessary interactions between KVM and VFIO.
>>>
>>> Currently that device is built on x86 and ARM, but not powerpc, although
>>> powerpc does support both KVM and VFIO.  This makes things awkward in
>>> userspace
>>>
>>> Currently qemu prints an alarming error message if you attempt to use VFIO
>>> and it can't initialize the KVM VFIO device.  We don't want to remove the
>>> warning, because lack of the KVM VFIO device could mean coherency problems
>>> on x86.  On powerpc, however, the error is harmless but looks disturbing,
>>> and a test based on host architecture in qemu would be ugly, and break if
>>> we do need the KVM VFIO device for something important in future.
>>>
>>> There's nothing preventing the KVM VFIO device from being built for
>>> powerpc, so this patch turns it on.  It won't actually do anything, since
>>> we don't define any of the arch_*() hooks, but it will make qemu happy and
>>> we can extend it in future if we need to.
>>>
>>> Signed-off-by: David Gibson 
>>> Reviewed-by: Eric Auger 
>>
>> Paul is going to take care of the kvm-ppc tree for 4.3. Also, ppc kvm
>> patches should get CC on the kvm-ppc@vger mailing list ;).
>>
>> Paul, could you please pick this one up?
> 
> Sure, I'll do that once I get home (end of this week).

This was not in the 4.3 pull request, but I think we can apply it after
the end of the merge window.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [TRIVIAL PATCH kvm-unit-tests] x86: desc: fix typo

2015-09-07 Thread Paolo Bonzini


On 26/08/2015 11:46, Levente Kurusa wrote:
> It should be 'exception' instead of 'excecption'.
> 
> Signed-off-by: Levente Kurusa 
> ---
>  lib/x86/desc.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/x86/desc.c b/lib/x86/desc.c
> index 3237778..e3bf175 100644
> --- a/lib/x86/desc.c
> +++ b/lib/x86/desc.c
> @@ -46,7 +46,7 @@ static void check_exception_table(struct ex_regs *regs)
>  return;
>  }
>  }
> -printf("unhandled excecption %d\n", regs->vector);
> +printf("unhandled exception %d\n", regs->vector);
>  exit(7);
>  }
>  
> @@ -68,7 +68,7 @@ void do_handle_exception(struct ex_regs *regs)
>   exception_handlers[regs->vector](regs);
>   return;
>   }
> - printf("unhandled cpu excecption %d\n", regs->vector);
> + printf("unhandled cpu exception %d\n", regs->vector);
>   if (regs->vector == 14)
>   printf("PF at %p addr %p\n", regs->rip, read_cr2());
>   exit(7);
> 

Applied, thanks.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V3 1/3] kvm: use kmalloc() instead of kzalloc() during iodev register/unregister

2015-09-07 Thread Paolo Bonzini


On 25/08/2015 11:05, Jason Wang wrote:
> All fields of kvm_io_range were initialized or copied explicitly
> afterwards. So switch to use kmalloc().
> 
> Cc: Gleb Natapov 
> Cc: Paolo Bonzini 
> Cc: Michael S. Tsirkin 
> Signed-off-by: Jason Wang 
> ---
>  virt/kvm/kvm_main.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 8b8a444..0d79fe8 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3248,7 +3248,7 @@ int kvm_io_bus_register_dev(struct kvm *kvm, enum 
> kvm_bus bus_idx, gpa_t addr,
>   if (bus->dev_count - bus->ioeventfd_count > NR_IOBUS_DEVS - 1)
>   return -ENOSPC;
>  
> - new_bus = kzalloc(sizeof(*bus) + ((bus->dev_count + 1) *
> + new_bus = kmalloc(sizeof(*bus) + ((bus->dev_count + 1) *
> sizeof(struct kvm_io_range)), GFP_KERNEL);
>   if (!new_bus)
>   return -ENOMEM;
> @@ -3280,7 +3280,7 @@ int kvm_io_bus_unregister_dev(struct kvm *kvm, enum 
> kvm_bus bus_idx,
>   if (r)
>   return r;
>  
> - new_bus = kzalloc(sizeof(*bus) + ((bus->dev_count - 1) *
> + new_bus = kmalloc(sizeof(*bus) + ((bus->dev_count - 1) *
> sizeof(struct kvm_io_range)), GFP_KERNEL);
>   if (!new_bus)
>   return -ENOMEM;
> 

Applied this patch for 4.4.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v3 2/5] KVM: arm64: Implement vGICv3 distributor and redistributor access from userspace

2015-09-07 Thread Pavel Fedin
 Hello!

> > +   mmio.is_write = is_write;
> > +
> > +   if (attr->attr & KVM_DEV_ARM_VGIC_64BIT) {
> > +   u64 __user *uaddr = (u64 __user *)(long)attr->addr;
> > +   __le64 data;
> > +
> > +   if (is_write) {
> > +   u64 reg;
> > +
> > +   if (get_user(reg, uaddr))
> 
> Wouldn't the use of copy_from_user/copy_to_user here make this whole
> switch redundant? Even if not, you could think about moving this logic
> into into vgic_attr_regs_access() and just hardcode "len=4" into GICv2
> accesses.

 No, because endianess conversion is still strongly typed. And this ends up in:

if (is_write) {
u64 reg;
__le64 data;

copy_from_user(®, uaddr, mmio.len);

if (mmio.len == 8)
data = cpu_to_le64(reg);
else
*((__le32 *)&data) = cpu_to_le32(*((__le32 *)®);

isn't this even more ugly? Regardless of where it is placed.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GIT PULL] Please pull my kvm-ppc-next branch

2015-09-07 Thread Paolo Bonzini


On 05/09/2015 09:45, Paul Mackerras wrote:
> Paolo,
> 
> Please pull the commits listed below into your tree.  I would like
> them to go in for 4.3 as they are all small bug fixes not new
> features, and they all can only affect HV-mode KVM on IBM server
> machines (in fact one has no effect on code at all since it is a typo
> fix for a comment).
> 
> Please let me know if you want me to re-post all the patches.
> 
> Thanks,
> Paul.
> 
> The following changes since commit e3dbc572fe11a5231568e106fa3dcedd1d1bec0f:
> 
>   Merge tag 'signed-kvm-ppc-next' of git://github.com/agraf/linux-2.6 into 
> kvm-queue (2015-08-22 14:57:59 -0700)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git 
> kvm-ppc-next
> 
> for you to fetch changes up to 4e33d1f0a145d48e8cf287954bbf791af8387cfb:
> 
>   KVM: PPC: Book3S: Fix typo in top comment about locking (2015-09-04 
> 07:28:05 +1000)
> 
> 
> Gautham R. Shenoy (2):
>   KVM: PPC: Book3S HV: Fix race in starting secondary threads
>   KVM: PPC: Book3S HV: Exit on H_DOORBELL if HOST_IPI is set
> 
> Greg Kurz (1):
>   KVM: PPC: Book3S: Fix typo in top comment about locking
> 
> Thomas Huth (1):
>   KVM: PPC: Book3S: Fix size of the PSPB register
> 
>  arch/powerpc/include/asm/kvm_host.h |  2 +-
>  arch/powerpc/kvm/book3s_hv.c| 10 +-
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S |  9 +
>  arch/powerpc/kvm/book3s_xics.c  |  2 +-
>  4 files changed, 20 insertions(+), 3 deletions(-)
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

Pulled, thanks.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v3 2/5] KVM: arm64: Implement vGICv3 distributor and redistributor access from userspace

2015-09-07 Thread Pavel Fedin
 Hello!

> > --- a/virt/kvm/arm/vgic.c
> > +++ b/virt/kvm/arm/vgic.c
> > @@ -2468,6 +2468,7 @@ int vgic_attr_regs_access(struct kvm_device *dev,
> >  * access. For 64-bit registers we have to split up the operation.
> >  */
> > mmio->len = sizeof(u32);
> > +   mmio->private = vcpu; /* For redistributor handlers */
> 
> I guess this can be moved into the caller and then you can drop the vcpu
> parameter and use private here instead, no?

 No because 'vcpu' is not a parameter. It is figured out in the middle of the 
function, under
dev->kvm->lock mutex, out of 'cpuid' index.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


来自Hi的邮件

2015-09-07 Thread Hi
help
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html