Re: [PATCH v16 00/17] KVM RISC-V Support

2021-04-09 Thread Palmer Dabbelt

On Wed, 31 Mar 2021 02:21:58 PDT (-0700), pbonz...@redhat.com wrote:

> On 30/03/21 07:48, Anup Patel wrote:
> >
> > It seems Andrew does not want to freeze the H-extension until we have a
> > virtualization-aware interrupt controller (such as the RISC-V AIA
> > specification) and an IOMMU. A lot of us feel that these things can be done
> > independently because the RISC-V H-extension already has provisions for an
> > external interrupt controller with virtualization support.


Sorry to hear that.  It's really gotten to a point where I'm just 
embarrassed with how the RISC-V foundation is being run -- not sure if 
these other ones bled into Linux land, but this is the third ISA 
extension that's blown up over the last few weeks.  We had a lot of 
discussion about this on the binutils/GCC side of things and I've 
managed to convince myself that coupling the software stack to the 
specification process isn't viable -- we made that decision under the 
assumption that specifications would actually progress through the 
process, but in practice that's just not happening.


My goal with the RISC-V stuff has always been getting us to a place 
where we have real shipping products running a software stack that is as 
close as possible to the upstream codebases.  I see that as the only way 
to get the software stack to a point where it can be sustainably 
maintained.  The "only frozen extensions" policy was meant to help this 
by steering vendors towards a common base we could support, but in 
practice it's just not working out.  The specification process is just 
so unreliable that in practice everything that gets built ends up 
relying on some non-standard behavior: whether it's a draft extension, 
some vendor-specific extension, or just some implementation quirks.  
There's always going to be some degree of that going on, but over the 
last year or so we've just stopped progressing.


My worry with accepting the draft extensions is that we have no 
guarantee of compatibility between various drafts, which makes 
supporting multiple versions much more difficult.  I've always really 
only been worried about supporting what gets implemented in a chip I can 
actually run code on, as I can at least guarantee that doesn't change.  
In practice that really has nothing to do with the specification freeze: 
even ratified specifications change in ways that break compatibility so 
we need to support multiple versions anyway.  That's why we've got 
things like the K210 support (which doesn't quite follow the ratified 
specs) and are going to take the errata stuff.  I hadn't been all that 
worried about the H support because there was a plan to get it to 
hardware, but with the change I'm not really sure how that's going to 
happen.



> Yes, frankly that's pretty ridiculous as it's perfectly possible to
> emulate the interrupt controller in software (and an IOMMU is not needed
> at all if you are okay with emulated or paravirtualized devices---which
> is almost always the case except for partitioning hypervisors).


There's certainly some risk to freezing the H extension before we have 
all flavors of systems up and running.  I spent a lot of time arguing 
that case years ago before we started telling people that the H 
extension just needed implementation, but that's not the decision we 
made.  I don't really do RISC-V foundation stuff any more so I don't 
know why this changed, but it's just too late.  It would be wonderful to 
have an implementation of everything we need to build out one of these 
complex systems, but I just don't see how the current plan gets 
there: that's a huge amount of work and I don't see why anyone would 
commit to that when they can't count on it being supported when it's 
released.


There are clearly some systems that can be built with this as it stands.  
They're not going to satisfy every use case, but at least we'll get 
people to start seriously using the spec.  That's the only way I can see 
to move forward with this.  It's pretty clear that sitting around and 
waiting doesn't work; we've tried that.



> Palmer, are you okay with merging RISC-V KVM?  Or should we place it in
> drivers/staging/riscv/kvm?


I'm certainly ready to drop my objections to merging the code based on 
it targeting a draft extension, but at a bare minimum I want to get a 
new policy in place that everyone can agree to for merging code.  I've 
tried to draft up a new policy a handful of times this week, but I'm not 
really quite sure how to go about this: ultimately trying to build 
stable interfaces around an unstable ISA is just a losing battle.  I've 
got a bunch of stuff going on right now, but I'll try to find some time 
to actually sit down and finish one.


I know it might seem odd to complain about how slowly things are going 
and then throw up another roadblock, but I really do think this is a 
very important thing to get right.  I'm just not sure how we're going to 
get anywhere with RISC-V without someone providing stability, so I want 
to make sure that 

Re: [PATCH v16 00/17] KVM RISC-V Support

2021-04-01 Thread Anup Patel
On Wed, Mar 31, 2021 at 2:52 PM Paolo Bonzini wrote:
>
> On 30/03/21 07:48, Anup Patel wrote:
> >
> > It seems Andrew does not want to freeze the H-extension until we have a
> > virtualization-aware interrupt controller (such as the RISC-V AIA
> > specification) and an IOMMU. A lot of us feel that these things can be done
> > independently because the RISC-V H-extension already has provisions for an
> > external interrupt controller with virtualization support.
>
> Yes, frankly that's pretty ridiculous as it's perfectly possible to
> emulate the interrupt controller in software (and an IOMMU is not needed
> at all if you are okay with emulated or paravirtualized devices---which
> is almost always the case except for partitioning hypervisors).
>
> Palmer, are you okay with merging RISC-V KVM?  Or should we place it in
> drivers/staging/riscv/kvm?
>
> Either way, the best way to do it would be like this:
>
> 1) you apply patch 1 in a topic branch
>
> 2) you merge the topic branch in the risc-v tree
>
> 3) Anup merges the topic branch too and sends me a pull request.

In any case, I will send v17 based on Linux-5.12-rc5 so that people
can at least try KVM RISC-V on the latest kernel.

Regards,
Anup


Re: [PATCH v16 00/17] KVM RISC-V Support

2021-03-31 Thread Paolo Bonzini

On 30/03/21 07:48, Anup Patel wrote:
>
> It seems Andrew does not want to freeze the H-extension until we have a
> virtualization-aware interrupt controller (such as the RISC-V AIA
> specification) and an IOMMU. A lot of us feel that these things can be done
> independently because the RISC-V H-extension already has provisions for an
> external interrupt controller with virtualization support.


Yes, frankly that's pretty ridiculous as it's perfectly possible to 
emulate the interrupt controller in software (and an IOMMU is not needed 
at all if you are okay with emulated or paravirtualized devices---which 
is almost always the case except for partitioning hypervisors).


Palmer, are you okay with merging RISC-V KVM?  Or should we place it in 
drivers/staging/riscv/kvm?


Either way, the best way to do it would be like this:

1) you apply patch 1 in a topic branch

2) you merge the topic branch in the risc-v tree

3) Anup merges the topic branch too and sends me a pull request.

Paolo



Re: [PATCH v16 00/17] KVM RISC-V Support

2021-03-29 Thread Anup Patel
On Sat, Jan 23, 2021 at 9:10 AM Palmer Dabbelt wrote:
>
> On Fri, 15 Jan 2021 04:18:29 PST (-0800), Anup Patel wrote:
> > This series adds initial KVM RISC-V support. Currently, we are able to boot
> > Linux on RV64/RV32 Guest with multiple VCPUs.
>
> Thanks.  IIUC the spec is still in limbo at the RISC-V foundation?  I haven't
> really been paying attention lately.

There has been no change in the H-extension spec for more than a year now.

The H-extension spec also has provisions for an external interrupt controller
with virtualization support (such as the RISC-V AIA specification).

It seems Andrew does not want to freeze the H-extension until we have a
virtualization-aware interrupt controller (such as the RISC-V AIA
specification) and an IOMMU. A lot of us feel that these things can be done
independently because the RISC-V H-extension already has provisions for an
external interrupt controller with virtualization support.

The freeze criteria for the H-extension are still not clear to me.
Refer to:
https://lists.riscv.org/g/tech-privileged/topic/risc_v_h_extension_freeze/80346318?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,0,80346318

Regards,
Anup

>
> >
> > Key aspects of KVM RISC-V added by this series are:
> > 1. No RISC-V specific KVM IOCTL
> > 2. Minimal possible KVM world-switch which touches only GPRs and a few CSRs
> > 3. Both RV64 and RV32 host supported
> > 4. Full Guest/VM switch is done via vcpu_get/vcpu_put infrastructure
> > 5. KVM ONE_REG interface for VCPU register access from user-space
> > 6. PLIC emulation is done in user-space
> > 7. Timer and IPI emulation is done in-kernel
> > 8. Both Sv39x4 and Sv48x4 supported for RV64 host
> > 9. MMU notifiers supported
> > 10. Generic dirtylog supported
> > 11. FP lazy save/restore supported
> > 12. SBI v0.1 emulation for KVM Guest available
> > 13. Forward unhandled SBI calls to KVM userspace
> > 14. Hugepage support for Guest/VM
> > 15. IOEVENTFD support for Vhost
> >
> > Here's a brief TODO list which we will work upon after this series:
> > 1. SBI v0.2 emulation in-kernel
> > 2. SBI v0.2 hart state management emulation in-kernel
> > 3. In-kernel PLIC emulation
> > 4. ... and more ...
> >
> > This series can be found in riscv_kvm_v16 branch at:
> > https://github.com/avpatel/linux.git
> >
> > Our work-in-progress KVMTOOL RISC-V port can be found in riscv_v6 branch
> > at: https://github.com/avpatel/kvmtool.git
> >
> > The QEMU RISC-V hypervisor emulation is done by Alistair and is available
> > in master branch at: https://git.qemu.org/git/qemu.git
> >
> > To play around with KVM RISC-V, refer to the KVM RISC-V wiki at:
> > https://github.com/kvm-riscv/howto/wiki
> > https://github.com/kvm-riscv/howto/wiki/KVM-RISCV64-on-QEMU
> > https://github.com/kvm-riscv/howto/wiki/KVM-RISCV64-on-Spike
> >
> > Changes since v15:
> >  - Rebased on Linux-5.11-rc3
> >  - Fixed kvm_stage2_map() to use gfn_to_pfn_prot() for determining
> >    writeability of a host pfn.
> >  - Use "__u64" in-place of "u64" and "__u32" in-place of "u32" for
> >    uapi/asm/kvm.h
> >
> > Changes since v14:
> >  - Rebased on Linux-5.10-rc3
> >  - Fixed Stage2 (G-stage) PGD allocation to ensure it is 16KB aligned
> >
> > Changes since v13:
> >  - Rebased on Linux-5.9-rc3
> >  - Fixed kvm_riscv_vcpu_set_reg_csr() for SIP update in PATCH5
> >  - Fixed instruction length computation in PATCH7
> >  - Added ioeventfd support in PATCH7
> >  - Ensure HSTATUS.SPVP is set to correct value before using HLV/HSV
> >    instructions in PATCH7
> >  - Fixed stage2_map_page() to set PTE 'A' and 'D' bits correctly
> >    in PATCH10
> >  - Added stage2 dirty page logging in PATCH10
> >  - Allow KVM user-space to SET/GET SCOUNTER CSR in PATCH5
> >  - Save/restore SCOUNTEREN in PATCH6
> >  - Reduced quite a few instructions for __kvm_riscv_switch_to() by
> >    using CSR swap instruction in PATCH6
> >  - Detect and use Sv48x4 when available in PATCH10
> >
> > Changes since v12:
> >  - Rebased patches on Linux-5.8-rc4
> >  - By default enable all counters in HCOUNTEREN
> >  - RISC-V H-Extension v0.6.1 spec support
> >
> > Changes since v11:
> >  - Rebased patches on Linux-5.7-rc3
> >  - Fixed typo in typecast of stage2_map_size define
> >  - Introduced struct kvm_cpu_trap to represent trap details and
> >    use it as function parameter wherever applicable
> >  - Pass memslot to kvm_riscv_stage2_map() for supporting dirty page
> >    logging in future
> >  - RISC-V H-Extension v0.6 spec support
> >  - Send-out first three patches as separate series so that it can
> >    be taken by Palmer for Linux RISC-V
> >
> > Changes since v10:
> >  - Rebased patches on Linux-5.6-rc5
> >  - Reduce RISCV_ISA_EXT_MAX from 256 to 64
> >  - Separate PATCH for removing N-extension related defines
> >  - Added comments as requested by Palmer
> >  - Fixed HIDELEG CSR programming
> >
> > Changes since v9:
> >  - Rebased patches on Linux-5.5-rc3
> >  - Squash PATCH19 and PATCH20 into PATCH5
> >  - Squash PATCH18 into PATCH11
> >  - Squash PATCH17 into PATCH16
> >  - 

Re: [PATCH v16 00/17] KVM RISC-V Support

2021-01-22 Thread Palmer Dabbelt

On Fri, 15 Jan 2021 04:18:29 PST (-0800), Anup Patel wrote:

This series adds initial KVM RISC-V support. Currently, we are able to boot
Linux on RV64/RV32 Guest with multiple VCPUs.


Thanks.  IIUC the spec is still in limbo at the RISC-V foundation?  I haven't
really been paying attention lately.



Key aspects of KVM RISC-V added by this series are:
1. No RISC-V specific KVM IOCTL
2. Minimal possible KVM world-switch which touches only GPRs and a few CSRs
3. Both RV64 and RV32 host supported
4. Full Guest/VM switch is done via vcpu_get/vcpu_put infrastructure
5. KVM ONE_REG interface for VCPU register access from user-space
6. PLIC emulation is done in user-space
7. Timer and IPI emulation is done in-kernel
8. Both Sv39x4 and Sv48x4 supported for RV64 host
9. MMU notifiers supported
10. Generic dirtylog supported
11. FP lazy save/restore supported
12. SBI v0.1 emulation for KVM Guest available
13. Forward unhandled SBI calls to KVM userspace
14. Hugepage support for Guest/VM
15. IOEVENTFD support for Vhost

Here's a brief TODO list which we will work upon after this series:
1. SBI v0.2 emulation in-kernel
2. SBI v0.2 hart state management emulation in-kernel
3. In-kernel PLIC emulation
4. ... and more ...

This series can be found in riscv_kvm_v16 branch at:
https://github.com/avpatel/linux.git

Our work-in-progress KVMTOOL RISC-V port can be found in riscv_v6 branch
at: https://github.com/avpatel/kvmtool.git

The QEMU RISC-V hypervisor emulation is done by Alistair and is available
in master branch at: https://git.qemu.org/git/qemu.git

To play around with KVM RISC-V, refer to the KVM RISC-V wiki at:
https://github.com/kvm-riscv/howto/wiki
https://github.com/kvm-riscv/howto/wiki/KVM-RISCV64-on-QEMU
https://github.com/kvm-riscv/howto/wiki/KVM-RISCV64-on-Spike

Changes since v15:
 - Rebased on Linux-5.11-rc3
 - Fixed kvm_stage2_map() to use gfn_to_pfn_prot() for determining
   writeability of a host pfn.
 - Use "__u64" in-place of "u64" and "__u32" in-place of "u32" for
   uapi/asm/kvm.h

Changes since v14:
 - Rebased on Linux-5.10-rc3
 - Fixed Stage2 (G-stage) PGD allocation to ensure it is 16KB aligned

Changes since v13:
 - Rebased on Linux-5.9-rc3
 - Fixed kvm_riscv_vcpu_set_reg_csr() for SIP update in PATCH5
 - Fixed instruction length computation in PATCH7
 - Added ioeventfd support in PATCH7
 - Ensure HSTATUS.SPVP is set to correct value before using HLV/HSV
   instructions in PATCH7
 - Fixed stage2_map_page() to set PTE 'A' and 'D' bits correctly
   in PATCH10
 - Added stage2 dirty page logging in PATCH10
 - Allow KVM user-space to SET/GET SCOUNTER CSR in PATCH5
 - Save/restore SCOUNTEREN in PATCH6
 - Reduced quite a few instructions for __kvm_riscv_switch_to() by
   using CSR swap instruction in PATCH6
 - Detect and use Sv48x4 when available in PATCH10

Changes since v12:
 - Rebased patches on Linux-5.8-rc4
 - By default enable all counters in HCOUNTEREN
 - RISC-V H-Extension v0.6.1 spec support

Changes since v11:
 - Rebased patches on Linux-5.7-rc3
 - Fixed typo in typecast of stage2_map_size define
 - Introduced struct kvm_cpu_trap to represent trap details and
   use it as function parameter wherever applicable
 - Pass memslot to kvm_riscv_stage2_map() for supporting dirty page
   logging in future
 - RISC-V H-Extension v0.6 spec support
 - Send-out first three patches as separate series so that it can
   be taken by Palmer for Linux RISC-V

Changes since v10:
 - Rebased patches on Linux-5.6-rc5
 - Reduce RISCV_ISA_EXT_MAX from 256 to 64
 - Separate PATCH for removing N-extension related defines
 - Added comments as requested by Palmer
 - Fixed HIDELEG CSR programming

Changes since v9:
 - Rebased patches on Linux-5.5-rc3
 - Squash PATCH19 and PATCH20 into PATCH5
 - Squash PATCH18 into PATCH11
 - Squash PATCH17 into PATCH16
 - Added ONE_REG interface for VCPU timer in PATCH13
 - Use HTIMEDELTA for VCPU timer in PATCH13
 - Updated KVM RISC-V mailing list in MAINTAINERS entry
 - Update KVM kconfig option to depend on RISCV_SBI and MMU
 - Check for SBI v0.2 and SBI v0.2 RFENCE extension at boot-time
 - Use SBI v0.2 RFENCE extension in VMID implementation
 - Use SBI v0.2 RFENCE extension in Stage2 MMU implementation
 - Use SBI v0.2 RFENCE extension in SBI implementation
 - Moved to RISC-V Hypervisor v0.5 draft spec
 - Updated Documentation/virt/kvm/api.txt for timer ONE_REG interface

Changes since v8:
 - Rebased series on Linux-5.4-rc3 and Atish's SBI v0.2 patches
 - Use HRTIMER_MODE_REL instead of HRTIMER_MODE_ABS in timer emulation
 - Fixed kvm_riscv_stage2_map() to handle hugepages
 - Added patch to forward unhandled SBI calls to user-space
 - Added patch for iterative/recursive stage2 page table programming
 - Added patch to remove per-CPU vsip_shadow variable
 - Added patch to fix race-condition in kvm_riscv_vcpu_sync_interrupts()

Changes since v7:
 - Rebased series on Linux-5.4-rc1 and Atish's SBI v0.2 patches
 - Removed PATCH1, PATCH3, and PATCH20 because these are already merged
 - Use kernel doc 

[PATCH v16 00/17] KVM RISC-V Support

2021-01-15 Thread Anup Patel
This series adds initial KVM RISC-V support. Currently, we are able to boot
Linux on RV64/RV32 Guest with multiple VCPUs.

Key aspects of KVM RISC-V added by this series are:
1. No RISC-V specific KVM IOCTL
2. Minimal possible KVM world-switch which touches only GPRs and a few CSRs
3. Both RV64 and RV32 host supported
4. Full Guest/VM switch is done via vcpu_get/vcpu_put infrastructure
5. KVM ONE_REG interface for VCPU register access from user-space
6. PLIC emulation is done in user-space
7. Timer and IPI emulation is done in-kernel
8. Both Sv39x4 and Sv48x4 supported for RV64 host
9. MMU notifiers supported
10. Generic dirtylog supported
11. FP lazy save/restore supported
12. SBI v0.1 emulation for KVM Guest available
13. Forward unhandled SBI calls to KVM userspace
14. Hugepage support for Guest/VM
15. IOEVENTFD support for Vhost
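
As an illustration of item 5, here is a minimal user-space sketch of reading
one VCPU register through the generic KVM_GET_ONE_REG ioctl. The helper name
vcpu_get_one_reg() is hypothetical and not part of this series; the register
ID is treated as an opaque value because the RISC-V ID encoding is defined by
the series' uapi header and is not reproduced here.

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: read one VCPU register via KVM_GET_ONE_REG.
 *
 * vcpu_fd: file descriptor returned by KVM_CREATE_VCPU
 * reg_id:  architecture-defined register ID (opaque in this sketch)
 * value:   buffer that receives the register contents; the caller must
 *          make sure it is at least as large as the size encoded in reg_id
 *
 * Returns 0 on success, or -1 with errno set by ioctl() on failure.
 */
int vcpu_get_one_reg(int vcpu_fd, uint64_t reg_id, uint64_t *value)
{
	struct kvm_one_reg reg = {
		.id   = reg_id,
		.addr = (uint64_t)(uintptr_t)value, /* user-space buffer address */
	};

	return ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg);
}
```

Writing a register would follow the same pattern with KVM_SET_ONE_REG, which
is how a VMM can access VCPU state without any RISC-V specific ioctl.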

Here's a brief TODO list which we will work upon after this series:
1. SBI v0.2 emulation in-kernel
2. SBI v0.2 hart state management emulation in-kernel
3. In-kernel PLIC emulation
4. ... and more ...

This series can be found in riscv_kvm_v16 branch at:
https://github.com/avpatel/linux.git

Our work-in-progress KVMTOOL RISC-V port can be found in riscv_v6 branch
at: https://github.com/avpatel/kvmtool.git

The QEMU RISC-V hypervisor emulation is done by Alistair and is available
in master branch at: https://git.qemu.org/git/qemu.git

To play around with KVM RISC-V, refer to the KVM RISC-V wiki at:
https://github.com/kvm-riscv/howto/wiki
https://github.com/kvm-riscv/howto/wiki/KVM-RISCV64-on-QEMU
https://github.com/kvm-riscv/howto/wiki/KVM-RISCV64-on-Spike

Changes since v15:
 - Rebased on Linux-5.11-rc3
 - Fixed kvm_stage2_map() to use gfn_to_pfn_prot() for determining
   writeability of a host pfn.
 - Use "__u64" in-place of "u64" and "__u32" in-place of "u32" for
   uapi/asm/kvm.h

Changes since v14:
 - Rebased on Linux-5.10-rc3
 - Fixed Stage2 (G-stage) PGD allocation to ensure it is 16KB aligned
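
For context on the 16KB figure: in the Sv39x4 and Sv48x4 G-stage modes the
root page table index is two bits wider than in Sv39/Sv48, so the root table
holds 2048 eight-byte entries and must be 16 KiB in size and alignment. The
sketch below only illustrates that arithmetic; the macro and function names
are made up for this example and are not taken from the patches.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT       12u        /* 4 KiB base pages */
#define LEVEL_INDEX_BITS 9u         /* index bits per non-root level */
#define ROOT_INDEX_BITS  (9u + 2u)  /* "x4" modes widen the root index by 2 bits */
#define PTE_SIZE         8u         /* bytes per page table entry on RV64 */

/* Size of the G-stage root page table: 2048 entries * 8 bytes = 16 KiB. */
#define STAGE2_PGD_SIZE  ((1u << ROOT_INDEX_BITS) * PTE_SIZE)

/* True if a physical address is suitably aligned for a G-stage root table. */
static inline bool stage2_pgd_is_aligned(uint64_t pgd_phys)
{
	return (pgd_phys & (uint64_t)(STAGE2_PGD_SIZE - 1u)) == 0;
}

/* Guest physical address width for a G-stage mode with the given level count. */
static inline unsigned int stage2_gpa_bits(unsigned int levels)
{
	return PAGE_SHIFT + LEVEL_INDEX_BITS * (levels - 1u) + ROOT_INDEX_BITS;
}
```

With three levels (Sv39x4) this gives a 41-bit guest physical address, and
with four levels (Sv48x4) a 50-bit one.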

Changes since v13:
 - Rebased on Linux-5.9-rc3
 - Fixed kvm_riscv_vcpu_set_reg_csr() for SIP update in PATCH5
 - Fixed instruction length computation in PATCH7
 - Added ioeventfd support in PATCH7
 - Ensure HSTATUS.SPVP is set to correct value before using HLV/HSV
   instructions in PATCH7
 - Fixed stage2_map_page() to set PTE 'A' and 'D' bits correctly
   in PATCH10
 - Added stage2 dirty page logging in PATCH10
 - Allow KVM user-space to SET/GET SCOUNTER CSR in PATCH5
 - Save/restore SCOUNTEREN in PATCH6
 - Reduced quite a few instructions for __kvm_riscv_switch_to() by
   using CSR swap instruction in PATCH6
 - Detect and use Sv48x4 when available in PATCH10
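
On the instruction length item above: when a trapped guest instruction is
emulated, the guest PC has to advance by the length of that instruction, which
depends on whether it is a compressed (16-bit) encoding. The following is a
sketch of the standard RISC-V rule, not a copy of the code in PATCH7; the
function name is hypothetical, and encodings longer than 32 bits are ignored.

```c
#include <stdint.h>

/*
 * Length in bytes of a RISC-V instruction, given its low 16 or 32 bits.
 * An encoding whose two lowest bits are not 0b11 is a 16-bit compressed
 * instruction; otherwise it is treated as a standard 32-bit instruction.
 * Assumption: encodings of 48 bits or more are not handled here.
 */
static inline unsigned int riscv_insn_len(uint32_t insn)
{
	return ((insn & 0x3u) != 0x3u) ? 2u : 4u;
}
```

For example, 0x0001 (c.nop) is reported as 2 bytes and 0x00000013 (nop,
encoded as addi x0, x0, 0) as 4 bytes.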

Changes since v12:
 - Rebased patches on Linux-5.8-rc4
 - By default enable all counters in HCOUNTEREN
 - RISC-V H-Extension v0.6.1 spec support

Changes since v11:
 - Rebased patches on Linux-5.7-rc3
 - Fixed typo in typecast of stage2_map_size define
 - Introduced struct kvm_cpu_trap to represent trap details and
   use it as function parameter wherever applicable
 - Pass memslot to kvm_riscv_stage2_map() for supporting dirty page
   logging in future
 - RISC-V H-Extension v0.6 spec support
 - Send-out first three patches as separate series so that it can
   be taken by Palmer for Linux RISC-V

Changes since v10:
 - Rebased patches on Linux-5.6-rc5
 - Reduce RISCV_ISA_EXT_MAX from 256 to 64
 - Separate PATCH for removing N-extension related defines
 - Added comments as requested by Palmer
 - Fixed HIDELEG CSR programming

Changes since v9:
 - Rebased patches on Linux-5.5-rc3
 - Squash PATCH19 and PATCH20 into PATCH5
 - Squash PATCH18 into PATCH11
 - Squash PATCH17 into PATCH16
 - Added ONE_REG interface for VCPU timer in PATCH13
 - Use HTIMEDELTA for VCPU timer in PATCH13
 - Updated KVM RISC-V mailing list in MAINTAINERS entry
 - Update KVM kconfig option to depend on RISCV_SBI and MMU
 - Check for SBI v0.2 and SBI v0.2 RFENCE extension at boot-time
 - Use SBI v0.2 RFENCE extension in VMID implementation
 - Use SBI v0.2 RFENCE extension in Stage2 MMU implementation
 - Use SBI v0.2 RFENCE extension in SBI implementation
 - Moved to RISC-V Hypervisor v0.5 draft spec
 - Updated Documentation/virt/kvm/api.txt for timer ONE_REG interface
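
As background for the HTIMEDELTA item: per the H-extension, the time value a
guest reads is the host time plus the htimedelta CSR, with ordinary 64-bit
wraparound. The sketch below shows only that arithmetic; both function names
are hypothetical and say nothing about how PATCH13 structures its code.

```c
#include <stdint.h>

/* Time as seen by the guest: host time plus htimedelta, modulo 2^64. */
static inline uint64_t guest_time(uint64_t host_time, uint64_t htimedelta)
{
	return host_time + htimedelta;
}

/*
 * Delta to program so that the guest reads desired_guest_time at the
 * moment the host reads host_time. Unsigned subtraction wraps, so the
 * result is also correct when the guest time is behind the host time.
 */
static inline uint64_t htimedelta_for(uint64_t host_time,
				      uint64_t desired_guest_time)
{
	return desired_guest_time - host_time;
}
```

A guest that should start counting from zero when the host timer reads 1000
would get a delta of -1000 (as an unsigned 64-bit value), and would then read
500 when the host timer reaches 1500.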

Changes since v8:
 - Rebased series on Linux-5.4-rc3 and Atish's SBI v0.2 patches
 - Use HRTIMER_MODE_REL instead of HRTIMER_MODE_ABS in timer emulation
 - Fixed kvm_riscv_stage2_map() to handle hugepages
 - Added patch to forward unhandled SBI calls to user-space
 - Added patch for iterative/recursive stage2 page table programming
 - Added patch to remove per-CPU vsip_shadow variable
 - Added patch to fix race-condition in kvm_riscv_vcpu_sync_interrupts()

Changes since v7:
 - Rebased series on Linux-5.4-rc1 and Atish's SBI v0.2 patches
 - Removed PATCH1, PATCH3, and PATCH20 because these are already merged
 - Use kernel doc style comments for ISA bitmap functions
 - Don't parse X, Y, and Z extension in riscv_fill_hwcap() because it will
   be added in the future
 - Mark KVM RISC-V kconfig option as