On Sun, Feb 10, 2019 at 10:50 PM Waldek Kozaczuk <[email protected]>
wrote:

> This patch is not meant to be applied but it serves to illustrate what is
> left to make OSv boot on firecracker. As you can see it is not a daunting
> amount of work anymore given modern virtio is supported now.
>
> In am not sure in what order we should implement the remaining work.
> Ideally every patch should be testable
>

Indeed, although I don't mind that the firecracker-specific stuff you'll be
the only one testing it (since I'm not planning to run it any time soon).

Maybe you can even add to run.py a firecracker hypersor options, so if
someone just installs this hypervisor he can use run.py?



> which means that virtio-mmio support should be added last given it is not
> possible to test it on QEMU (this
> https://github.com/qemu/qemu/blob/595123df1d54ed8fbab9e1a73d5a58c5bb71058f/hw/virtio/virtio-mmio.
> <https://github.com/qemu/qemu/blob/595123df1d54ed8fbab9e1a73d5a58c5bb71058f/hw/virtio/virtio-mmio.c>
>
seems to suggest that QEMU is able to emulate virtio-mmio devices but I
> have not found any examples of using it on x86-64 platform, anyone has
> ideas?). Please see my extra comments/questions below.
>
> On Sunday, February 10, 2019 at 3:22:44 PM UTC-5, Waldek Kozaczuk wrote:
>>
>> - clean up modern/legacy virtio support (configurability)
>>
> This mostly is about fragments of code in virtio layer similar to this -
> https://github.com/cloudius-systems/osv/blob/master/drivers/virtio-blk.cc#L177-L187
> - where code needs to be tweaked to properly identify from feature flags
> what configuration data is available and only read this.
>
>> - add virtio-mmio support including parsing virtio-mmio devices from
>> command line
>> - enhance boot.S and OSv boot in general to support vmlinux format
>> (decompressed ELF 64-bit Linux kernel)
>>    - pass e820 and command line parameters through zero-page
>>
>  This is most non-trivial amount of work that requires changing OSv boot
> protocol so it better conforms to a Linux (for reference
> https://www.kernel.org/doc/Documentation/x86/boot.txt). It would be nice
> to support both 32-bit and 64-bit ELF direct boot LInux protocol. It is not
> clear to me what is the best way to support it especially if we still want
> to be able to boot OSv in real mode, protected and long mode at the same
> time and ideally minimize number of versions of boot artifacts (usr.img,
> lzloader.elf, loader.elf, etc). Maybe this deserves its own email thread.
>
>> - make APCI optional and/or disable-able which includes:
>>    - make pvpanic probe only if ACPI on
>>    - make poweroff work when ACPI off
>>    - parse SMP information from MP table instead of MADT if ACPI off
>>
>
> I think that all the 3 ACPI related items above can be part of a single
> patch.
>
>> - add option to turn PCI off (make OSv not even try to enumerate PCI)
>> - mark boot time end
>
>
>> Signed-off-by: Waldemar Kozaczuk <[email protected]>
>> ---
>>  Makefile               |   1 +
>>  arch/x64/arch-setup.cc | 180 ++++++++++++++++++++++++++++++++++++-----
>>  arch/x64/boot.S        |  37 ++++++++-
>>  arch/x64/loader.ld     |   2 +-
>>  arch/x64/power.cc      |  14 +---
>>  arch/x64/smp.cc        |  16 +++-
>>  drivers/acpi.cc        |  10 ++-
>>  drivers/virtio-blk.cc  |   8 +-
>>  drivers/virtio-mmio.cc | 131 ++++++++++++++++++++++++++++++
>>  drivers/virtio-mmio.hh | 148 +++++++++++++++++++++++++++++++++
>>  drivers/virtio-net.cc  |   7 ++
>>  drivers/virtio.cc      |   7 +-
>>  drivers/virtio.hh      |   1 +
>>  loader.cc              |   5 +-
>>  14 files changed, 527 insertions(+), 40 deletions(-)
>>  create mode 100644 drivers/virtio-mmio.cc
>>  create mode 100644 drivers/virtio-mmio.hh
>>
>> diff --git a/Makefile b/Makefile
>> index ff9f2ed3..b905063c 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -821,6 +821,7 @@ drivers += arch/$(arch)/pvclock-abi.o
>>  drivers += drivers/virtio.o
>>  drivers += drivers/virtio-pci-device.o
>>  drivers += drivers/virtio-vring.o
>> +drivers += drivers/virtio-mmio.o
>>  drivers += drivers/virtio-net.o
>>  drivers += drivers/vmxnet3.o
>>  drivers += drivers/vmxnet3-queues.o
>> diff --git a/arch/x64/arch-setup.cc b/arch/x64/arch-setup.cc
>> index 50500662..cf5adb15 100644
>> --- a/arch/x64/arch-setup.cc
>> +++ b/arch/x64/arch-setup.cc
>> @@ -22,6 +22,19 @@
>>  #include <osv/commands.hh>
>>  #include "dmi.hh"
>>
>> +// Not sure if Linux zero page is always located at this place
>> +// in memory or its address is passed in one of the registers
>> +// -> double check
>> +#define ZERO_PAGE_START      0x7000
>> +#define SETUP_HEADER_OFFSET  0x1f1   // look at bootparam.h in linux
>> +#define BOOT_FLAG_OFFSET     sizeof(u8) + 4 * sizeof(u16) + sizeof(u32)
>> +
>> +#define E820_ENTRIES_OFFSET  0x1e8   // look at bootparam.h in linux
>> +#define E820_TABLE_OFFSET    0x2d0   // look at bootparam.h in linux
>> +
>> +#define CMD_LINE_PTR_OFFSET  sizeof(u8) * 5 + sizeof(u16) * 11 +
>> sizeof(u32) * 7
>> +#define CMD_LINE_SIZE_OFFSET CMD_LINE_PTR_OFFSET + sizeof(u8) * 2 +
>> sizeof(u16) + sizeof(u32) * 3
>> +
>>  struct multiboot_info_type {
>>      u32 flags;
>>      u32 mem_lower;
>> @@ -61,12 +74,81 @@ struct e820ent {
>>      u32 type;
>>  } __attribute__((packed));
>>
>> +struct _e820ent {
>> +    u64 addr;
>> +    u64 size;
>> +    u32 type;
>> +} __attribute__((packed));
>> +
>>  osv_multiboot_info_type* osv_multiboot_info;
>>
>> -void parse_cmdline(multiboot_info_type& mb)
>> +struct mmio_device_info {
>> +    u64 address;
>> +    u64 size;
>> +    unsigned int irq;
>> +};
>> +
>> +//TODO: For now we are limiting number of mmio devices to two
>> +// Ideally we should be using somewhat more dynamic structure
>> +struct mmio_device_info mmio_device_info_entries[2];
>> +int mmio_device_info_count = 0;
>> +
>> +#define VIRTIO_MMIO_DEVICE_CMDLINE_PREFIX "virtio_mmio.device="
>> +char* parse_mmio_device_info(char *cmdline, mmio_device_info *info) {
>> +    // [virtio_mmio.]device=<size>@<baseaddr>:<irq>[:<id>]
>> +    char *prefix_pos =
>> strstr(cmdline,VIRTIO_MMIO_DEVICE_CMDLINE_PREFIX);
>> +    if (!prefix_pos)
>> +        return nullptr;
>> +
>> +    char *size_pos = prefix_pos +
>> strlen(VIRTIO_MMIO_DEVICE_CMDLINE_PREFIX);
>> +    if (sscanf(size_pos,"%ld", &info->size) != 1)
>> +        return nullptr;
>> +
>> +    char *at_pos = strstr(size_pos,"@");
>> +    if (!at_pos)
>> +        return nullptr;
>> +
>> +    switch(*(at_pos - 1)) {
>> +        case 'k':
>> +        case 'K':
>> +            info->size = info->size * 1024;
>> +            break;
>> +        case 'm':
>> +        case 'M':
>> +            info->size = info->size * 1024 * 1024;
>> +            break;
>> +        default:
>> +            break;
>> +    }
>> +
>> +    if (sscanf(at_pos, "@%lli:%u", &info->address, &info->irq) == 2)
>> +        return prefix_pos;
>> +    else
>> +        return nullptr;
>> +}
>> +
>> +//void parse_cmdline(multiboot_info_type& mb)
>> +void parse_cmdline(char *cmdline)
>>  {
>> -    auto p = reinterpret_cast<char*>(mb.cmdline);
>> -    osv::parse_cmdline(p);
>> +    //auto p = reinterpret_cast<char*>(mb.cmdline);
>> +    // We are assuming the mmio devices information is appended to the
>> +    // command line (at least it is the case with the firecracker) so
>> +    // once we parse those we strip it away so only plain OSv command
>> line
>> +    // is left
>> +    //TODO: There may be a smarter, better way to parse this information
>> +    char *virtio_device_info_pos =
>> parse_mmio_device_info(cmdline,mmio_device_info_entries);
>> +    if (virtio_device_info_pos) {
>> +        mmio_device_info_count++;
>> +        *virtio_device_info_pos = 0;
>> +
>> +        virtio_device_info_pos =
>> +            parse_mmio_device_info(virtio_device_info_pos +
>> 1,mmio_device_info_entries + 1);
>> +        if (virtio_device_info_pos) {
>> +            mmio_device_info_count++;
>> +        }
>> +    }
>> +
>> +    osv::parse_cmdline(cmdline);
>>  }
>>
>>  void setup_temporary_phys_map()
>> @@ -121,28 +203,64 @@ void arch_setup_free_memory()
>>  {
>>      static ulong edata;
>>      asm ("movl $.edata, %0" : "=rm"(edata));
>> -    // copy to stack so we don't free it now
>> -    auto omb = *osv_multiboot_info;
>> -    auto mb = omb.mb;
>> -    auto e820_buffer = alloca(mb.mmap_length);
>> -    auto e820_size = mb.mmap_length;
>> -    memcpy(e820_buffer, reinterpret_cast<void*>(mb.mmap_addr),
>> e820_size);
>> +
>> +    void *zero_page = reinterpret_cast<void*>(ZERO_PAGE_START);
>> +    void *setup_header = zero_page + SETUP_HEADER_OFFSET;
>> +
>> +    // Grab command line from zero page
>> +    u32 cmdline_ptr = *static_cast<u32*>(setup_header +
>> CMD_LINE_PTR_OFFSET);
>> +    u32 cmdline_size = *static_cast<u32*>(setup_header +
>> CMD_LINE_SIZE_OFFSET);
>> +
>> +    // Copy cmdline from zero page
>> +    void* cmdline = reinterpret_cast<void*>((u64)cmdline_ptr);
>> +    void *cmdline_copy = alloca(cmdline_size + 1);
>> +    memcpy(cmdline_copy,cmdline,cmdline_size);
>> +    ((char*)cmdline_copy)[cmdline_size] = 0;
>> +
>> +    debug_early("Cmdline: ");
>> +    debug_early((char*)cmdline_copy);
>> +    debug_early("\n");
>> +
>> +    // Copy e820 information from zero page
>> +    struct _e820ent *e820_table = static_cast<struct _e820ent
>> *>(zero_page + E820_TABLE_OFFSET);
>> +
>> +    //TODO: We are assuming two entries but in reality
>> +    //there could be more so this logic below needs to be a little
>> smarter
>> +    auto e820_size = 48;
>> +    auto e820_buffer = alloca(e820_size);
>> +    {
>> +       struct e820ent *lower = reinterpret_cast<struct
>> e820ent*>(e820_buffer);
>> +       lower->ent_size = 20;
>> +       lower->type = 1;
>> +       lower->addr = e820_table[0].addr;
>> +       lower->size = e820_table[0].size;
>> +
>> +       struct e820ent *upper = lower + 1;
>> +       upper->ent_size = 20;
>> +       upper->type = 1;
>> +       upper->addr = e820_table[1].addr;
>> +       upper->size = e820_table[1].size;
>> +    }
>> +
>>      for_each_e820_entry(e820_buffer, e820_size, [] (e820ent ent) {
>>          memory::phys_mem_size += ent.size;
>>      });
>>      constexpr u64 initial_map = 1 << 30; // 1GB mapped by startup code
>>
>> -    u64 time;
>> -    time = omb.tsc_init_hi;
>> -    time = (time << 32) | omb.tsc_init;
>> +    //TODO: We are assuming that bootchart-wise we start here but
>> +    // in reality it all starts at boot.S:start64_. However what happens
>> +    // before this points should take negligible amount of time, no?
>> +    u64 time = 0;
>> +    //time = omb.tsc_init_hi;
>> +    //time = (time << 32) | omb.tsc_init;
>>      boot_time.event(0, "", time );
>>
>> -    time = omb.tsc_disk_done_hi;
>> -    time = (time << 32) | omb.tsc_disk_done;
>> +    //time = omb.tsc_disk_done_hi;
>> +    //time = (time << 32) | omb.tsc_disk_done;
>>      boot_time.event(1, "disk read (real mode)", time );
>>
>> -    time = omb.tsc_uncompress_done_hi;
>> -    time = (time << 32) | omb.tsc_uncompress_done;
>> +    //time = omb.tsc_uncompress_done_hi;
>> +    //time = (time << 32) | omb.tsc_uncompress_done;
>>      boot_time.event(2, "uncompress lzloader.elf", time );
>>
>>      auto c = processor::cpuid(0x80000000);
>> @@ -185,7 +303,10 @@ void arch_setup_free_memory()
>>      elf_size = edata - elf_phys;
>>      mmu::linear_map(elf_start, elf_phys, elf_size, OSV_KERNEL_BASE);
>>      // get rid of the command line, before low memory is unmapped
>> -    parse_cmdline(mb);
>> +    //parse_cmdline(mb);
>> +
>> +    parse_cmdline((char*)cmdline_copy);
>> +
>>      // now that we have some free memory, we can start mapping the rest
>>      mmu::switch_to_runtime_page_tables();
>>      for_each_e820_entry(e820_buffer, e820_size, [] (e820ent ent) {
>> @@ -259,6 +380,7 @@ void arch_init_premain()
>>  #include "drivers/virtio-scsi.hh"
>>  #include "drivers/virtio-net.hh"
>>  #include "drivers/virtio-rng.hh"
>> +#include "drivers/virtio-mmio.hh"
>>  #include "drivers/xenplatform-pci.hh"
>>  #include "drivers/ahci.hh"
>>  #include "drivers/vmw-pvscsi.hh"
>> @@ -268,13 +390,33 @@ void arch_init_premain()
>>  void arch_init_drivers()
>>  {
>>      // initialize panic drivers
>> -    panic::pvpanic::probe_and_setup();
>> +    // pvpanic depends on ACPI which firecracker
>> +    // does not suppot so we disable probing it altogether
>> +    //TODO: Is there a way to detect if ACPI is available and
>> +    //only then probe pvpanic?
>> +    //panic::pvpanic::probe_and_setup();
>>      boot_time.event("pvpanic done");
>>
>>      // Enumerate PCI devices
>> -    pci::pci_device_enumeration();
>> +    // PCI is not supported by firecracker
>> +    //TODO: Is there a way to detect if PCI is present and only
>> enumerate
>> +    //PCI devices then? Somehow even firecracker presents a bus with
>> +    //some dummy devices.
>> +    //pci::pci_device_enumeration();
>>      boot_time.event("pci enumerated");
>>
>> +    // Register any parsed virtio-mmio devices
>> +    for (int d = 0; d < mmio_device_info_count; d++) {
>> +        auto info = mmio_device_info_entries[d];
>> +        auto mmio_device = new virtio::mmio_device(info.address,
>> info.size, info.irq);
>> +        if (mmio_device->parse_config()) {
>> +            device_manager::instance()->register_device(mmio_device);
>> +        }
>> +        else {
>> +            delete mmio_device;
>> +        }
>> +    }
>> +
>>      // Initialize all drivers
>>      hw::driver_manager* drvman = hw::driver_manager::instance();
>>      drvman->register_driver(virtio::blk::probe);
>> diff --git a/arch/x64/boot.S b/arch/x64/boot.S
>> index c4b97e2d..3bdc57ea 100644
>> --- a/arch/x64/boot.S
>> +++ b/arch/x64/boot.S
>> @@ -97,9 +97,10 @@ start64:
>>      sub %rdi, %rcx
>>      xor %eax, %eax
>>      rep stosb
>> +    mov $0x200000, %rbp
>>      mov %rbp, elf_header
>>      # %ebx is set by boot16.S before running the loader
>> -    mov %rbx, osv_multiboot_info
>> +    //mov %rbx, osv_multiboot_info
>>      lea init_stack_top, %rsp
>>      call premain
>>      mov __loader_argc, %edi
>> @@ -107,8 +108,42 @@ start64:
>>      call main
>>      .cfi_endproc
>>
>> +.code64
>> +.global _start64
>> +_start64:
>> +//TODO: Is there a way to switch to protected mode and then jump to
>> start32
>> +//which would be even better than what we do below?
>> +    //For whatever reason at this point (long mode?) we cannot
>> +    //set gdt the way it is done in start32 but linux expects
>> +    //similar one so whatever firecracker sets seems to be OK.
>> +    //Maybe because OSv gdt has 32-bit entry
>> +    //lgdt gdt_desc
>> +    //mov $0x10, %eax
>> +    //mov %eax, %ds
>> +    //mov %eax, %es
>> +    //mov %eax, %fs
>> +    //mov %eax, %gs
>> +    //mov %eax, %ss
>> +    // Disable paging and enable PAE
>> +    mov $BOOT_CR4, %eax
>> +    mov %eax, %cr4
>> +    // Setup page tables
>> +    lea ident_pt_l4, %eax
>> +    mov %eax, %cr3
>> +    // Write contents of EDX:EAX to Model Specific Register specified by
>> ECX register
>> +    mov $0xc0000080, %ecx
>> +    mov $0x00000900, %eax
>> +    xor %edx, %edx
>> +    wrmsr
>> +    // Enable paging
>> +    mov $BOOT_CR0, %eax
>> +    mov %eax, %cr0
>> +    jmp start64
>> +
>>  # The smp trampoline must be in the lower 1MB, so we manually relocate
>>  # it to address 0 by subtracting smpboot from any offset
>> +//TODO: I am pretty sure we have some logic missing for SMP
>> +//given firecracker jumps us into long mode. Not sure how to handle it.
>>  .data
>>  .global smpboot
>>  smpboot:
>> diff --git a/arch/x64/loader.ld b/arch/x64/loader.ld
>> index efe78d52..e08b310c 100644
>> --- a/arch/x64/loader.ld
>> +++ b/arch/x64/loader.ld
>> @@ -108,4 +108,4 @@ PHDRS {
>>          eh_frame PT_GNU_EH_FRAME;
>>          note PT_NOTE;
>>  }
>> -ENTRY(start32);
>> +ENTRY(_start64);
>> diff --git a/arch/x64/power.cc b/arch/x64/power.cc
>> index 81849335..534ffdf0 100644
>> --- a/arch/x64/power.cc
>> +++ b/arch/x64/power.cc
>> @@ -27,17 +27,9 @@ void halt(void)
>>
>>  void poweroff(void)
>>  {
>> -    ACPI_STATUS status = AcpiEnterSleepStatePrep(ACPI_STATE_S5);
>> -    if (ACPI_FAILURE(status)) {
>> -        debug("AcpiEnterSleepStatePrep failed: %s\n",
>> AcpiFormatException(status));
>> -        halt();
>> -    }
>> -    status = AcpiEnterSleepState(ACPI_STATE_S5);
>> -    if (ACPI_FAILURE(status)) {
>> -        debug("AcpiEnterSleepState failed: %s\n",
>> AcpiFormatException(status));
>> -        halt();
>> -    }
>> -
>> +    // Firecracker only supports this as away to shutdown the VM
>> +    // Reset using the 8042 PS/2 Controller ("keyboard controller")
>> +    processor::outb(0xfe, 0x64);
>>      // We shouldn't get here on x86.
>>      halt();
>>  }
>> diff --git a/arch/x64/smp.cc b/arch/x64/smp.cc
>> index 073ef206..4bdae0c6 100644
>> --- a/arch/x64/smp.cc
>> +++ b/arch/x64/smp.cc
>> @@ -74,7 +74,21 @@ void parse_madt()
>>
>>  void smp_init()
>>  {
>> -    parse_madt();
>> +    //Firecracker does not support ACPI so
>> +    //there is no MADT table
>> +    //parse_madt();
>> +
>> +    //TODO: This a nasty hack as we support
>> +    // single vCPU. Eventually we should parse out equivalent
>> information
>> +    // about all vCPUs from MP table (seems like more ancient way).
>> +    // See
>> https://github.com/firecracker-microvm/firecracker/blob/7f29bca9ca197283275eab62fddc1c10ab580794/x86_64/src/mptable.rs
>> +    auto c = new sched::cpu(0);
>> +    c->arch.apic_id = 0;//lapic->Id;
>> +    c->arch.acpi_id = 0;//lapic->ProcessorId;
>> +    c->arch.initstack.next = smp_stack_free;
>> +    smp_stack_free = &c->arch.initstack;
>> +    sched::cpus.push_back(c);
>> +
>>      sched::current_cpu = sched::cpus[0];
>>      for (auto c : sched::cpus) {
>>          c->incoming_wakeups =
>> aligned_array_new<sched::cpu::incoming_wakeup_queue>(sched::cpus.size());
>> diff --git a/drivers/acpi.cc b/drivers/acpi.cc
>> index 506ca68f..006231f1 100644
>> --- a/drivers/acpi.cc
>> +++ b/drivers/acpi.cc
>> @@ -605,7 +605,9 @@ void init()
>>
>>  }
>>
>> -void __attribute__((constructor(init_prio::acpi))) acpi_init_early()
>> -{
>> -     XENPV_ALTERNATIVE({ acpi::early_init(); }, {});
>> -}
>> +//TODO: Is there a way to detect if ACPI is available and do not even
>> +//try to load relevant information? Right now OSv depends on ACPI
>> available
>> +//void __attribute__((constructor(init_prio::acpi))) acpi_init_early()
>> +//{
>> +//     XENPV_ALTERNATIVE({ acpi::early_init(); }, {});
>> +//}
>> diff --git a/drivers/virtio-blk.cc b/drivers/virtio-blk.cc
>> index 0bc6fb0f..fa4cba25 100644
>> --- a/drivers/virtio-blk.cc
>> +++ b/drivers/virtio-blk.cc
>> @@ -115,7 +115,6 @@ bool blk::ack_irq()
>>  blk::blk(virtio_device& virtio_dev)
>>      : virtio_driver(virtio_dev), _ro(false)
>>  {
>> -
>>      _driver_name = "virtio-blk";
>>      _id = _instance++;
>>      virtio_i("VIRTIO BLK INSTANCE %d", _id);
>> @@ -144,6 +143,13 @@ blk::blk(virtio_device& virtio_dev)
>>              [=] { return this->ack_irq(); },
>>              [=] { t->wake(); });
>>      };
>> +
>> +    int_factory.create_gsi_edge_interrupt = [this,t]() {
>> +        return new gsi_edge_interrupt(
>> +                _dev.get_irq(),
>> +                [=] { if (this->ack_irq()) t->wake(); });
>> +    };
>> +
>>      _dev.register_interrupt(int_factory);
>>
>>      // Enable indirect descriptor
>> diff --git a/drivers/virtio-mmio.cc b/drivers/virtio-mmio.cc
>> new file mode 100644
>> index 00000000..27d71848
>> --- /dev/null
>> +++ b/drivers/virtio-mmio.cc
>> @@ -0,0 +1,131 @@
>> +//
>> +// Created by wkozaczuk on 12/25/18.
>> +//
>> +
>> +#include <osv/debug.hh>
>> +#include "virtio-mmio.hh"
>> +
>> +namespace virtio {
>> +
>> +// This implements virtio-io mmio device (transport layer, modeled after
>> PSI).
>> +// Read here -
>> https://www.kraxel.org/virtio/virtio-v1.0-cs03-virtio-gpu.html#x1-1080002
>> +hw_device_id mmio_device::get_id()
>> +{
>> +    return hw_device_id(0x1af4, _device_id);
>> +}
>> +
>> +u8 mmio_device::get_status() {
>> +    return mmio_getl(_addr_mmio + VIRTIO_MMIO_STATUS) & 0xff;
>> +}
>> +
>> +void mmio_device::set_status(u8 status) {
>> +    mmio_setl(_addr_mmio + VIRTIO_MMIO_STATUS, status);
>> +}
>> +
>> +u64 mmio_device::get_available_features() {
>> +    u64 features;
>> +
>> +    mmio_setl(_addr_mmio + VIRTIO_MMIO_DEVICE_FEATURES_SEL, 1);
>> +    features = mmio_getl(_addr_mmio + VIRTIO_MMIO_DEVICE_FEATURES);
>> +    features <<= 32;
>> +
>> +    mmio_setl(_addr_mmio + VIRTIO_MMIO_DEVICE_FEATURES_SEL, 0);
>> +    features |= mmio_getl(_addr_mmio + VIRTIO_MMIO_DEVICE_FEATURES);
>> +
>> +    return features;
>> +}
>> +
>> +void mmio_device::set_enabled_features(u64 features) {
>> +    mmio_setl(_addr_mmio + VIRTIO_MMIO_DRIVER_FEATURES_SEL, 1);
>> +    mmio_setl(_addr_mmio + VIRTIO_MMIO_DRIVER_FEATURES, (u32)(features
>> >> 32));
>> +
>> +    mmio_setl(_addr_mmio + VIRTIO_MMIO_DRIVER_FEATURES_SEL, 0);
>> +    mmio_setl(_addr_mmio + VIRTIO_MMIO_DRIVER_FEATURES, (u32)features);
>> +}
>> +
>> +void mmio_device::kick_queue(int queue_num) {
>> +    mmio_setl(_addr_mmio + VIRTIO_MMIO_QUEUE_NOTIFY, queue_num);
>> +}
>> +
>> +void mmio_device::select_queue(int queue_num) {
>> +    mmio_setl(_addr_mmio + VIRTIO_MMIO_QUEUE_SEL, queue_num);
>> +    assert(!mmio_getl(_addr_mmio + VIRTIO_MMIO_QUEUE_READY));
>> +}
>> +
>> +u16 mmio_device::get_queue_size() {
>> +    return mmio_getl(_addr_mmio + VIRTIO_MMIO_QUEUE_NUM_MAX) & 0xffff;
>> +}
>> +
>> +void mmio_device::setup_queue(vring* queue) {
>> +    // Set size
>> +    mmio_setl(_addr_mmio + VIRTIO_MMIO_QUEUE_NUM, queue->size());
>> +    //
>> +    // Pass addresses
>> +    mmio_setl(_addr_mmio + VIRTIO_MMIO_QUEUE_DESC_LOW,
>> (u32)queue->get_desc_addr());
>> +    mmio_setl(_addr_mmio + VIRTIO_MMIO_QUEUE_DESC_HIGH,
>> (u32)(queue->get_desc_addr() >> 32));
>> +
>> +    mmio_setl(_addr_mmio + VIRTIO_MMIO_QUEUE_AVAIL_LOW,
>> (u32)queue->get_avail_addr());
>> +    mmio_setl(_addr_mmio + VIRTIO_MMIO_QUEUE_AVAIL_HIGH,
>> (u32)(queue->get_avail_addr() >> 32));
>> +
>> +    mmio_setl(_addr_mmio + VIRTIO_MMIO_QUEUE_USED_LOW,
>> (u32)queue->get_used_addr());
>> +    mmio_setl(_addr_mmio + VIRTIO_MMIO_QUEUE_USED_HIGH,
>> (u32)(queue->get_used_addr() >> 32));
>> +}
>> +
>> +void mmio_device::activate_queue(int queue)
>> +{
>> +    //
>> +    // Make it ready
>> +    select_queue(queue);
>> +    mmio_setl(_addr_mmio + VIRTIO_MMIO_QUEUE_READY, 1 );
>> +}
>> +
>> +u8 mmio_device::read_and_ack_isr() {
>> +    //TODO: we might need to guard this read and write with a mutex
>> +    //to prevent two concurrent interrupts raised against same device
>> step
>> +    //on each other. Is it possible? Spec does not seem to say anything
>> about it.
>> +    unsigned long status = mmio_getl(_addr_mmio +
>> VIRTIO_MMIO_INTERRUPT_STATUS);
>> +    //Sometimes this assert would be false (maybe of what I am talking
>> above).
>> +    //assert(status & VIRTIO_MMIO_INT_VRING);
>> +    mmio_setl(_addr_mmio + VIRTIO_MMIO_INTERRUPT_ACK, status);
>> +    //return true;
>> +    return (status & VIRTIO_MMIO_INT_VRING);
>> +}
>> +
>> +u8 mmio_device::read_config(u32 offset) {
>> +    return mmio_getb(_addr_mmio + VIRTIO_MMIO_CONFIG + offset);
>> +}
>> +
>> +void mmio_device::register_interrupt(interrupt_factory irq_factory)
>> +{
>> +    _irq.reset(irq_factory.create_gsi_edge_interrupt());
>> +}
>> +
>> +bool mmio_device::parse_config() {
>> +    _addr_mmio = mmio_map(_address, _size);
>> +
>> +    u32 magic = mmio_getl(_addr_mmio + VIRTIO_MMIO_MAGIC_VALUE);
>> +    if (magic != ('v' | 'i' << 8 | 'r' << 16 | 't' << 24)) {
>> +        return false;
>> +    }
>> +
>> +    // Check device version
>> +    u32 version = mmio_getl(_addr_mmio + VIRTIO_MMIO_VERSION);
>> +    if (version < 1 || version > 2) {
>> +        debugf( "Version %ld not supported!\n", version);
>> +        return false;
>> +    }
>> +
>> +    _device_id = mmio_getl(_addr_mmio + VIRTIO_MMIO_DEVICE_ID);
>> +    if (_device_id == 0) {
>> +        //
>> +        // virtio-mmio device with an ID 0 is a (dummy) placeholder
>> +        // with no function. End probing now with no error reported.
>> +        debug( "Dummy virtio-mmio device detected!\n");
>> +        return false;
>> +    }
>> +    _vendor_id = mmio_getl(_addr_mmio + VIRTIO_MMIO_VENDOR_ID);
>> +
>> +    debugf("Detected virtio-mmio device: (%ld,%ld)\n", _device_id,
>> _vendor_id);
>> +    return true;
>> +}
>> +}
>> diff --git a/drivers/virtio-mmio.hh b/drivers/virtio-mmio.hh
>> new file mode 100644
>> index 00000000..25c88f6a
>> --- /dev/null
>> +++ b/drivers/virtio-mmio.hh
>> @@ -0,0 +1,148 @@
>> +//
>> +// Created by wkozaczuk on 12/25/18.
>> +//
>> +
>> +#ifndef VIRTIO_MMIO_DEVICE_HH
>> +#define VIRTIO_MMIO_DEVICE_HH
>> +
>> +#include <osv/types.h>
>> +#include <osv/mmio.hh>
>> +#include "virtio-device.hh"
>> +#include "virtio-vring.hh"
>> +
>> +using namespace hw;
>> +
>> +/* Magic value ("virt" string) - Read Only */
>> +#define VIRTIO_MMIO_MAGIC_VALUE                0x000
>> +
>> +/* Virtio device version - Read Only */
>> +#define VIRTIO_MMIO_VERSION                0x004
>> +
>> +/* Virtio device ID - Read Only */
>> +#define VIRTIO_MMIO_DEVICE_ID                0x008
>> +
>> +/* Virtio vendor ID - Read Only */
>> +#define VIRTIO_MMIO_VENDOR_ID                0x00c
>> +
>> +/* Bitmask of the features supported by the device (host)
>> + * (32 bits per set) - Read Only */
>> +#define VIRTIO_MMIO_DEVICE_FEATURES        0x010
>> +
>> +/* Device (host) features set selector - Write Only */
>> +#define VIRTIO_MMIO_DEVICE_FEATURES_SEL        0x014
>> +
>> +/* Bitmask of features activated by the driver (guest)
>> + * (32 bits per set) - Write Only */
>> +#define VIRTIO_MMIO_DRIVER_FEATURES        0x020
>> +
>> +/* Activated features set selector - Write Only */
>> +#define VIRTIO_MMIO_DRIVER_FEATURES_SEL        0x024
>> +
>> +/* Queue selector - Write Only */
>> +#define VIRTIO_MMIO_QUEUE_SEL                0x030
>> +
>> +/* Maximum size of the currently selected queue - Read Only */
>> +#define VIRTIO_MMIO_QUEUE_NUM_MAX        0x034
>> +
>> +/* Queue size for the currently selected queue - Write Only */
>> +#define VIRTIO_MMIO_QUEUE_NUM                0x038
>> +
>> +/* Ready bit for the currently selected queue - Read Write */
>> +#define VIRTIO_MMIO_QUEUE_READY                0x044
>> +
>> +/* Queue notifier - Write Only */
>> +#define VIRTIO_MMIO_QUEUE_NOTIFY        0x050
>> +
>> +/* Interrupt status - Read Only */
>> +#define VIRTIO_MMIO_INTERRUPT_STATUS        0x060
>> +
>> +/* Interrupt acknowledge - Write Only */
>> +#define VIRTIO_MMIO_INTERRUPT_ACK        0x064
>> +
>> +/* Device status register - Read Write */
>> +#define VIRTIO_MMIO_STATUS                0x070
>> +
>> +/* Selected queue's Descriptor Table address, 64 bits in two halves */
>> +#define VIRTIO_MMIO_QUEUE_DESC_LOW        0x080
>> +#define VIRTIO_MMIO_QUEUE_DESC_HIGH        0x084
>> +
>> +/* Selected queue's Available Ring address, 64 bits in two halves */
>> +#define VIRTIO_MMIO_QUEUE_AVAIL_LOW        0x090
>> +#define VIRTIO_MMIO_QUEUE_AVAIL_HIGH        0x094
>> +
>> +/* Selected queue's Used Ring address, 64 bits in two halves */
>> +#define VIRTIO_MMIO_QUEUE_USED_LOW        0x0a0
>> +#define VIRTIO_MMIO_QUEUE_USED_HIGH        0x0a4
>> +
>> +/* Configuration atomicity value */
>> +#define VIRTIO_MMIO_CONFIG_GENERATION        0x0fc
>> +
>> +/* The config space is defined by each driver as
>> + * the per-driver configuration space - Read Write */
>> +#define VIRTIO_MMIO_CONFIG                0x100
>> +
>> +#define VIRTIO_MMIO_INT_VRING                (1 << 0)
>> +#define VIRTIO_MMIO_INT_CONFIG                (1 << 1)
>> +
>> +namespace virtio {
>> +
>> +class mmio_device : public virtio_device {
>> +public:
>> +    mmio_device(u64 address, u64 size, unsigned int irq) :
>> +        _address(address), _size(size), _irq_no(irq),
>> +        _vendor_id(0), _device_id(0), _addr_mmio(0) {}
>> +
>> +    virtual ~mmio_device() {}
>> +
>> +    virtual hw_device_id get_id();
>> +
>> +    virtual void init() {}
>> +    virtual void print() {}
>> +    virtual void reset() {}
>> +
>> +    virtual unsigned get_irq() { return _irq_no; }
>> +    virtual u8 read_and_ack_isr();
>> +    virtual void register_interrupt(interrupt_factory irq_factory);
>> +
>> +    virtual void select_queue(int queue);
>> +    virtual u16 get_queue_size();
>> +    virtual void setup_queue(vring *queue);
>> +    virtual void activate_queue(int queue);
>> +    virtual void kick_queue(int queue);
>> +
>> +    virtual u64 get_available_features();
>> +    virtual bool get_available_feature_bit(int bit) { return false; }
>> //TODO: Should be implemented ?
>> +
>> +    virtual void set_enabled_features(u64 features);
>> +    virtual u64 get_enabled_features() { return 0; } //TODO: Should be
>> implemented - function calling it in virtio.cc is not used?
>> +    virtual bool get_enabled_feature_bit(int bit) { return false; }
>> //TODO: Should be implemented - function calling it in virtio.cc is not
>> used?
>> +
>> +    virtual u8 get_status();
>> +    virtual void set_status(u8 status);
>> +
>> +    virtual u8 read_config(u32 offset);
>> +    virtual void write_config(u32 offset, u8 byte) {} //TODO: Implement
>> - function
>> +    // calling it (virtio_driver::virtio_conf_write) is not used by
>> anything?
>> +
>> +    virtual void dump_config() {} //TODO: Implement it
>> +
>> +    virtual bool is_modern() { return true; };
>> +    virtual size_t get_vring_alignment() { return 4096;
>> }//VIRTIO_PCI_VRING_ALIGN; };
>> +
>> +    bool parse_config();
>> +
>> +private:
>> +    u64 _address;
>> +    u64 _size;
>> +    unsigned int _irq_no;
>> +    //u64 _id;
>> +    u16 _vendor_id;
>> +    u16 _device_id;
>> +
>> +    mmioaddr_t _addr_mmio;
>> +    std::unique_ptr<gsi_edge_interrupt> _irq;
>> +};
>> +
>> +}
>> +
>> +#endif //VIRTIO_MMIO_DEVICE_HH
>> diff --git a/drivers/virtio-net.cc b/drivers/virtio-net.cc
>> index ef2dff66..3522a80d 100644
>> --- a/drivers/virtio-net.cc
>> +++ b/drivers/virtio-net.cc
>> @@ -316,6 +316,13 @@ net::net(virtio_device& dev)
>>              [=] { return this->ack_irq(); },
>>              [=] { poll_task->wake(); });
>>      };
>> +
>> +    int_factory.create_gsi_edge_interrupt = [this,poll_task]() {
>> +        return new gsi_edge_interrupt(
>> +            _dev.get_irq(),
>> +            [=] { if (this->ack_irq()) poll_task->wake(); });
>> +    };
>> +
>>      _dev.register_interrupt(int_factory);
>>
>>      fill_rx_ring();
>> diff --git a/drivers/virtio.cc b/drivers/virtio.cc
>> index 5ef0f94d..9b778522 100644
>> --- a/drivers/virtio.cc
>> +++ b/drivers/virtio.cc
>> @@ -180,11 +180,16 @@ bool virtio_driver::get_device_feature_bit(int bit)
>>  void virtio_driver::set_guest_features(u64 features)
>>  {
>>      _dev.set_enabled_features(features);
>> +    _enabled_features = features;
>>  }
>>
>>  bool virtio_driver::get_guest_feature_bit(int bit)
>>  {
>> -    return _dev.get_enabled_feature_bit(bit);
>> +    if (_dev.is_modern()) //TODO: SO far only mmio - will work with
>> modern pci?
>> +        return (_enabled_features & (1 << bit)) != 0;
>> +    else
>> +        return _dev.get_enabled_feature_bit(bit);
>> +    //return _dev.get_enabled_feature_bit(bit);
>>  }
>>
>>  u8 virtio_driver::get_dev_status()
>> diff --git a/drivers/virtio.hh b/drivers/virtio.hh
>> index ef915e1c..4b59a91a 100644
>> --- a/drivers/virtio.hh
>> +++ b/drivers/virtio.hh
>> @@ -114,6 +114,7 @@ protected:
>>      bool _cap_indirect_buf;
>>      bool _cap_event_idx = false;
>>      static int _disk_idx;
>> +    u64 _enabled_features;
>>  };
>>
>>  template <typename T, u16 ID>
>> diff --git a/loader.cc b/loader.cc
>> index c29bf4c4..530e3816 100644
>> --- a/loader.cc
>> +++ b/loader.cc
>> @@ -55,6 +55,8 @@
>>  #include "drivers/null.hh"
>>
>>  #include "libc/network/__dns.hh"
>> +#include "early-console.hh"
>> +#include <processor.hh>
>>
>>  using namespace osv;
>>  using namespace osv::clock::literals;
>> @@ -426,6 +428,7 @@ void* do_main_thread(void *_main_args)
>>      }
>>
>>      boot_time.event("Total time");
>> +    processor::outb(123, 0x3f0);
>>
>>      if (opt_bootchart) {
>>          boot_time.print_chart();
>> @@ -545,7 +548,7 @@ void main_cont(int loader_argc, char** loader_argv)
>>      memory::enable_debug_allocator();
>>
>>  #ifndef AARCH64_PORT_STUB
>> -    acpi::init();
>> +    //acpi::init();
>>  #endif /* !AARCH64_PORT_STUB */
>>
>>      if (sched::cpus.size() > sched::max_cpus) {
>> --
>> 2.19.1
>>
>> --
> You received this message because you are subscribed to the Google Groups
> "OSv Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>

-- 
You received this message because you are subscribed to the Google Groups "OSv 
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Reply via email to