On Wednesday, January 16, 2019 at 8:58:46 AM UTC-5, Nadav Har'El wrote: > > > On Wed, Jan 16, 2019 at 7:21 AM Waldemar Kozaczuk <[email protected] > <javascript:>> wrote: > > This patch refactors virtio layer in order to support > modern PCI and mmio devices. It is essentially a step 0 > in the list of tasks to make OSv boot on AWS firecracker. > > > This patch introduces virtio_device class that abstracts > virtio transport layer. That way virtio_driver class and > its specializations can interact with host without being > tied to a specific transport (PCI, mmio, etc). > > The virtio_pci_device, virtio_legacy_pci_device and > virtio_modern_pci_device > classes define specializations of virtio_device for PCI transport. > This patch effectively moves all PCI-related logic from virtio_driver > to virtio_legacy_pci_device classi and effectively virtio drivers > transport agnostic. Followup patches will provide implementations > of virtio_modern_pci_device and future virtio_mmio_device. > > > The interrupt_factory struct defines a mechanism used > by virtio drivers to specify particulars of creating/registering > proper interrupt logic for each transport. > > > Thanks! > > This is a long patch which looks mostly good to me, but it has a lot of > things so I left a few comments below but I can't really tell if all > the details are correct or not, so I would appreciate knowing more on how > you tested it. In particular you should see that in the > normal case ("legacy" virtio in qemu) the network, disk, and random number > generator drivers continue to work as expected. > Would be nice to also get to other edge cases (various old interrupt > types) but I'm not even sure how we ever reached those. > Yeah I am not sure how to make it shorter. I simply ran all the unit tests many times and they pass. I also tested couple of apps with virtio-scsi on. I guess I could run all unit tests with them as well. Same for older interrupt types - would removing "+x2apic" make it use old interrupt. I will also test non-virtio devices just in case. Overall this patch should not change any existing functionality. I must say that I never saw any failures on my Linux laptop but I did see occasional failures of some networking unit tests when running in nested virtualization - KVM off. But I think unit tests would occasionally fail in nested virtualization scenario before my patch as well. But I will double verify it.
Also this patch should not affect XEN or Vmware, should it? > > > Signed-off-by: Waldemar Kozaczuk <[email protected] <javascript:>> > --- > Makefile | 1 + > drivers/device.hh | 2 + > drivers/pci-generic.cc | 25 +++- > drivers/virtio-assign.cc | 40 ++----- > drivers/virtio-blk.cc | 29 +++-- > drivers/virtio-blk.hh | 5 +- > drivers/virtio-device.hh | 78 ++++++++++++ > drivers/virtio-net.cc | 38 +++--- > drivers/virtio-net.hh | 4 +- > drivers/virtio-pci-device.cc | 223 +++++++++++++++++++++++++++++++++++ > drivers/virtio-pci-device.hh | 150 +++++++++++++++++++++++ > drivers/virtio-rng.cc | 16 ++- > drivers/virtio-rng.hh | 3 +- > drivers/virtio-scsi.cc | 32 ++--- > drivers/virtio-scsi.hh | 5 +- > drivers/virtio-vring.cc | 15 +-- > drivers/virtio-vring.hh | 4 +- > drivers/virtio.cc | 120 +++++-------------- > drivers/virtio.hh | 87 ++------------ > drivers/vmw-pvscsi.hh | 4 +- > 20 files changed, 611 insertions(+), 270 deletions(-) > create mode 100644 drivers/virtio-device.hh > create mode 100644 drivers/virtio-pci-device.cc > create mode 100644 drivers/virtio-pci-device.hh > > diff --git a/Makefile b/Makefile > index 68043485..7a88c56a 100644 > --- a/Makefile > +++ b/Makefile > @@ -819,6 +819,7 @@ drivers += $(libtsm) > drivers += drivers/vga.o drivers/kbd.o drivers/isa-serial.o > drivers += arch/$(arch)/pvclock-abi.o > drivers += drivers/virtio.o > +drivers += drivers/virtio-pci-device.o > drivers += drivers/virtio-vring.o > drivers += drivers/virtio-net.o > drivers += drivers/virtio-assign.o > diff --git a/drivers/device.hh b/drivers/device.hh > index 3882470a..2fbd0afd 100644 > --- a/drivers/device.hh > +++ b/drivers/device.hh > @@ -34,6 +34,8 @@ namespace hw { > && _device_id == other._device_id; > } > > + bool is_vendor(u16 vendor_id) { return _vendor_id == vendor_id; } > + > private: > u16 _vendor_id; > u16 _device_id; > diff --git a/drivers/pci-generic.cc b/drivers/pci-generic.cc > index 61e7a0f5..821111c3 100644 > --- a/drivers/pci-generic.cc > +++ b/drivers/pci-generic.cc > @@ -16,6 +16,8 @@ > #include "drivers/pci-function.hh" > #include "drivers/pci-bridge.hh" > #include "drivers/pci-device.hh" > +#include "drivers/virtio.hh" > +#include "drivers/virtio-pci-device.hh" > > namespace pci { > > @@ -93,11 +95,24 @@ bool check_bus(u16 bus) > break; > } > > - if (!device_manager::instance()->register_device(dev)) { > - pci_e("Error: couldn't register device %02x:%02x.%x", > - bus, slot, func); > - //TODO: Need to beautify it as multiple instances of the > device may exist > - delete dev; > + if (auto pci_dev = dynamic_cast<device*>(dev)) { > > > This test is just for checking dev is not a bridge? Or something else? > Yep. > > > + if > (pci_dev->get_id().is_vendor(virtio::VIRTIO_VENDOR_ID)) { > + auto virtio_dev = > virtio::create_virtio_device(pci_dev); > + if > (!device_manager::instance()->register_device(virtio_dev)) { > + pci_e("Error: couldn't register virtio pci device > %02x:%02x.%x", > + bus, slot, func); > + //TODO: Need to beautify it as multiple instances > of the device may exist > + delete dev; > + } > + } > > > Shouldn't we have an else here? Perhaps reporting an unknown device, or > soemthing? > Sure. > > + } > + else { > > + if (!device_manager::instance()->register_device(dev)) { > + pci_e("Error: couldn't register device %02x:%02x.%x", > + bus, slot, func); > + //TODO: Need to beautify it as multiple instances of > the device may exist > + delete dev; > + } > > > This code code (register_device() call, error handling, etc.) is > duplicated. Would be nice not to, I think (perhaps by making this regular > code, not an "else", and having the other successful cases fall through to > this registration). > > Sure. > } > > // test for multiple functions > diff --git a/drivers/virtio-assign.cc b/drivers/virtio-assign.cc > index 8a128d5e..427811b6 100644 > --- a/drivers/virtio-assign.cc > +++ b/drivers/virtio-assign.cc > @@ -7,8 +7,10 @@ > > > #include <osv/virtio-assign.hh> > +#include <drivers/virtio-device.hh> > #include <drivers/virtio-net.hh> > > + > // Currently we support only one assigned virtio device, but more could > // easily be added later. > static osv::assigned_virtio *the_assigned_virtio_device = nullptr; > @@ -23,8 +25,8 @@ assigned_virtio *assigned_virtio::get() { > class impl : public osv::assigned_virtio, public virtio::virtio_driver { > public: > static hw_driver* probe_net(hw_device* dev); > - explicit impl(pci::device& dev) > - : virtio_driver(dev) > + explicit impl(virtio::virtio_device& virtio_dev) > + : virtio_driver(virtio_dev) > { > assert(!the_assigned_virtio_device); > the_assigned_virtio_device = this; > @@ -48,9 +50,8 @@ public: > > virtual u32 queue_size(int queue) override > { > - virtio_conf_writew(virtio::VIRTIO_PCI_QUEUE_SEL, queue); > - return virtio_conf_readw(virtio::VIRTIO_PCI_QUEUE_NUM); > - > + _dev.select_queue(queue); > + return _dev.get_queue_size(); > } > virtual u32 init_features(u32 driver_features) override > { > @@ -70,34 +71,13 @@ public: > auto *ht = &_hack_threads.back(); // assumes object won't move > later > handler = [ht] { ht->wake(); }; > > - // OSv's generic virtio driver has already set the device to > msix, and set > - // the VIRTIO_MSI_QUEUE_VECTOR of its queue to its number. > - assert(_dev.is_msix()); > - virtio_conf_writew(virtio::VIRTIO_PCI_QUEUE_SEL, queue); > - assert(virtio_conf_readw(virtio::VIRTIO_MSI_QUEUE_VECTOR) == > queue); > - if (!_dev.is_msix_enabled()) { > - _dev.msix_enable(); > - } > - auto vectors = _msi.request_vectors(1); > - assert(vectors.size() == 1); > - auto vec = vectors[0]; > - // TODO: in _msi.easy_register() we also have code for moving the > - // interrupt's affinity to where the handling thread is. We should > - // probably do this here too. > - _msi.assign_isr(vec, handler); > - auto ok = _msi.setup_entry(queue, vec); > - assert(ok); > - vec->msix_unmask_entries(); > + _dev.register_interrupt(queue,handler); > > > Being very rusty with this code, I am not sure if these changes make sense > or not. Would have been > really great if you could test whether "assigned virtio" still works. But > this will probably be difficult to > test because assigned virtio (see > 86748d776b5fd7bcbb0af61da22ef148d40c66f0) is a feature we > invented to tell OSv to ignore a virtio-net device and instead let the > application (namely, an old version > of Seastar) read from the virtio queue directly. > > If we can't test it still works, I wonder if we should not, with deep > regrets, drop the "assigned virtio" feature :-( > I am very much in favor of you suggestion. Honestly I have no idea if my changes do not break this functionality. I was not even aware that it used to be used by seastar. Unless somebody strongly disagrees I am more than happy to send a separate patch removing assigned virtio. This would also make this patch smaller and somewhat simpler. Anybody strongly disagrees? Shall I send a separate email to the group? > > > } > > virtual void set_queue_pfn(int queue, u64 phys) override > { > - virtio_conf_writew(virtio::VIRTIO_PCI_QUEUE_SEL, queue); > - // Tell host about pfn > - u64 pfn = phys >> virtio::VIRTIO_PCI_QUEUE_ADDR_SHIFT; > - // A bug in virtio's design... on large memory, this can actually > happen > - assert(pfn <= std::numeric_limits<u32>::max()); > - virtio_conf_writel(virtio::VIRTIO_PCI_QUEUE_PFN, (u32)pfn); > + _dev.select_queue(queue); > + _dev.activate_queue(phys); > } > > virtual void set_driver_ok() override > @@ -107,7 +87,7 @@ public: > > virtual void conf_read(void *buf, int length) override > { > - virtio_conf_read(virtio_pci_config_offset(), buf, length); > + virtio_conf_read(_dev.config_offset(), buf, length); > } > > private: > diff --git a/drivers/virtio-blk.cc b/drivers/virtio-blk.cc > index a3f7898d..5f9f2028 100644 > --- a/drivers/virtio-blk.cc > +++ b/drivers/virtio-blk.cc > @@ -10,7 +10,6 @@ > > #include "drivers/virtio.hh" > #include "drivers/virtio-blk.hh" > -#include "drivers/pci-device.hh" > #include <osv/interrupt.hh> > > #include <osv/mempool.hh> > @@ -100,7 +99,7 @@ struct driver blk_driver = { > > bool blk::ack_irq() > { > - auto isr = virtio_conf_readb(VIRTIO_PCI_ISR); > + auto isr = _dev.ack_irq(); > > > Why do you call the method returning the isr "ack_irq"? > (or, alternatively, why call its return value "isr"?) > I will revisit it. > > > auto queue = get_virt_queue(0); > > if (isr) { > @@ -112,8 +111,8 @@ bool blk::ack_irq() > > } > > -blk::blk(pci::device& pci_dev) > - : virtio_driver(pci_dev), _ro(false) > +blk::blk(virtio_device& virtio_dev) > + : virtio_driver(virtio_dev), _ro(false) > { > > _driver_name = "virtio-blk"; > @@ -128,13 +127,19 @@ blk::blk(pci::device& pci_dev) > sched::thread::attr().name("virtio-blk")); > t->start(); > auto queue = get_virt_queue(0); > - if (pci_dev.is_msix()) { > - _msi.easy_register({ { 0, [=] { queue->disable_interrupts(); }, t > } }); > - } else { > - _irq.reset(new pci_interrupt(pci_dev, > - [=] { return ack_irq(); }, > - [=] { t->wake(); })); > - } > + > + interrupt_factory int_factory; > + int_factory.register_msi_bindings = [queue, t](interrupt_manager > &msi) { > + msi.easy_register( {{ 0, [=] { queue->disable_interrupts(); }, t > }}); > + }; > + > + int_factory.create_pci_interrupt = [this,t](pci::device &pci_dev) { > + return new pci_interrupt( > + pci_dev, > + [=] { return this->ack_irq(); }, > + [=] { t->wake(); }); > + }; > + _dev.register_interrupt(int_factory); > > // Enable indirect descriptor > queue->set_use_indirect(true); > @@ -164,7 +169,7 @@ blk::~blk() > void blk::read_config() > { > //read all of the block config (including size, mce, topology,..) in > one shot > - virtio_conf_read(virtio_pci_config_offset(), &_config, > sizeof(_config)); > + virtio_conf_read(_dev.config_offset(), &_config, sizeof(_config)); > > trace_virtio_blk_read_config_capacity(_config.capacity); > > diff --git a/drivers/virtio-blk.hh b/drivers/virtio-blk.hh > index 17cf4d18..8e814501 100644 > --- a/drivers/virtio-blk.hh > +++ b/drivers/virtio-blk.hh > @@ -8,7 +8,7 @@ > #ifndef VIRTIO_BLK_DRIVER_H > #define VIRTIO_BLK_DRIVER_H > #include "drivers/virtio.hh" > -#include "drivers/pci-device.hh" > +#include "drivers/virtio-device.hh" > #include <osv/bio.h> > > namespace virtio { > @@ -118,7 +118,7 @@ public: > u8 status; > }; > > - explicit blk(pci::device& dev); > + explicit blk(virtio_device& dev); > virtual ~blk(); > > virtual std::string get_name() const { return _driver_name; } > @@ -157,7 +157,6 @@ private: > bool _ro; > // This mutex protects parallel make_request invocations > mutex _lock; > - std::unique_ptr<pci_interrupt> _irq; > }; > > } > diff --git a/drivers/virtio-device.hh b/drivers/virtio-device.hh > new file mode 100644 > index 00000000..e6e29155 > --- /dev/null > +++ b/drivers/virtio-device.hh > @@ -0,0 +1,78 @@ > +/* > + * Copyright (C) 2019 Waldemar Kozaczuk. > + * > + * This work is open source software, licensed under the terms of the > + * BSD license as described in the LICENSE file in the top-level > directory. > + */ > + > +#ifndef VIRTIO_DEVICE_HH > +#define VIRTIO_DEVICE_HH > + > +#include <osv/types.h> > +#include <osv/pci.hh> > +#include <osv/interrupt.hh> > +#include <osv/msi.hh> > +#include "virtio-vring.hh" > +#include "device.hh" > + > +using namespace hw; > + > +#define VIRTIO_ALIGN(x,alignment) ((x + (alignment-1)) & ~(alignment-1)) > + > +namespace virtio { > + > +// Allows virtio drivers specify how to instantiate interrupts or > +// register msi bindings. The associated virtio-device will > +// use adequate functor to create correct interrupt in order > +// to register it. > +struct interrupt_factory { > + std::function<void(interrupt_manager &)> register_msi_bindings = > nullptr; > + std::function<pci_interrupt *(pci::device &)> create_pci_interrupt = > nullptr; > + std::function<gsi_edge_interrupt *()> create_gsi_edge_interrupt = > nullptr; > +}; > + > +// Defines virtio transport abstraction used by virtio-driver > +// to communicate with virtio device. The specializations of thise > > > thise -> this > > +// include virtio pci device (legacy and modern) as well > +// as virtio mmio device. This abstraction allows virtio driver > +// not be tied to any specific transport (pci, mmio). > +class virtio_device : public hw_device { > +public: > + virtual ~virtio_device() {}; > + > + virtual void init() = 0; > + > + virtual u8 ack_irq() = 0; > + virtual void register_interrupt(interrupt_factory irq_factory) = 0; > + virtual void register_interrupt(unsigned int queue, > std::function<void(void)> handler) = 0; > + > + virtual void select_queue(int queue) = 0; > + virtual u16 get_queue_size() = 0; > + virtual void setup_queue(int queue) = 0; > + virtual void activate_queue(vring *queue) = 0; > + virtual void activate_queue(u64 phys) = 0; > + virtual void kick_queue(int queue) = 0; > + > + virtual u64 get_available_features() = 0; > + virtual bool get_available_feature_bit(int bit) = 0; > + > + virtual void set_enabled_features(u64 features) = 0; > + virtual void set_enabled_feature_bit(int bit, bool on) = 0; > + virtual u64 get_enabled_features() = 0; > + virtual bool get_enabled_feature_bit(int bit) = 0; > + > + virtual u8 get_status() = 0; > + virtual void set_status(u8 status) = 0; > + > + virtual u8 read_config(u32 offset) = 0; > + virtual void write_config(u32 offset, u8 byte) = 0; > + virtual int config_offset() = 0; > + virtual void dump_config() = 0; > + > + virtual bool is_modern() = 0; > + virtual size_t get_vring_alignment() = 0; > +}; > + > +} > + > +#endif //VIRTIO_DEVICE_HH > diff --git a/drivers/virtio-net.cc b/drivers/virtio-net.cc > index b820c36c..3528613d 100644 > --- a/drivers/virtio-net.cc > +++ b/drivers/virtio-net.cc > @@ -10,7 +10,6 @@ > > #include "drivers/virtio.hh" > #include "drivers/virtio-net.hh" > -#include "drivers/pci-device.hh" > #include <osv/interrupt.hh> > > #include <osv/mempool.hh> > @@ -217,7 +216,7 @@ void net::fill_qstats(const struct txq& txq, struct > if_data* out_data) const > > bool net::ack_irq() > { > - auto isr = virtio_conf_readb(VIRTIO_PCI_ISR); > + auto isr = _dev.ack_irq(); > > if (isr) { > _rxq.vqueue->disable_interrupts(); > @@ -228,7 +227,7 @@ bool net::ack_irq() > > } > > -net::net(pci::device& dev) > +net::net(virtio_device& dev) > : virtio_driver(dev), > _rxq(get_virt_queue(0), [this] { this->receiver(); }), > _txq(this, get_virt_queue(1)) > @@ -290,16 +289,21 @@ net::net(pci::device& dev) > > ether_ifattach(_ifn, _config.mac); > > - if (dev.is_msix()) { > - _msi.easy_register({ > - { 0, [&] { _rxq.vqueue->disable_interrupts(); }, poll_task }, > - { 1, [&] { _txq.vqueue->disable_interrupts(); }, nullptr } > - }); > - } else { > - _irq.reset(new pci_interrupt(dev, > - [=] { return this->ack_irq(); }, > - [=] { poll_task->wake(); })); > - } > + interrupt_factory int_factory; > + int_factory.register_msi_bindings = > [this,poll_task](interrupt_manager &msi) { > + msi.easy_register({ > + { 0, [&] { this->_rxq.vqueue->disable_interrupts(); }, > poll_task }, > + { 1, [&] { this->_txq.vqueue->disable_interrupts(); }, nullptr > } > + }); > + }; > + > + int_factory.create_pci_interrupt = [this,poll_task](pci::device > &pci_dev) { > + return new pci_interrupt( > + pci_dev, > + [=] { return this->ack_irq(); }, > + [=] { poll_task->wake(); }); > + }; > + _dev.register_interrupt(int_factory); > > I am using here nested lambdas (lambdas that pass lambdas). My C++ experience is before C++11 so I wonder if there is no benign error especially around capturing by value vs reference. For example the outer lambda captures "this" and "poll_task" pointers by value - are we sure it does not affect the nested lambdas in some unexpected way? Also I am wondering what you think about this interrupt_factory struct as a way to register/create interrupts specific to each transport? Do you think there is a better way of doing it? > fill_rx_ring(); > > @@ -324,7 +328,7 @@ net::~net() > void net::read_config() > { > //read all of the net config in one shot > - virtio_conf_read(virtio_pci_config_offset(), &_config, > sizeof(_config)); > + virtio_conf_read(_dev.config_offset(), &_config, sizeof(_config)); > > if (get_guest_feature_bit(VIRTIO_NET_F_MAC)) > net_i("The mac addr of the device is %x:%x:%x:%x:%x:%x", > @@ -856,12 +860,12 @@ u32 net::get_driver_features() > > hw_driver* net::probe(hw_device* dev) > { > - if (auto pci_dev = dynamic_cast<pci::device*>(dev)) { > - if (pci_dev->get_id() == hw_device_id(VIRTIO_VENDOR_ID, > VIRTIO_NET_DEVICE_ID)) { > + if (auto virtio_dev = dynamic_cast<virtio_device*>(dev)) { > + if (virtio_dev->get_id() == hw_device_id(VIRTIO_VENDOR_ID, > VIRTIO_NET_DEVICE_ID)) { > if (opt_maxnic && maxnic-- <= 0) { > return nullptr; > } else { > - return aligned_new<net>(*pci_dev); > + return aligned_new<net>(*virtio_dev); > } > } > } > diff --git a/drivers/virtio-net.hh b/drivers/virtio-net.hh > index ad323e69..e9f6dd0d 100644 > --- a/drivers/virtio-net.hh > +++ b/drivers/virtio-net.hh > @@ -204,7 +204,7 @@ public: > u16 virtqueue_pairs; > }; > > - explicit net(pci::device& dev); > + explicit net(virtio_device& dev); > virtual ~net(); > > virtual std::string get_name() const { return _driver_name; } > @@ -269,8 +269,6 @@ private: > > u32 _hdr_size; > > - std::unique_ptr<pci_interrupt> _irq; > - > struct rxq_stats { > u64 rx_packets; /* if_ipackets */ > u64 rx_bytes; /* if_ibytes */ > diff --git a/drivers/virtio-pci-device.cc b/drivers/virtio-pci-device.cc > new file mode 100644 > index 00000000..a3ccfe26 > --- /dev/null > +++ b/drivers/virtio-pci-device.cc > @@ -0,0 +1,223 @@ > +/* > + * Copyright (C) 2019 Waldemar Kozaczuk. > + * > + * This work is open source software, licensed under the terms of the > + * BSD license as described in the LICENSE file in the top-level > directory. > + */ > + > +#include "drivers/virtio-pci-device.hh" > + > +namespace virtio { > + > +virtio_pci_device::virtio_pci_device(pci::device *dev) > + : virtio_device() > + , _dev(dev) > + , _msi(dev) > +{ > +} > + > +virtio_pci_device::~virtio_pci_device() { > + delete _dev; > +} > + > +virtio_legacy_pci_device::virtio_legacy_pci_device(pci::device *dev) > + : virtio_pci_device(dev) > +{ > +} > + > +void virtio_legacy_pci_device::kick_queue(int queue) > +{ > + virtio_conf_writew(VIRTIO_PCI_QUEUE_NOTIFY, queue); > +} > + > +void virtio_legacy_pci_device::setup_queue(int queue) > +{ > + if (_dev->is_msix()) { > + // Setup queue_id:entry_id 1:1 correlation... > + virtio_conf_writew(VIRTIO_MSI_QUEUE_VECTOR, queue); > + if (virtio_conf_readw(VIRTIO_MSI_QUEUE_VECTOR) != queue) { > + virtio_e("Setting MSIx entry for queue %d failed.", queue); > + return; > + } > + } > +} > + > +void virtio_legacy_pci_device::activate_queue(vring *queue) > +{ > + // Tell host about pfn > + // TODO: Yak, this is a bug in the design, on large memory we'll have > PFNs > 32 bit > + // Dor to notify Rusty > + virtio_conf_writel(VIRTIO_PCI_QUEUE_PFN, (u32)(queue->get_paddr() >> > VIRTIO_PCI_QUEUE_ADDR_SHIFT)); > +} > + > +void virtio_legacy_pci_device::activate_queue(u64 phys) > +{ > + // Tell host about pfn > + u64 pfn = phys >> VIRTIO_PCI_QUEUE_ADDR_SHIFT; > + // A bug in virtio's design... on large memory, this can actually > happen > + assert(pfn <= std::numeric_limits<u32>::max()); > + virtio_conf_writel(VIRTIO_PCI_QUEUE_PFN, (u32)pfn); > +} > + > +void virtio_legacy_pci_device::select_queue(int queue) > +{ > + virtio_conf_writew(VIRTIO_PCI_QUEUE_SEL, queue); > +} > + > +u16 virtio_legacy_pci_device::get_queue_size() > +{ > + return virtio_conf_readw(VIRTIO_PCI_QUEUE_NUM); > +} > + > +bool virtio_legacy_pci_device::get_virtio_config_bit(u32 offset, int bit) > +{ > + return virtio_conf_readl(offset) & (1 << bit); > +} > + > +void virtio_legacy_pci_device::set_virtio_config_bit(u32 offset, int bit, > bool on) > +{ > + u32 val = virtio_conf_readl(offset); > + u32 newval = ( val & ~(1 << bit) ) | ((int)(on)<<bit); > + virtio_conf_writel(offset, newval); > +} > + > +u64 virtio_legacy_pci_device::get_available_features() > +{ > + return virtio_conf_readl(VIRTIO_PCI_HOST_FEATURES); > +} > + > +bool virtio_legacy_pci_device::get_available_feature_bit(int bit) > +{ > + return get_virtio_config_bit(VIRTIO_PCI_HOST_FEATURES, bit); > +} > + > +void virtio_legacy_pci_device::set_enabled_features(u64 features) > +{ > + virtio_conf_writel(VIRTIO_PCI_GUEST_FEATURES, (u32)features); > +} > + > +void virtio_legacy_pci_device::set_enabled_feature_bit(int bit, bool on) > +{ > + set_virtio_config_bit(VIRTIO_PCI_GUEST_FEATURES, bit, on); > +} > + > +u64 virtio_legacy_pci_device::get_enabled_features() > +{ > + return virtio_conf_readl(VIRTIO_PCI_GUEST_FEATURES); > +} > + > +bool virtio_legacy_pci_device::get_enabled_feature_bit(int bit) > +{ > + return get_virtio_config_bit(VIRTIO_PCI_GUEST_FEATURES, bit); > +} > + > +u8 virtio_legacy_pci_device::get_status() > +{ > + return virtio_conf_readb(VIRTIO_PCI_STATUS); > +} > + > +void virtio_legacy_pci_device::set_status(u8 status) > +{ > + virtio_conf_writeb(VIRTIO_PCI_STATUS, status); > +} > + > +u8 virtio_legacy_pci_device::read_config(u32 offset) > +{ > + return _bar1->readb(offset); > +} > + > +void virtio_legacy_pci_device::write_config(u32 offset, u8 byte) > +{ > + _bar1->writeb(offset, byte); > +} > + > +void virtio_legacy_pci_device::dump_config() > +{ > + u8 B, D, F; > + _dev->get_bdf(B, D, F); > + > + _dev->dump_config(); > + virtio_d("%s [%x:%x.%x] vid:id=%x:%x", get_name().c_str(), > + (u16)B, (u16)D, (u16)F, > + _dev.get_vendor_id(), > + _dev.get_device_id()); > +} > + > +void virtio_legacy_pci_device::init() > +{ > + bool status = parse_pci_config(); > + assert(status); > + > + _dev->set_bus_master(true); > + > + _dev->msix_enable(); > +} > + > +void virtio_legacy_pci_device::register_interrupt(interrupt_factory > irq_factory) > +{ > + if (irq_factory.register_msi_bindings && _dev->is_msix()) { > + irq_factory.register_msi_bindings(_msi); > + } else { > + _irq.reset(irq_factory.create_pci_interrupt(*_dev)); > + } > +} > + > +void virtio_legacy_pci_device::register_interrupt(unsigned int queue, > std::function<void(void)> handler) > +{ > + // OSv's generic virtio driver has already set the device to msix, > and set > + // the VIRTIO_MSI_QUEUE_VECTOR of its queue to its number. > + assert(_dev->is_msix()); > + virtio_conf_writew(virtio::VIRTIO_PCI_QUEUE_SEL, queue); > + assert(virtio_conf_readw(virtio::VIRTIO_MSI_QUEUE_VECTOR) == queue); > + if (!_dev->is_msix_enabled()) { > + _dev->msix_enable(); > + } > + auto vectors = _msi.request_vectors(1); > + assert(vectors.size() == 1); > + auto vec = vectors[0]; > + // TODO: in _msi.easy_register() we also have code for moving the > + // interrupt's affinity to where the handling thread is. We should > + // probably do this here too. > + _msi.assign_isr(vec, handler); > + auto ok = _msi.setup_entry(queue, vec); > + assert(ok); > + vec->msix_unmask_entries(); > +} > + > +u8 virtio_legacy_pci_device::ack_irq() > +{ > + return virtio_conf_readb(VIRTIO_PCI_ISR); > +} > + > +bool virtio_legacy_pci_device::parse_pci_config() > +{ > + // Test whether bar1 is present > + _bar1 = _dev->get_bar(1); > + if (_bar1 == nullptr) { > + return false; > + } > + > + // Check ABI version > + u8 rev = _dev->get_revision_id(); > + if (rev != VIRTIO_PCI_ABI_VERSION) { > + virtio_e("Wrong virtio revision=%x", rev); > + return false; > + } > + > + // Check device ID > + u16 dev_id = _dev->get_device_id(); > + if ((dev_id < VIRTIO_PCI_ID_MIN) || (dev_id > VIRTIO_PCI_ID_MAX)) { > + virtio_e("Wrong virtio dev id %x", dev_id); > + return false; > + } > + > + return true; > +} > + > +virtio_device* create_virtio_device(pci::device *dev) { > + //TODO Read PCI device configuration to create instance > + // of virtio_modern_pci_device or virtio_legacy_pci_device > + return new virtio_legacy_pci_device(dev); > +} > + > +} > diff --git a/drivers/virtio-pci-device.hh b/drivers/virtio-pci-device.hh > new file mode 100644 > index 00000000..76827fc8 > --- /dev/null > +++ b/drivers/virtio-pci-device.hh > @@ -0,0 +1,150 @@ > +/* > + * Copyright (C) 2019 Waldemar Kozaczuk. > + * > + * This work is open source software, licensed under the terms of the > + * BSD license as described in the LICENSE file in the top-level > directory. > + */ > + > +#ifndef VIRTIO_PCI_DEVICE_HH > +#define VIRTIO_PCI_DEVICE_HH > + > +#include <osv/pci.hh> > +#include <osv/interrupt.hh> > +#include <osv/msi.hh> > + > +#include "virtio-device.hh" > + > +namespace virtio { > + > +enum VIRTIO_PCI_CONFIG { > + /* A 32-bit r/o bitmask of the features supported by the host */ > + VIRTIO_PCI_HOST_FEATURES = 0, > + /* A 32-bit r/o bitmask of the features supported by the host */ > + VIRTIO_PCI_HOST_FEATURES_HIGH = 1, > + /* A 32-bit r/w bitmask of features activated by the guest */ > + VIRTIO_PCI_GUEST_FEATURES = 4, > + /* A 32-bit r/w PFN for the currently selected queue */ > + VIRTIO_PCI_QUEUE_PFN = 8, > + /* A 16-bit r/o queue size for the currently selected queue */ > + VIRTIO_PCI_QUEUE_NUM = 12, > + /* A 16-bit r/w queue selector */ > + VIRTIO_PCI_QUEUE_SEL = 14, > + /* A 16-bit r/w queue notifier */ > + VIRTIO_PCI_QUEUE_NOTIFY = 16, > + /* An 8-bit device status register. */ > + VIRTIO_PCI_STATUS = 18, > + /* An 8-bit r/o interrupt status register. Reading the value will > return the > + * current contents of the ISR and will also clear it. This is > effectively > + * a read-and-acknowledge. */ > + VIRTIO_PCI_ISR = 19, > + /* The bit of the ISR which indicates a device configuration change. > */ > + VIRTIO_PCI_ISR_CONFIG = 0x2, > + /* MSI-X registers: only enabled if MSI-X is enabled. */ > + /* A 16-bit vector for configuration changes. */ > + VIRTIO_MSI_CONFIG_VECTOR = 20, > + /* A 16-bit vector for selected queue notifications. */ > + VIRTIO_MSI_QUEUE_VECTOR = 22, > + /* Vector value used to disable MSI for queue */ > + VIRTIO_MSI_NO_VECTOR = 0xffff, > + /* Virtio ABI version, this must match exactly */ > + VIRTIO_PCI_ABI_VERSION = 0, > + /* How many bits to shift physical queue address written to QUEUE_PFN. > + * 12 is historical, and due to x86 page size. */ > + VIRTIO_PCI_QUEUE_ADDR_SHIFT = 12, > + /* The alignment to use between consumer and producer parts of vring. > + * x86 pagesize again. */ > + VIRTIO_PCI_VRING_ALIGN = 4096, > + VIRTIO_PCI_ID_MIN = 0x1000, > + VIRTIO_PCI_ID_MAX = 0x103f, > +}; > + > +class virtio_pci_device : public virtio_device { > +public: > + explicit virtio_pci_device(pci::device *dev); > + ~virtio_pci_device(); > + > + virtual hw_device_id get_id() { return _dev->get_id(); } > + virtual void print() { _dev->print(); } > + virtual void reset() { _dev->reset(); } > + > + bool is_attached() { return _dev->is_attached(); } > + void set_attached() { _dev->set_attached(); } > + > + size_t get_vring_alignment() { return VIRTIO_PCI_VRING_ALIGN; } > + > +protected: > + pci::device *_dev; > + interrupt_manager _msi; > + std::unique_ptr<pci_interrupt> _irq; > +}; > + > +class virtio_legacy_pci_device : public virtio_pci_device { > +public: > + explicit virtio_legacy_pci_device(pci::device *dev); > + ~virtio_legacy_pci_device() {} > + > + virtual void init(); > + virtual void register_interrupt(interrupt_factory irq_factory); > + virtual void register_interrupt(unsigned int queue, > std::function<void(void)> handler); > + > + virtual void select_queue(int queue); > + virtual u16 get_queue_size(); > + virtual void setup_queue(int queue); > + virtual void activate_queue(vring *queue); > + virtual void activate_queue(u64 phys); > + virtual void kick_queue(int queue); > + > + virtual u64 get_available_features(); > + virtual bool get_available_feature_bit(int bit); > + > + virtual void set_enabled_features(u64 features); > + virtual void set_enabled_feature_bit(int bit, bool on); > + virtual u64 get_enabled_features(); > + virtual bool get_enabled_feature_bit(int bit); > + > + virtual u8 get_status(); > + virtual void set_status(u8 status); > + > + virtual u8 read_config(u32 offset); > + virtual void write_config(u32 offset, u8 byte); > + virtual void dump_config(); > + virtual u8 ack_irq(); > + > + // The remaining space is defined by each driver as the per-driver > + // configuration space > + virtual int config_offset() { return (_dev->is_msix_enabled())? 24 : > 20;} > + > + virtual bool is_modern() { return false; }; > +private: > + bool parse_pci_config(); > + > + // Access the virtio conf address space set by pci bar 1 > + bool get_virtio_config_bit(u32 offset, int bit); > + void set_virtio_config_bit(u32 offset, int bit, bool on); > + > + // Access virtio config space > + u8 virtio_conf_readb(u32 offset) { return _bar1->readb(offset);}; > + u16 virtio_conf_readw(u32 offset) { return _bar1->readw(offset);}; > + u32 virtio_conf_readl(u32 offset) { return _bar1->readl(offset);}; > + void virtio_conf_writeb(u32 offset, u8 val) { _bar1->writeb(offset, > val);}; > + void virtio_conf_writew(u32 offset, u16 val) { _bar1->writew(offset, > val);}; > + void virtio_conf_writel(u32 offset, u32 val) { _bar1->writel(offset, > val);}; > + > + pci::bar* _bar1; > +}; > + > +class virtio_modern_pci_device : public virtio_pci_device { > +public: > + explicit virtio_modern_pci_device(pci::device *dev); > + ~virtio_modern_pci_device(); > + > + virtual bool is_modern() { return true; }; > +}; > + > +// Creates instance of virtio_modern_pci_device or > virtio_legacy_pci_device > +// by reading configuration from PCI device > +virtio_device* create_virtio_device(pci::device *dev); > + > +} > + > +#endif //VIRTIO_PCI_DEVICE_HH > diff --git a/drivers/virtio-rng.cc b/drivers/virtio-rng.cc > index d506cb2a..1e13c3e4 100644 > --- a/drivers/virtio-rng.cc > +++ b/drivers/virtio-rng.cc > @@ -36,11 +36,19 @@ static int virtio_rng_read(void *buf, int size) > } > > namespace virtio { > -rng::rng(pci::device& pci_dev) > - : virtio_driver(pci_dev) > - , _irq(pci_dev, [&] { return ack_irq(); }, [&] { handle_irq(); }) > +rng::rng(virtio_device& dev) > + : virtio_driver(dev) > , _thread(sched::thread::make([&] { worker(); }, > sched::thread::attr().name("virtio-rng"))) > { > + interrupt_factory int_factory; > + int_factory.create_pci_interrupt = [this](pci::device &pci_dev) { > + return new pci_interrupt( > + pci_dev, > + [=] { return this->ack_irq(); }, > + [=] { this->handle_irq(); }); > + }; > + _dev.register_interrupt(int_factory); > + > _queue = get_virt_queue(0); > > add_dev_status(VIRTIO_CONFIG_S_DRIVER_OK); > @@ -79,7 +87,7 @@ void rng::handle_irq() > > bool rng::ack_irq() > { > - return virtio_conf_readb(VIRTIO_PCI_ISR); > + return _dev.ack_irq(); > } > > void rng::worker() > diff --git a/drivers/virtio-rng.hh b/drivers/virtio-rng.hh > index 1790665b..e704fea4 100644 > --- a/drivers/virtio-rng.hh > +++ b/drivers/virtio-rng.hh > @@ -26,7 +26,7 @@ public: > VIRTIO_RNG_DEVICE_ID = 0x1005, > }; > > - explicit rng(pci::device& dev); > + explicit rng(virtio_device& dev); > virtual ~rng(); > > virtual std::string get_name() const { return "virtio-rng"; } > @@ -44,7 +44,6 @@ private: > > static const size_t _pool_size = 64; > std::vector<char> _entropy; > - pci_interrupt _irq; > std::unique_ptr<sched::thread> _thread; > condvar _producer; > condvar _consumer; > diff --git a/drivers/virtio-scsi.cc b/drivers/virtio-scsi.cc > index f3f94bd8..b053ae7e 100644 > --- a/drivers/virtio-scsi.cc > +++ b/drivers/virtio-scsi.cc > @@ -12,7 +12,6 @@ > #include <osv/mempool.hh> > #include <osv/sched.hh> > #include <osv/interrupt.hh> > -#include "drivers/pci-device.hh" > #include "drivers/virtio.hh" > #include "drivers/virtio-scsi.hh" > #include "drivers/scsi-common.hh" > @@ -129,7 +128,7 @@ void scsi::add_lun(u16 target, u16 lun) > > bool scsi::ack_irq() > { > - auto isr = virtio_conf_readb(VIRTIO_PCI_ISR); > + auto isr = _dev.ack_irq(); > auto queue = get_virt_queue(VIRTIO_SCSI_QUEUE_REQ); > > if (isr) { > @@ -141,7 +140,7 @@ bool scsi::ack_irq() > > } > > -scsi::scsi(pci::device& dev) > +scsi::scsi(virtio_device& dev) > : virtio_driver(dev) > { > > @@ -157,17 +156,22 @@ scsi::scsi(pci::device& dev) > t->start(); > auto queue = get_virt_queue(VIRTIO_SCSI_QUEUE_REQ); > > - if (dev.is_msix()) { > - _msi.easy_register({ > - { VIRTIO_SCSI_QUEUE_CTRL, nullptr, nullptr }, > - { VIRTIO_SCSI_QUEUE_EVT, nullptr, nullptr }, > - { VIRTIO_SCSI_QUEUE_REQ, [=] { > queue->disable_interrupts(); }, t }, > + interrupt_factory int_factory; > + int_factory.register_msi_bindings = [queue,t](interrupt_manager &msi) > { > + msi.easy_register({ > + { VIRTIO_SCSI_QUEUE_CTRL, nullptr, nullptr }, > + { VIRTIO_SCSI_QUEUE_EVT, nullptr, nullptr }, > + { VIRTIO_SCSI_QUEUE_REQ, [=] { queue->disable_interrupts(); }, > t }, > }); > - } else { > - _irq.reset(new pci_interrupt(dev, > - [=] { return this->ack_irq(); }, > - [=] { t->wake(); })); > - } > + }; > + > + int_factory.create_pci_interrupt = [this,t](pci::device &pci_dev) { > + return new pci_interrupt( > + pci_dev, > + [=] { return this->ack_irq(); }, > + [=] { t->wake(); }); > + }; > + _dev.register_interrupt(int_factory); > > // Enable indirect descriptor > queue->set_use_indirect(true); > @@ -184,7 +188,7 @@ scsi::~scsi() > > void scsi::read_config() > { > - virtio_conf_read(virtio_pci_config_offset(), &_config, > sizeof(_config)); > + virtio_conf_read(_dev.config_offset(), &_config, sizeof(_config)); > config.max_lun = _config.max_lun; > config.max_target = _config.max_target; > } > diff --git a/drivers/virtio-scsi.hh b/drivers/virtio-scsi.hh > index 205ccb40..30c3d9e0 100644 > --- a/drivers/virtio-scsi.hh > +++ b/drivers/virtio-scsi.hh > @@ -8,7 +8,6 @@ > #ifndef VIRTIO_SCSI_DRIVER_H > #define VIRTIO_SCSI_DRIVER_H > #include "drivers/virtio.hh" > -#include "drivers/pci-device.hh" > #include "drivers/scsi-common.hh" > #include <osv/bio.h> > #include <osv/types.h> > @@ -145,7 +144,7 @@ public: > }; > > > - scsi(pci::device& dev); > + scsi(virtio_device& dev); > ~scsi(); > > virtual std::string get_name() const { return _driver_name; } > @@ -174,8 +173,6 @@ private: > std::string _driver_name; > scsi_config _config; > > - std::unique_ptr<pci_interrupt> _irq; > - > //maintains the virtio instance number for multiple drives > static int _instance; > int _id; > diff --git a/drivers/virtio-vring.cc b/drivers/virtio-vring.cc > index 5f89fb5d..3f0e7e25 100644 > --- a/drivers/virtio-vring.cc > +++ b/drivers/virtio-vring.cc > @@ -38,12 +38,13 @@ TRACEPOINT(trace_vring_get_buf_ret, "vring=%p > _avail_count %d", void*, int); > > namespace virtio { > > - vring::vring(virtio_driver* const dev, u16 num, u16 q_index) > + vring::vring(virtio_driver* const driver, u16 num, u16 q_index) > { > - _dev = dev; > + _driver = driver; > _q_index = q_index; > // Alloc enough pages for the vring... > - unsigned sz = VIRTIO_ALIGN(vring::get_size(num, > VIRTIO_PCI_VRING_ALIGN)); > + size_t alignment = driver->get_vring_alignment(); > + size_t sz = VIRTIO_ALIGN(vring::get_size(num, alignment), > alignment); > _vring_ptr = memory::alloc_phys_contiguous_aligned(sz, 4096); > memset(_vring_ptr, 0, sz); > > @@ -53,7 +54,7 @@ namespace virtio { > _desc = (vring_desc*)_vring_ptr; > _avail = (vring_avail*)(_vring_ptr + num * sizeof(vring_desc)); > _used = (vring_used*)(((unsigned long)&_avail->_ring[num] + > - sizeof(u16) + VIRTIO_PCI_VRING_ALIGN - 1) & > ~(VIRTIO_PCI_VRING_ALIGN - 1)); > + sizeof(u16) + alignment - 1) & ~(alignment - 1)); > > // initialize the next pointer within the available ring > for (int i = 0; i < num; i++) _desc[i]._next = i + 1; > @@ -102,7 +103,7 @@ namespace virtio { > inline bool vring::use_indirect(int desc_needed) > { > return _use_indirect && > - _dev->get_indirect_buf_cap() && > + _driver->get_indirect_buf_cap() && > // don't let the posting fail due to low available buffers > number > (desc_needed > _avail_count || > // no need to use indirect for a single descriptor > @@ -280,7 +281,7 @@ namespace virtio { > > ... -- You received this message because you are subscribed to the Google Groups "OSv Development" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. For more options, visit https://groups.google.com/d/optout.
