Re: [Qemu-devel] [Bug 1384892] Re: RTL8168 NIC VFIO not working anymore since QEMU 2.1
Hi *, it seems we could finally fix this bug: https://bugs.launchpad.net/qemu/+bug/1384892 with the following patches: https://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg07260.html Regards, Thorsten Am 22.06.2016 um 13:10 schrieb T. Huth: > Alex' patch has been included here: > http://git.qemu.org/?p=qemu.git;a=commitdiff;h=69970fcef937bddd7f745 > ... so I assume this ticket could be closed now? > -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1384892 Title: RTL8168 NIC VFIO not working anymore since QEMU 2.1 Status in QEMU: New Bug description: After upgrading QEMU from 2.0 to 2.1 (and libiscsi from 1.7.0 to 1.12 as a dependency) my two RTL8168 NICs stopped working. The NICs do not respond to any command and even the LEDs on the network connection turn off, a few seconds after the VM started. To get them back running I had to downgrade to 2.0 and restart the system. Unfortunately, I have no clue what to do or how to debug this problem since there are no specific errors logged. I tried two different VMs: Debian Wheezy and IPFire (see attachment for further details). The QEMU 2.1 changelog states "Support for RTL8168 NICs." so there were some major changes done, I guess. On the IPFire guest the kernel log shows many of these lines: r8169 :00:07.0 green1: rtl_eriar_cond == 1 (loop: 100, delay: 100) r8169 :00:07.0 green1: rtl_phy_reset_cond == 1 (loop: 100, delay: 1) On the Debian guest there is only: r8169 :00:07.0: firmware: agent loaded rtl_nic/rtl8168e-3.fw into memory r8169 :00:07.0: lan0: link down ADDRCONF(NETDEV_UP): lan0: link is not ready The commandline for IPFire can be seen in the attachment. It is the same for Debian. There are also the complete kernel logs for the working (2.0) and non-working (2.1) cases. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1384892/+subscriptions
Re: [Qemu-devel] [PATCH v2 0/2] memory: Convert skip_dump to ram_device and avoid memcpy
I tested these https://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg05900.html https://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg05901.html https://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg05902.html together with [Qemu-devel] [PULL 18/19] vfio/pci: Fix vfio_rtl8168_quirk_data_read add https://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg03911.html on top of qemu-stable-2.7.0 successfully on IPFire (Linux) and Windows 10 x64 with the Realtek driver. I suggest all 3 commits (msg05901, msg05902, and msg03911) should become serious candidates for qemu-2.7.1. Regards, Thorsten Am 25.10.2016 um 20:17 schrieb Alex Williamson: v2: retain ram_device flag to avoid extra cache miss, per Paolo. Paolo, posting for completeness, I can merge through my tree if you want to Ack. Thanks, Alex --- Alex Williamson (2): memory: Replace skip_dump flag with "ram_device" memory: Don't use memcpy for ram_device regions hw/vfio/common.c |9 ++ hw/vfio/spapr.c |2 + include/exec/memory.h | 47 - memory.c | 80 +++-- memory_mapping.c |2 + trace-events |2 + 6 files changed, 116 insertions(+), 26 deletions(-)
Re: [Qemu-devel] [RFC PATCH] memory: Don't use memcpy for ram marked as skip_dump
Am 22.10.2016 um 17:09 schrieb Alex Williamson: On Sat, 22 Oct 2016 11:10:59 +0200 Thorsten Kohfeldt <thorsten.kohfe...@gmx.de> wrote: Hi *, this came to my mind when browsing the sources in the patch's vicinity. It is just a collection of thoughts, so please don't feel offended about how I phrased certain statements. Questions Is mr->opaque always unused ? i.e. should we assert NULL before assignment ? Really I think it's probably unnecessary even to check the mr->ops pointer. I don't see a path where we can have both mr->ram true and either mr->ops or mr->opaque set to anything other than defaults. However, mr->opaque is consumed by mr->ops, so if mr->ops is set to _mem_ops, then we know opaque is unused. mr->ops vs. mr->iommu_ops i.e. can we set mr->opaque if mr->iommu_ops is not NULL ? or should we even assert mr->iommu_ops NULL, because a skip_dump mr is not supposed to be addr-translated again ? Show me where a MemoryRegion with iommu_ops can be mr->ram. I probably can't. I was merely expressing a gut feeling. There is a _shared_ 'io_mem_unassigned' mr. Are we in danger to modify it ? Would that hurt ? No. Are we generally switching mrops "back and forth", or is this a first ? mr->ram is expected to have mr->ops set to the default unassigned handler via memory_region_initfn(). All MemoryRegions start this way and have their ops reassigned for certain initialization types. Can we afford not to implement size 8 or should we rather force 8 -> 2*4 by setting specific mrop flags if possible ? Or just hard code case 8: handle longword[1]; fallthru 4: The memory API code will automatically split a qword into multiple dword accesses. I thought that would only happen if any of the mrop constraint flags were set ? Like min size = 1, max_size = 4 ? Are those default or propagated to the new op context from somewhere else (i.e. dealt with before the new mrop callbacks are invoked) ? When/where is memory_region_set_skip_dump() (supposed to be) called ? Use the source, it's called by vfio code after initializing the MemoryRegion to indicate that memory dumps should skip this region as it's backed by a physical device. I should have phrased that completely differently. I was wondering whether any future user of the skip_dump mechanism would be aware enough of the new implications. But I see Paolo has suggested a renaming, skip_dump to device_memory. That looks good enough to me for raising that awareness. Recommendations Add comment in skip_dump_mem_read/write NOT to support 64b, because an error will not be recognised unless specific HW is present (maybe even give examples of specific HW combinations). It's not clear to me that this is the correct long term behavior. RTL does not support qword access, but other devices might. The expectation would be that a guest driver does not use accesses beyond the capabilities of the device. It's convenient that limiting to dword accesses fixes xp memory inspection in the monitor, but that's not a sufficient reason not to implement qword should we want it for other devices. Add comments at more code locations that are break-subpage/mmap-sensible. For example default vfio slow path mrops should also not support 64b ? Nope, same. Add a trace message for each mrop. Yes, this is on my todo list post-RFC. Additional patch suggestion(s) During former investigations I found it not easy to identify runtime active/current mrops per mr, so: Add .name to mr->ops/iommu_ops to be able to mon-list them together with mr names OR (this questions flag reuse/overlay) skip_dump_flag should rather get a brother so (unamed) ops can be easily concluded for listing ? But is this the only mr<->mrop ambiguosity ? This is beyond the scope of this patch, you're welcome to pursue. And would I find people willing to review ? See, I did not have that much resonance for my recent patch initiative 'hmp: Improve 'info mtree' with optional parm for mapinfo'. Most importantly, you haven't indicated whether this patch resolves the issues you've been having. Thanks, I planned on testing the reviewed/revised patch, but anyway, I will only find time for testing later this week. Alex Regards, Thorsten
Re: [Qemu-devel] [RFC PATCH] memory: Don't use memcpy for ram marked as skip_dump
Hi *, this came to my mind when browsing the sources in the patch's vicinity. It is just a collection of thoughts, so please don't feel offended about how I phrased certain statements. Questions Is mr->opaque always unused ? i.e. should we assert NULL before assignment ? mr->ops vs. mr->iommu_ops i.e. can we set mr->opaque if mr->iommu_ops is not NULL ? or should we even assert mr->iommu_ops NULL, because a skip_dump mr is not supposed to be addr-translated again ? There is a _shared_ 'io_mem_unassigned' mr. Are we in danger to modify it ? Would that hurt ? Are we generally switching mrops "back and forth", or is this a first ? Can we afford not to implement size 8 or should we rather force 8 -> 2*4 by setting specific mrop flags if possible ? Or just hard code case 8: handle longword[1]; fallthru 4: When/where is memory_region_set_skip_dump() (supposed to be) called ? Recommendations Add comment in skip_dump_mem_read/write NOT to support 64b, because an error will not be recognised unless specific HW is present (maybe even give examples of specific HW combinations). Add comments at more code locations that are break-subpage/mmap-sensible. For example default vfio slow path mrops should also not support 64b ? Add a trace message for each mrop. Additional patch suggestion(s) During former investigations I found it not easy to identify runtime active/current mrops per mr, so: Add .name to mr->ops/iommu_ops to be able to mon-list them together with mr names OR (this questions flag reuse/overlay) skip_dump_flag should rather get a brother so (unamed) ops can be easily concluded for listing ? But is this the only mr<->mrop ambiguosity ? Regards, Thorsten Am 21.10.2016 um 19:11 schrieb Alex Williamson: With a vfio assigned device we lay down a base MemoryRegion registered as an IO region, giving us read & write accessors. If the region supports mmap, we lay down a higher priority sub-region MemoryRegion on top of the base layer initialized as a RAM pointer to the mmap. Finally, if we have any quirks for the device (ie. address ranges that need additional virtualization support), we put another IO sub-region on top of the mmap MemoryRegion. When this is flattened, we now potentially have sub-page mmap MemoryRegions exposed which cannot be directly mapped through KVM. This is as expected, but a subtle detail of this is that we end up with two different access mechanisms through QEMU. If we disable the mmap MemoryRegion, we make use of the IO MemoryRegion and service accesses using pread and pwrite to the vfio device file descriptor. If the mmap MemoryRegion is enabled and we end up in one of these sub-page gaps, QEMU handles the access as RAM, using memcpy to the mmap. Using the mmap through QEMU is a subtle difference, but it's fine, the problem is the memcpy. My assumption is that memcpy makes no guarantees about access width and potentially uses all sorts of optimized memory transfers that are not intended for talking to device MMIO. It turns out that this has been a problem for Realtek NIC assignment, which has such a quirk that creates a sub-page mmap MemoryRegion access. My proposal to fix this is to leverage the skip_dump flag that we already use for special handling of these device-backed MMIO ranges. When skip_dump is set for a MemoryRegion, we mark memory access as non-direct and automatically insert MemoryRegionOps with basic semantics to handle accesses. Note that we only enable dword accesses because some devices don't particularly like qword accesses (Realtek NICs are such a device). This actually also fixes memory inspection via the xp command in the QEMU monitor as well. Please comment. Is this the best way to solve this problem? Thanks Reported-by: Thorsten Kohfeldt <thorsten.kohfe...@gmx.de> Signed-off-by: Alex Williamson <alex.william...@redhat.com> --- include/exec/memory.h |6 -- memory.c | 44 2 files changed, 48 insertions(+), 2 deletions(-) diff --git a/include/exec/memory.h b/include/exec/memory.h index 10d7eac..a4c3acf 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -1464,9 +1464,11 @@ void *qemu_map_ram_ptr(RAMBlock *ram_block, ram_addr_t addr); static inline bool memory_access_is_direct(MemoryRegion *mr, bool is_write) { if (is_write) { -return memory_region_is_ram(mr) && !mr->readonly; +return memory_region_is_ram(mr) && + !mr->readonly && !memory_region_is_skip_dump(mr); } else { -return memory_region_is_ram(mr) || memory_region_is_romd(mr); +return (memory_region_is_ram(mr) && !memory_region_is_skip_dump(mr)) || + memory_region_is_romd(mr); } } diff --git a/memory.c b/memory.c index 58f9269..7ed7ca9 100644 --- a/memory.c +++ b/memory.c @@ -1136,6 +1136,46 @@ const MemoryR
Re: [Qemu-devel] [PATCH] vfio: Fix vfio_rtl8168_quirk_data_read address offset
Am 10.10.2016 um 17:18 schrieb Alex Williamson: On Sun, 9 Oct 2016 19:56:03 +0200 Thorsten Kohfeldt <thorsten.kohfe...@gmx.de> wrote: From: Thorsten Kohfeldt <mailto__qemu-de...@nongnu.org> Date: Sat, 24 Sep 2016 20:43:20 +0200 Subject: [PATCH] vfio: Fix vfio_rtl8168_quirk_data_read address offset Introductory comment for rtl8168 VFIO MSI-X quirk states: At BAR2 offset 0x70 there is a dword data register, offset 0x74 is a dword address register. vfio: vfio_bar_read(:05:00.0:BAR2+0x70, 4) = 0xfee00398 // read data Thus, correct offset for data read is 0x70, but function vfio_rtl8168_quirk_data_read() wrongfully uses offset 0x74. Signed-off-by: Thorsten Kohfeldt <mailto__qemu-de...@nongnu.org> I need a real email address for these, can I replace this with your gmx.de email? Thanks, Alex Yes, thank you for taking over from here. Regards, Thorsten --- hw/vfio/pci-quirks.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c index bec694c..1e97bc4 100644 --- a/hw/vfio/pci-quirks.c +++ b/hw/vfio/pci-quirks.c @@ -898,7 +898,7 @@ static uint64_t vfio_rtl8168_quirk_data_read(void *opaque, { VFIOrtl8168Quirk *rtl = opaque; VFIOPCIDevice *vdev = rtl->vdev; -uint64_t data = vfio_region_read(>bars[2].region, addr + 0x74, size); +uint64_t data = vfio_region_read(>bars[2].region, addr + 0x70, size); if (rtl->enabled && (vdev->pdev.cap_present & QEMU_PCI_CAP_MSIX)) { hwaddr offset = rtl->addr & 0xfff;
[Qemu-devel] [PATCH] vfio: Fix vfio_rtl8168_quirk_data_read address offset
From: Thorsten Kohfeldt <mailto__qemu-de...@nongnu.org> Date: Sat, 24 Sep 2016 20:43:20 +0200 Subject: [PATCH] vfio: Fix vfio_rtl8168_quirk_data_read address offset Introductory comment for rtl8168 VFIO MSI-X quirk states: At BAR2 offset 0x70 there is a dword data register, offset 0x74 is a dword address register. vfio: vfio_bar_read(:05:00.0:BAR2+0x70, 4) = 0xfee00398 // read data Thus, correct offset for data read is 0x70, but function vfio_rtl8168_quirk_data_read() wrongfully uses offset 0x74. Signed-off-by: Thorsten Kohfeldt <mailto__qemu-de...@nongnu.org> --- hw/vfio/pci-quirks.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c index bec694c..1e97bc4 100644 --- a/hw/vfio/pci-quirks.c +++ b/hw/vfio/pci-quirks.c @@ -898,7 +898,7 @@ static uint64_t vfio_rtl8168_quirk_data_read(void *opaque, { VFIOrtl8168Quirk *rtl = opaque; VFIOPCIDevice *vdev = rtl->vdev; -uint64_t data = vfio_region_read(>bars[2].region, addr + 0x74, size); +uint64_t data = vfio_region_read(>bars[2].region, addr + 0x70, size); if (rtl->enabled && (vdev->pdev.cap_present & QEMU_PCI_CAP_MSIX)) { hwaddr offset = rtl->addr & 0xfff; -- 2.5.5
Re: [Qemu-devel] [PATCH v2] vfio: Add support for mmapping sub-page MMIO BARs
(Re-post, as my mail client somehow made my previous post attach to the wrong thread. I do not mean to spam y'all, but maybe my previous mail got lost in your filters ... ... as I have not yet seen any answer to my questions/remarks. ) > On 2016/9/14 6:55, Alex Williamson wrote: > > [cc +Paolo] > >> On Thu, 11 Aug 2016 19:05:57 +0800 >> Yongji Xie wrote: >> >> Now the kernel commit 05f0c03fbac1 ("vfio-pci: Allow to mmap >> sub-page MMIO BARs if the mmio page is exclusive") allows VFIO >> to mmap sub-page BARs. This is the corresponding QEMU patch. Immediate questions first: It seems that mentioned commit will be part of Kernel 4.8 ? But as far as I can judge this change should also cooperate with older/existing kernels (which then just have qemu behave as before) ? (For my actual point of interrest related to this patch please see further down.) >> With those patches applied, we could passthrough sub-page BARs >> to guest, which can help to improve IO performance for some devices. >> >> In this patch, we expand MemoryRegions of these sub-page >> MMIO BARs to PAGE_SIZE in vfio_pci_write_config(), so that >> the BARs could be passed to KVM ioctl KVM_SET_USER_MEMORY_REGION >> with a valid size. The expanding size will be recovered when >> the base address of sub-page BAR is changed and not page aligned >> any more in guest. And we also set the priority of these BARs' >> memory regions to zero in case of overlap with BARs which share >> the same page with sub-page BARs in guest. >> >> Signed-off-by: Yongji Xie >> --- >> hw/vfio/common.c |3 +-- >> hw/vfio/pci.c| 76 ++ >> 2 files changed, 77 insertions(+), 2 deletions(-) > > Hi Yongji, ... >> +mr = region->mem; >> +mmap_mr = >mmaps[0].mem; >> +memory_region_transaction_begin(); >> +if (memory_region_size(mr) < qemu_real_host_page_size) { >> +if (!(bar_addr & ~qemu_real_host_page_mask) && >> +memory_region_is_mapped(mr) && region->mmaps[0].mmap) { >> +/* Expand memory region to page size and set priority */ >> +memory_region_del_subregion(r->address_space, mr); >> +memory_region_set_size(mr, qemu_real_host_page_size); >> +memory_region_set_size(mmap_mr, qemu_real_host_page_size); >> +memory_region_add_subregion_overlap(r->address_space, >> +bar_addr, mr, 0); >> + } >> +} else { >> +/* This case would happen when guest rescan one PCI device */ >> +if (bar_addr & ~qemu_real_host_page_mask) { >> +/* Recover the size of memory region */ >> +memory_region_set_size(mr, r->size); >> +memory_region_set_size(mmap_mr, r->size); >> +} else if (memory_region_is_mapped(mr)) { >> +/* Set the priority of memory region to zero */ >> +memory_region_del_subregion(r->address_space, mr); >> +memory_region_add_subregion_overlap(r->address_space, >> +bar_addr, mr, 0); >> +} >> +} >> +memory_region_transaction_commit(); > > Paolo, as the reigning memory API expert, do you see any issues with > this? Expanding the size of a container MemoryRegion and the contained > mmap'd subregion and manipulating priorities so that we don't interfere > with other MemoryRegions. Since the following qemu commit function memory_region_add_subregion_overlap() actually has a misleading name: http://git.qemu.org/?p=qemu.git;a=blobdiff;f=memory.c;h=ac5236b51587ee397edd177502fc20ce159f2235;hp=9daac5ea2d9a9c83533880a812760683f6e09765;hb=b61359781958759317ee6fd1a45b59be0b7dbbe1;hpb=ab0a99560857302b60053c245d1231acbd976cd4 The sole thing that memory_region_add_subregion_overlap() now actually does differently from memory_region_add_subregion() is nothing else than setting the region's priority to a value of callers choice. The _default_ priority as chosen by memory_region_add_subregion() _is_ 0. So, explicitly choosing memory_region_add_subregion_overlap(... , 0) does nothing new. Or does it: Actually, _yes_, because I see Alex actually willing to discuss choice of memory region priorities related to VFIO and mmap. Why do I "invade" this thread ? I would like you to consider thinking twice about selecting proper priorities for _any_ mmap related region (i.e. also the aligned case), and here is why: (I will also make a statement related to region expansion for alignment.) First of all, I recently suggested a patch which can visualise what I write about subsequently: https://lists.gnu.org/archive/html/qemu-devel/2016-09/msg01315.html (I would appreciate if somebody would review and thus get it merged.) As a general remark, the sub-page mmap case does not only occur when a 'small' BAR is encountered, it also occurs when a fully mmap-ed page is split by a 'small' VFIO quirk. Hi Alex, here we go
Re: [Qemu-devel] [PATCH v2] vfio: Add support for mmapping sub-page MMIO BARs
> On 2016/9/14 6:55, Alex Williamson wrote: > > [cc +Paolo] > >> On Thu, 11 Aug 2016 19:05:57 +0800 >> Yongji Xie wrote: >> >> Now the kernel commit 05f0c03fbac1 ("vfio-pci: Allow to mmap >> sub-page MMIO BARs if the mmio page is exclusive") allows VFIO >> to mmap sub-page BARs. This is the corresponding QEMU patch. Immediate questions first: It seems that mentioned commit will be part of Kernel 4.8 ? But as far as I can judge this change should also cooperate with older/existing kernels (which then just have qemu behave as before) ? (For my actual point of interrest related to this patch please see further down.) >> With those patches applied, we could passthrough sub-page BARs >> to guest, which can help to improve IO performance for some devices. >> >> In this patch, we expand MemoryRegions of these sub-page >> MMIO BARs to PAGE_SIZE in vfio_pci_write_config(), so that >> the BARs could be passed to KVM ioctl KVM_SET_USER_MEMORY_REGION >> with a valid size. The expanding size will be recovered when >> the base address of sub-page BAR is changed and not page aligned >> any more in guest. And we also set the priority of these BARs' >> memory regions to zero in case of overlap with BARs which share >> the same page with sub-page BARs in guest. >> >> Signed-off-by: Yongji Xie >> --- >> hw/vfio/common.c |3 +-- >> hw/vfio/pci.c| 76 ++ >> 2 files changed, 77 insertions(+), 2 deletions(-) > > Hi Yongji, ... >> +mr = region->mem; >> +mmap_mr = >mmaps[0].mem; >> +memory_region_transaction_begin(); >> +if (memory_region_size(mr) < qemu_real_host_page_size) { >> +if (!(bar_addr & ~qemu_real_host_page_mask) && >> +memory_region_is_mapped(mr) && region->mmaps[0].mmap) { >> +/* Expand memory region to page size and set priority */ >> +memory_region_del_subregion(r->address_space, mr); >> +memory_region_set_size(mr, qemu_real_host_page_size); >> +memory_region_set_size(mmap_mr, qemu_real_host_page_size); >> +memory_region_add_subregion_overlap(r->address_space, >> +bar_addr, mr, 0); >> + } >> +} else { >> +/* This case would happen when guest rescan one PCI device */ >> +if (bar_addr & ~qemu_real_host_page_mask) { >> +/* Recover the size of memory region */ >> +memory_region_set_size(mr, r->size); >> +memory_region_set_size(mmap_mr, r->size); >> +} else if (memory_region_is_mapped(mr)) { >> +/* Set the priority of memory region to zero */ >> +memory_region_del_subregion(r->address_space, mr); >> +memory_region_add_subregion_overlap(r->address_space, >> +bar_addr, mr, 0); >> +} >> +} >> +memory_region_transaction_commit(); > > Paolo, as the reigning memory API expert, do you see any issues with > this? Expanding the size of a container MemoryRegion and the contained > mmap'd subregion and manipulating priorities so that we don't interfere > with other MemoryRegions. Since the following qemu commit function memory_region_add_subregion_overlap() actually has a misleading name: http://git.qemu.org/?p=qemu.git;a=blobdiff;f=memory.c;h=ac5236b51587ee397edd177502fc20ce159f2235;hp=9daac5ea2d9a9c83533880a812760683f6e09765;hb=b61359781958759317ee6fd1a45b59be0b7dbbe1;hpb=ab0a99560857302b60053c245d1231acbd976cd4 The sole thing that memory_region_add_subregion_overlap() now actually does differently from memory_region_add_subregion() is nothing else than setting the region's priority to a value of callers choice. The _default_ priority as chosen by memory_region_add_subregion() _is_ 0. So, explicitly choosing memory_region_add_subregion_overlap(... , 0) does nothing new. Or does it: Actually, _yes_, because I see Alex actually willing to discuss choice of memory region priorities related to VFIO and mmap. Why do I "invade" this thread ? I would like you to consider thinking twice about selecting proper priorities for _any_ mmap related region (i.e. also the aligned case), and here is why: (I will also make a statement related to region expansion for alignment.) First of all, I recently suggested a patch which can visualise what I write about subsequently: https://lists.gnu.org/archive/html/qemu-devel/2016-09/msg01315.html (I would appreciate if somebody would review and thus get it merged.) As a general remark, the sub-page mmap case does not only occur when a 'small' BAR is encountered, it also occurs when a fully mmap-ed page is split by a 'small' VFIO quirk. Hi Alex, here we go again about RTL8168 and its MSIX quirk. (Subsequently I also relate to/conclude for Yongji's patch.) Mentioned quirk cuts for certain RTL8168 models a full-page BAR right into 3 pieces, 0..qirkaddr-1, quirk and quirk+qsize..pagesize-1. What
Re: [Qemu-devel] [PATCH] hmp: Improve 'info mtree' with optional parm for mapinfo
Am 20.09.2016 um 20:46 schrieb Laszlo Ersek: On 09/20/16 20:23, Thorsten Kohfeldt wrote: Am 20.09.2016 um 02:51 schrieb Laszlo Ersek: I think it would make sense to work from the pre-rendered FlatView, if the information you can get out of it covers your needs. I assume you know by now that I do _not_ feel that it covers my needs. Okay. I guess you've made your point convincingly enough (although I'm just an interested bystander in this thread). I think the patch has a very low chance for regressions and it shouldn't draw in a big maintenance burden, so it could be considered "pure improvement" as-is. I'm not volunteering to do a full review. One thing I notice though is that the constification of memory_region_size()'s sole parameter could be / should be moved into a separate, "prep" patch. I am really not having the attitude of unwillingness to listen to suggestions. But this was my train of thoughts towards that split up (before I posted): 1) This (split up) moves the patch suite size from 1 file to 3 files. 2) It now requires an aditional cover letter, a trivial patch and the actual patch. 3) If the trivial patch was applied but not the actual patch, I'd have to maintain 2 variants of the actual patch to fit current and also older trees (with and without constant local mr pointer variable). And this comes on top since 2b9e35760a8ff5548f6789d048c6e6e3c22c70ee 4) Well, that would be even more, as I have to maintain 2 variants already anyway: V2.6/V2.7 and master. Having that stated, of course I am willing to follow your suggestion anyway to get this through if it is insisted upon. (I propose you wait for further feedback before reposting just with this change.) I'm gladly looking forward to decisive guidance.
Re: [Qemu-devel] [PATCH] hmp: Improve 'info mtree' with optional parm for mapinfo
Am 20.09.2016 um 09:56 schrieb Paolo Bonzini: ... dumping the flat-view would give you something much simpler: -0009 (RW): pc.ram 000a-000b (RW): vga-lowmem 000c-000f (R-): pc.ram @ 000c 0010-07ff (RW): pc.ram @ 0010 fd00-fdff (RW): vga.vram febc-febd (RW): e1000-mmio febf-febf0fff (RW): vga.mmio febf0400-febf041f (RW): vga ioports remapped febf0500-febf0515 (RW): bochs dispi interface febf0600-febf0607 (RW): qemu extended regs fec0-fec00fff (RW): ioapic fed0-fed003ff (RW): hpet fee0-feef (RW): apic-msi fffc- (R-): pc.bios How did you create this output ? I did not find any hint in monitor help or the qemu source tree. (either too many matches or no matches at all for my searches.) Is this what you envision to receive from this patch initiative ? This is less information of course, but it may good idea depending on _why_ you're doing that. Paolo I explained more of my motivation in a reply to Laszlo's mail, i.e. why I am really keen on exactly the kind of output I suggest with the patch. But as a compromise, I would be willing to add the flat view map as well. Still though, I see that as an additional patch on top, coming later, as this would need more discussion about how to exacly format the output. I'd really like to have that mtree mapinfo discussion tool merged _soon_.
Re: [Qemu-devel] [PATCH] hmp: Improve 'info mtree' with optional parm for mapinfo
Am 20.09.2016 um 02:51 schrieb Laszlo Ersek: On 09/20/16 02:16, Thorsten Kohfeldt wrote: Am 15.09.2016 um 11:52 schrieb Paolo Bonzini: On 07/09/2016 02:48, Thorsten Kohfeldt wrote: From: Thorsten Kohfeldt <mailto__qemu-de...@nongnu.org> Date: Wed, 31 Aug 2016 22:43:14 +0200 Subject: [PATCH] hmp: Improve 'info mtree' with optional parm for mapinfo Motivation When 'tuning' 'quirks' for VFIO imported devices, it is not easy to directly grasp the implications of the priorisation algorithms in place for the 'layered mapping' of memory regions. Even though there are rules (documented in docs/memory.txt), once in a while one might question the correctness of the actual implementation of the rules. Particularly, I believe I have uncovered a divergence of (sub-)region priorisation/order/visibility in a corner case of importing a device (which requires a 'quirk') with mmap enabled vs. mmap disabled. This modification provides a means of visualising the ACTUAL mapping/visibility/occlusion of subregions within regions, whereas the current info mtree command only lists the tree of regions (all, visible and invisible ones). It is primarily intended to provide support for easy presentation of my cause, but I strongly believe this modification also has general purpose advantages. It is a useful addition, but I think a simpler presentation is also fine. What about "info mtree -f" which would visit the FlatView instead of the tree? The patch would probably be much shorter. For quite some time I had the patch in use as a direct modification of 'info mtree', but I felt that a general purpose use must provide an ad hoc user selectable presentation width parameter for the map info. I personally use a width of 65 or even 129 characters PREFIXING the tree elements which the command currently responds. My guess is though that most users would want a width of only 9 or 17. So I believe that a numerical parameter is a must. Visit the flat view - I'm not sure I understand you. Do you suggest to traverse a completely different data structure ? The purpose of the suggested visualisation is to point out where each of the tree's memory regions are "pinched" by other regions, so their "native" contents is NOT visible any more throughout the full region length, but (fractionally) rather another regions's content. Thus, I personally require to traverse exactly that tree structure. No offence, but I would rather not want to modify the patch towards what I feel would be a completely different purpose. I would appreciate if someone would review the patch in its current functional form to get it queued for qemu 2.8. My intention is to be able to rely on communication partners being able to reproduce findings using the new command once I start to attack the VFIO mmap flaw I talk about in the commit comment. # PATCH (as published in mailing list) *CONVERSION* from qemu-2.6/qemu-2.7 to qemu-master: sed s/'[.]mhandler[.]cmd = '/'.cmd= '/ qemu-master.patch # patch commit adapted that way for qemu-master and frequently rebased in branch: https://github.com/Thorsten-Kohfeldt/qemu/commits/upstream-pullrequest-mapinfo-V1-rebased # sample output info mtree 9 https://gist.github.com/Thorsten-Kohfeldt/254b5a21fef497054959f58af53a44c9 # sample output info mtree 65 https://gist.github.com/Thorsten-Kohfeldt/39cf5c8521c1999518b3438315e439f4 I think your current output explains, for each part (that is, each "sample" > of user-selected size) of each MemoryRegion, whether that sample is actually > visible in the finally rendered, flat view of the address space(s). > (And, if not, why not.) If not why not: Right. That _is_ my goal. Whereas I think Paolo's idea is to map the flat view to begin with, > and visually associate each interval of the guest visible physical address > space with the one region that is ultimately accessed in that interval. > This too would explain what the guest sees where, and why. It wouldn't give > much information about ranges that the guest can *not* access (due to > occlusion by other regions or otherwise), > but maybe the "why not" is not so important after all? To the contrary IMHO. See my whole line of statements above. The "why not" explanation is all what drove me to attempt to make this patch an official part of qemu - as a tool for 'bugfix' discussions. Otherwise I'd have to state "apply this patch then you see what I mean". For me that "why not" is an important part of information from point of view of a "memory region subtree/stack maintainer". Please note that while hunting down that VFIO/mmap problem around last Xmas I _was_ experimenting with flat view dumps. But only the final mtree mapinfo was the real eye opener for me. Regarding the FlatView thing -- QEMU already maintains a flat rendering of > the memory region tree, so that guest accesses to
Re: [Qemu-devel] [PATCH] hmp: Improve 'info mtree' with optional parm for mapinfo
Am 15.09.2016 um 11:52 schrieb Paolo Bonzini: On 07/09/2016 02:48, Thorsten Kohfeldt wrote: From: Thorsten Kohfeldt <mailto__qemu-de...@nongnu.org> Date: Wed, 31 Aug 2016 22:43:14 +0200 Subject: [PATCH] hmp: Improve 'info mtree' with optional parm for mapinfo Motivation When 'tuning' 'quirks' for VFIO imported devices, it is not easy to directly grasp the implications of the priorisation algorithms in place for the 'layered mapping' of memory regions. Even though there are rules (documented in docs/memory.txt), once in a while one might question the correctness of the actual implementation of the rules. Particularly, I believe I have uncovered a divergence of (sub-)region priorisation/order/visibility in a corner case of importing a device (which requires a 'quirk') with mmap enabled vs. mmap disabled. This modification provides a means of visualising the ACTUAL mapping/visibility/occlusion of subregions within regions, whereas the current info mtree command only lists the tree of regions (all, visible and invisible ones). It is primarily intended to provide support for easy presentation of my cause, but I strongly believe this modification also has general purpose advantages. It is a useful addition, but I think a simpler presentation is also fine. What about "info mtree -f" which would visit the FlatView instead of the tree? The patch would probably be much shorter. Thanks, Paolo Paolo, For quite some time I had the patch in use as a direct modification of 'info mtree', but I felt that a general purpose use must provide an ad hoc user selectable presentation width parameter for the map info. I personally use a width of 65 or even 129 characters PREFIXING the tree elements which the command currently responds. My guess is though that most users would want a width of only 9 or 17. So I believe that a numerical parameter is a must. Visit the flat view - I'm not sure I understand you. Do you suggest to traverse a completely different data structure ? The purpose of the suggested visualisation is to point out where each of the tree's memory regions are "pinched" by other regions, so their "native" contents is NOT visible any more throughout the full region length, but (fractionally) rather another regions's content. Thus, I personally require to traverse exactly that tree structure. No offence, but I would rather not want to modify the patch towards what I feel would be a completely different purpose. I would appreciate if someone would review the patch in its current functional form to get it queued for qemu 2.8. My intention is to be able to rely on communication partners being able to reproduce findings using the new command once I start to attack the VFIO mmap flaw I talk about in the commit comment. @ALL I have provided 2 patch branches in github, one for qemu-2.7.0 and one for qemu-current-master (this needed a tiny sed-conversion, see below). I also placed some example info mtree mapinfo output on gist.github: # patch commit for qemu-2.7 (same patch also works for qemu-2.6): https://github.com/Thorsten-Kohfeldt/qemu/commit/5633a3cdbf6fd7cccd098fb83f591fbb15e8d383 # in branch: https://github.com/Thorsten-Kohfeldt/qemu/commits/monitor_--hmp_info_mtree_mapinfo-width # PATCH (as published in mailing list) *CONVERSION* from qemu-2.6/qemu-2.7 to qemu-master: sed s/'[.]mhandler[.]cmd = '/'.cmd= '/ qemu-master.patch # patch commit adapted that way for qemu-master: https://github.com/Thorsten-Kohfeldt/qemu/commit/60b8c1be1a0119df1f1859b0d484e06e709c2ea2 # in branch: https://github.com/Thorsten-Kohfeldt/qemu/commits/upstream-pullrequest-mapinfo-V1 # sample output info mtree 9 https://gist.github.com/Thorsten-Kohfeldt/254b5a21fef497054959f58af53a44c9 # sample output info mtree 65 https://gist.github.com/Thorsten-Kohfeldt/39cf5c8521c1999518b3438315e439f4 Regards, Thorsten
[Qemu-devel] [PATCH] hmp: Improve 'info mtree' with optional parm for mapinfo
From: Thorsten Kohfeldt <mailto__qemu-de...@nongnu.org> Date: Wed, 31 Aug 2016 22:43:14 +0200 Subject: [PATCH] hmp: Improve 'info mtree' with optional parm for mapinfo Motivation When 'tuning' 'quirks' for VFIO imported devices, it is not easy to directly grasp the implications of the priorisation algorithms in place for the 'layered mapping' of memory regions. Even though there are rules (documented in docs/memory.txt), once in a while one might question the correctness of the actual implementation of the rules. Particularly, I believe I have uncovered a divergence of (sub-)region priorisation/order/visibility in a corner case of importing a device (which requires a 'quirk') with mmap enabled vs. mmap disabled. This modification provides a means of visualising the ACTUAL mapping/visibility/occlusion of subregions within regions, whereas the current info mtree command only lists the tree of regions (all, visible and invisible ones). It is primarily intended to provide support for easy presentation of my cause, but I strongly believe this modification also has general purpose advantages. Functional implementation A new OPTIONAL integer parameter, mapinfo-width, is added to the monitor/hmp 'info mtree' command. Effect: When given and between 5 and 257, then each mtree output line is prefixed with columns of 'visibility samples', depicting what qemu's memory region priorisation algorithms effectively make visible/accessible at that sample address at the time of inquiry. NOTE that the sampling algorithm virtually cuts the memory region into (width - 1) 'slices' and computes (width) samples at the edges of those virtual slices. Thus, it is probably a good idea to always request (2^n + 1) samples. ALSO NOTE that the memory regions are NOT actually accessed at those 'samples', ONLY a region PRIORISATION EVALUATION is performed for the sample addresses. You can setup a default using environment variable QEMU_INFO_MTREE_MAPINFO_WIDTH (must be in the Qemu instance's environment; when unset, then the default is 0, i.e. off). Giving negative values for sample-width results in using that default, while values above 257 are reduced to 257, and values from 0 to 4 switch the sampling off. Technical implementation 3 functions are added to memory.c: sane_mtree_info_sample_width() is used to sanitise the new parameter, i.e. to provide a default and adjust it towards 'usability'. It is called by existing function mtree_info(). mtree_print_mr_v_samples() is called for each mtree memory region (mr) output line in order to print the 'mapinfo' prefix. The call is performed by existing function mtree_print_mr() with parameter 'const MemoryRegion *mr', thus promising the object under investigation is not modified. mtree_mr_sample_reftype_marker() is used to traverse an mr subtree for a given hardware address in order to basically find the first matching enabled region and return its type. It is called by mtree_print_mr_v_samples() for 'sample' addresses. As an mr tree is traversed, limited recursion is involved. Additionally, for existing functions memory_region_to_address_space(), memory_region_size(), and memory_region_find_rcu() their parameter 'MemoryRegion *mr' had to be constrained to 'const MemoryRegion *mr' in order to satisfy the compiler while attempting to emphasize the 'object is not modified' promise by insisting to pass *mr as const into mtree_print_mr_v_samples(). This did not entail changes in the bodies of mentioned functions. Signed-off-by: Thorsten Kohfeldt <mailto__qemu-de...@nongnu.org> --- hmp-commands-info.hx | 9 +- include/exec/memory.h | 7 +- memory.c | 246 -- monitor.c | 4 +- 4 files changed, 250 insertions(+), 16 deletions(-) diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx index 74446c6..1593238 100644 --- a/hmp-commands-info.hx +++ b/hmp-commands-info.hx @@ -264,16 +264,17 @@ ETEXI { .name = "mtree", -.args_type = "", -.params = "", -.help = "show memory tree", +.args_type = "mapinfo-width:l?", +.params = "[mapinfo-width]", +.help = "show memory tree " + "(mapinfo-width: depict memory subregion mappings with leading characters)", .mhandler.cmd = hmp_info_mtree, }, STEXI @item info mtree @findex mtree -Show memory tree. +Show memory tree optionally depicting subregion mappings. ETEXI { diff --git a/include/exec/memory.h b/include/exec/memory.h index 3e4d416..751483c 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -536,7 +536,7 @@ struct Object *memory_region_owner(MemoryRegion *mr); * * @mr: the memory region being queried. */ -uint64_t memory_region_size(MemoryRegion *mr); +uint64_t memory_region_size(const MemoryRegion *mr); /** * memo
Re: [Qemu-devel] [Bug 1384892] Re: RTL8168 NIC VFIO not working anymore since QEMU 2.1
Current QEMU code is quite different now. When I tested last (with QEMU 2.4) the problem still existed. I had quite a discussion with Alex up to and around end of 2015, but unfortunately since then I just did not have any spare time to convince Alex to accept what I call 'the real fix', a series of patches. I also could not test QEMU2.5 yet, but it does not *look* like it contains any additional fix towards the problem. Just a hint about the 'underlying' problem: Several subtypes of that card hang up DURING loading the assigned firmware, as QEMU does not correctly map all required i/o areas for supporting the firmware load (unless i/o mmap is disabled). The cards need to be power cycled then, reset is not enough. The correct mapping cannot really be derived from looking at the Linux driver code - it is rather to be deduced from access traces. If a firmware is NOT accessible, the card doesn't hang, but also it is running 'unpatched' i.e. might expose bugs that the hardware designer/manufacturer has detected and fixed via firmware. A better workaround is disabling i/o mmap when VFIO-attaching the card. This is supported by newer QEMU versions. So no, the problem is not fixed yet, but yes, I guess you can close this bug if you feel like it. Regards Am 22.06.2016 um 13:10 schrieb T. Huth: > Alex' patch has been included here: > http://git.qemu.org/?p=qemu.git;a=commitdiff;h=69970fcef937bddd7f745 > ... so I assume this ticket could be closed now? -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1384892 Title: RTL8168 NIC VFIO not working anymore since QEMU 2.1 Status in QEMU: New Bug description: After upgrading QEMU from 2.0 to 2.1 (and libiscsi from 1.7.0 to 1.12 as a dependency) my two RTL8168 NICs stopped working. The NICs do not respond to any command and even the LEDs on the network connection turn off, a few seconds after the VM started. To get them back running I had to downgrade to 2.0 and restart the system. Unfortunately, I have no clue what to do or how to debug this problem since there are no specific errors logged. I tried two different VMs: Debian Wheezy and IPFire (see attachment for further details). The QEMU 2.1 changelog states "Support for RTL8168 NICs." so there were some major changes done, I guess. On the IPFire guest the kernel log shows many of these lines: r8169 :00:07.0 green1: rtl_eriar_cond == 1 (loop: 100, delay: 100) r8169 :00:07.0 green1: rtl_phy_reset_cond == 1 (loop: 100, delay: 1) On the Debian guest there is only: r8169 :00:07.0: firmware: agent loaded rtl_nic/rtl8168e-3.fw into memory r8169 :00:07.0: lan0: link down ADDRCONF(NETDEV_UP): lan0: link is not ready The commandline for IPFire can be seen in the attachment. It is the same for Debian. There are also the complete kernel logs for the working (2.0) and non-working (2.1) cases. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1384892/+subscriptions
[Qemu-devel] [Bug 1384892] Re: RTL8168 NIC VFIO not working anymore since QEMU 2.1
I made some time for limited testing. The released V2.1 puts the vfio-ed RTL8168 into a zombie state when running an IPFire VM, which requires a subsequent POWER cycle in order to let mii-tools show anything else than 'no link' (i.e. to get the LED back on). Pushing the reset button on the x64 metal was NOT enough. Then I binary-patched my V2.1 qemu x64 executable so it compares against a 'wrong' PCI ID inside vfio_probe_rtl8168_bar2_window_quirk() (to confirm comment #3). Yes, that made it work again. Further patch-attempts towards comment #10 all ended up in RTL zombie state (even though the IPFire VM was otherwise responsive). Each time a power cycle was required to get the RTL back alive. Could there be a general problem because the RTL must be usable in MSI-X mode for Windows and in in MSI mode for Linux ? Maybe a cmdline switch should be added to activate the quirk just when MSI-X is intended ? -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1384892 Title: RTL8168 NIC VFIO not working anymore since QEMU 2.1 Status in QEMU: New Bug description: After upgrading QEMU from 2.0 to 2.1 (and libiscsi from 1.7.0 to 1.12 as a dependency) my two RTL8168 NICs stopped working. The NICs do not respond to any command and even the LEDs on the network connection turn off, a few seconds after the VM started. To get them back running I had to downgrade to 2.0 and restart the system. Unfortunately, I have no clue what to do or how to debug this problem since there are no specific errors logged. I tried two different VMs: Debian Wheezy and IPFire (see attachment for further details). The QEMU 2.1 changelog states Support for RTL8168 NICs. so there were some major changes done, I guess. On the IPFire guest the kernel log shows many of these lines: r8169 :00:07.0 green1: rtl_eriar_cond == 1 (loop: 100, delay: 100) r8169 :00:07.0 green1: rtl_phy_reset_cond == 1 (loop: 100, delay: 1) On the Debian guest there is only: r8169 :00:07.0: firmware: agent loaded rtl_nic/rtl8168e-3.fw into memory r8169 :00:07.0: lan0: link down ADDRCONF(NETDEV_UP): lan0: link is not ready The commandline for IPFire can be seen in the attachment. It is the same for Debian. There are also the complete kernel logs for the working (2.0) and non-working (2.1) cases. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1384892/+subscriptions
[Qemu-devel] [Bug 1384892] Re: RTL8168 NIC VFIO not working anymore since QEMU 2.1
Yes, thank you, that adapted patch looks good to me. It seems V2.4 based though, so it would need to be backported down to 2.3 through 2.1. Is there an established process for that kind of backporting need ? Do you confirm my 'not reached' hypothesis (which would explain why your patch #4 did not make a difference) ? Unfortunately I will not have time for testing during the next 2 weeks. Anyway, are you willing to shed some light on the reason why you stick to ex-or bit 31 in the 'fake-read' block ? I would appreciate some comment about that in the code. -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1384892 Title: RTL8168 NIC VFIO not working anymore since QEMU 2.1 Status in QEMU: New Bug description: After upgrading QEMU from 2.0 to 2.1 (and libiscsi from 1.7.0 to 1.12 as a dependency) my two RTL8168 NICs stopped working. The NICs do not respond to any command and even the LEDs on the network connection turn off, a few seconds after the VM started. To get them back running I had to downgrade to 2.0 and restart the system. Unfortunately, I have no clue what to do or how to debug this problem since there are no specific errors logged. I tried two different VMs: Debian Wheezy and IPFire (see attachment for further details). The QEMU 2.1 changelog states Support for RTL8168 NICs. so there were some major changes done, I guess. On the IPFire guest the kernel log shows many of these lines: r8169 :00:07.0 green1: rtl_eriar_cond == 1 (loop: 100, delay: 100) r8169 :00:07.0 green1: rtl_phy_reset_cond == 1 (loop: 100, delay: 1) On the Debian guest there is only: r8169 :00:07.0: firmware: agent loaded rtl_nic/rtl8168e-3.fw into memory r8169 :00:07.0: lan0: link down ADDRCONF(NETDEV_UP): lan0: link is not ready The commandline for IPFire can be seen in the attachment. It is the same for Debian. There are also the complete kernel logs for the working (2.0) and non-working (2.1) cases. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1384892/+subscriptions
[Qemu-devel] [Bug 1384892] Re: RTL8168 NIC VFIO not working anymore since QEMU 2.1
After investigating a little more I have the impression, that a) Alex's patch #4 is required and b) 'val' does not need to be initialised Thus remains replacing each of the two instances of 0x1000U with 0x8000U, where it remains open whether the 'xor' operation (now on bit 31 rather than bit 27) in vfio_rtl8168_window_quirk_read() really creates the required result. I'd rather envision an 'or bit31' or an 'and ~bit31'. -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1384892 Title: RTL8168 NIC VFIO not working anymore since QEMU 2.1 Status in QEMU: New Bug description: After upgrading QEMU from 2.0 to 2.1 (and libiscsi from 1.7.0 to 1.12 as a dependency) my two RTL8168 NICs stopped working. The NICs do not respond to any command and even the LEDs on the network connection turn off, a few seconds after the VM started. To get them back running I had to downgrade to 2.0 and restart the system. Unfortunately, I have no clue what to do or how to debug this problem since there are no specific errors logged. I tried two different VMs: Debian Wheezy and IPFire (see attachment for further details). The QEMU 2.1 changelog states Support for RTL8168 NICs. so there were some major changes done, I guess. On the IPFire guest the kernel log shows many of these lines: r8169 :00:07.0 green1: rtl_eriar_cond == 1 (loop: 100, delay: 100) r8169 :00:07.0 green1: rtl_phy_reset_cond == 1 (loop: 100, delay: 1) On the Debian guest there is only: r8169 :00:07.0: firmware: agent loaded rtl_nic/rtl8168e-3.fw into memory r8169 :00:07.0: lan0: link down ADDRCONF(NETDEV_UP): lan0: link is not ready The commandline for IPFire can be seen in the attachment. It is the same for Debian. There are also the complete kernel logs for the working (2.0) and non-working (2.1) cases. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1384892/+subscriptions
[Qemu-devel] [Bug 1384892] Re: RTL8168 NIC VFIO not working anymore since QEMU 2.1
Alex, are you sure about the constant 0x1000U (bit 27) being used in the code below ? Shouldn't it rather be a 0x8000U (bit 31 which you commented about) ? I added a /* NOT REACHED ? */ below, as I feel that might be the problem. Florian, are you willing to test the so modified source with and without Alex's patch of comment #4 ? Note that the constant 0x1000U is also used for ex-or in the function vfio_rtl8168_window_quirk_read(), where it should (I think) also be 0x8000U (or maybe just 0, i.e. no modification of the saved value ?) AND: Shouldn't variable uint64_t val; in function vfio_rtl8168_window_quirk_read() be initialised 0, as it is size 8, which is larger than size (which I gather is 4) ? Anyway, just to keep everybody updated about newer versions of QEMU: The relevant code blocks seem to have moved to hw/vfio/pci.c in QEMU 2.3. static void vfio_rtl8168_window_quirk_write(void *opaque, hwaddr addr, uint64_t data, unsigned size) { VFIOQuirk *quirk = opaque; VFIODevice *vdev = quirk-vdev; switch (addr) { case 4: /* address */ if ((data 0x7fff) == 0x1) { if (data 0x1000U vdev-pdev.cap_present QEMU_PCI_CAP_MSIX) { /* NOT REACHED ? */ DPRINTF(%s MSI-X table write(%04x:%02x:%02x.%d)\n, memory_region_name(quirk-mem), vdev-host.domain, vdev-host.bus, vdev-host.slot, vdev-host.function); io_mem_write(vdev-pdev.msix_table_mmio, (hwaddr)(quirk-data.address_match 0xfff), data, size); } quirk-data.flags = 1; quirk-data.address_match = data; return; } -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1384892 Title: RTL8168 NIC VFIO not working anymore since QEMU 2.1 Status in QEMU: New Bug description: After upgrading QEMU from 2.0 to 2.1 (and libiscsi from 1.7.0 to 1.12 as a dependency) my two RTL8168 NICs stopped working. The NICs do not respond to any command and even the LEDs on the network connection turn off, a few seconds after the VM started. To get them back running I had to downgrade to 2.0 and restart the system. Unfortunately, I have no clue what to do or how to debug this problem since there are no specific errors logged. I tried two different VMs: Debian Wheezy and IPFire (see attachment for further details). The QEMU 2.1 changelog states Support for RTL8168 NICs. so there were some major changes done, I guess. On the IPFire guest the kernel log shows many of these lines: r8169 :00:07.0 green1: rtl_eriar_cond == 1 (loop: 100, delay: 100) r8169 :00:07.0 green1: rtl_phy_reset_cond == 1 (loop: 100, delay: 1) On the Debian guest there is only: r8169 :00:07.0: firmware: agent loaded rtl_nic/rtl8168e-3.fw into memory r8169 :00:07.0: lan0: link down ADDRCONF(NETDEV_UP): lan0: link is not ready The commandline for IPFire can be seen in the attachment. It is the same for Debian. There are also the complete kernel logs for the working (2.0) and non-working (2.1) cases. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1384892/+subscriptions