Re: [Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest
On 01/19/2017 05:43 AM, Alex Williamson wrote: > On Sat, 31 Dec 2016 17:13:04 +0800 > How can you know if he other function was affected if you don't even > have a cable connected? Will try ask for another cable for the other function. > How is testing on something that doesn't seem > to work correctly already valid? Thanks, Not sure I understand you right. Even if I find those abnormal phenomenon in test, I finally got the very exact result I expected in each scenario, does it enough to prove this new functionality works? and especially these abnormal phenomenon exists even without this patchset -- Sincerely, Cao jin
Re: [Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest
On 01/19/2017 05:43 AM, Alex Williamson wrote: > On Sat, 31 Dec 2016 17:13:04 +0800 > Cao jinwrote: > >> As previous discussion suggest, we could take a step back to handle non-fatal >> error first, this will make this patchset much more thinner, because we could >> drop all the configuration restriction related patches. >> >> FYI: patch 1 has been cherry picked into another series, and wait to be >> merged >> first, so this patchset can't compile in your host. >> >> v11 changelog: >> 1. drop a bunch of code which check the configuration. >> 2. modify patch 3 to handle non-fatal error only, fatal error still >>results in vm stop. >>Doesn't modify as suggestion "add another eventfd do distinguish fatal & >>non-fatal error", because 1st, user has the ability to distinguish them >>just from the uncorrectable error status; 2nd, for back compatible, e.g. >>an old user handle both error, rely on the current error eventfd. >> >> Test: >> Test it with intel 82576 NIC, which has 2 functions, function 1 has cable >> connected to gateway, function 0 has no link. Test in 4 scenario. >> 1. just assign function 1 to one vm, function 0 has no user >> 2. assign 2 function to one vm, totally comply previous configuraton >> restrction >> 3. assign 2 function to one vm, under different virtual bus >> 4, assign functions to 2 different vm >> >> The test steps are the same as v10: assign ip to function 1, add route info, >> and ping the gateway. The results meet expectation. But the unsteady hardware >> often emit fatal error, still don't know why. And igb driver in guest seems >> has bug: ping gateway for a while, even if I don't do anything, it will show >> "Destination Host Unreachable" after many successful ping. But obviously, >> neither of these has relationship with this patchset. > > So something doesn't work right regardless and this doesn't describe > what testing was done of the functionality added here. Were AER events > injected? Did fatal ones cause a vmstop, did non-fatal ones continue? > How can you know if he other function was affected if you don't even > have a cable connected? How is testing on something that doesn't seem > to work correctly already valid? Thanks, > > Alex > Well, of course I test the functionality added here via aer-inject tool & driver, the same test as I did in v10. The unsteady hardware slows my test, make me did the test again & again & again, until see what I expected, maybe I should share some details. General test steps(linux only) * assign ip to function 1 in guest, add route info, and ping the gateway * on host, injecting various non fatal error to both devices. Robustness test is: fast repeat injecting manually. Expectation: in all scenario, function 1 who is pinging still works after non fatal error recovery; function 0(in host, same guest, another guest) doesn't have any abnormal phenomenon(crash guest, or any abnormal log in dmesg); fatal error cause a vmstop(surely) I did the test in each scenario, and I got what I expected: fatal error definitely cause a vmstop; even bothered by unsteady hardware, non fatal error recovery is ok IIRC, in test of v10, I even find that the NIC would emit the fatal error after I reboot the host, before I start vm. This is what I called unsteady hardware(I will try to figure out on which scenario the hardware is easy to be unsteady). So I think, something doesn't work right, but in my opinion, none of this patchset's business. -- Sincerely, Cao jin >> Chen Fan (3): >> vfio: new function to init aer cap for vfio device >> vfio-pci: pass the aer error to guest >> vfio: add 'aer' property to expose aercap >> >> Dou Liyang (1): >> pcie_aer: support configurable AER capa version >> >> hw/net/e1000e.c| 3 +- >> hw/pci-bridge/ioh3420.c| 2 +- >> hw/pci-bridge/xio3130_downstream.c | 2 +- >> hw/pci-bridge/xio3130_upstream.c | 2 +- >> hw/pci/pcie_aer.c | 5 +- >> hw/vfio/pci.c | 139 >> + >> hw/vfio/pci.h | 3 + >> include/hw/pci/pcie_aer.h | 3 +- >> 8 files changed, 139 insertions(+), 20 deletions(-) >> > > > > . >
Re: [Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest
On Sat, 31 Dec 2016 17:13:04 +0800 Cao jinwrote: > As previous discussion suggest, we could take a step back to handle non-fatal > error first, this will make this patchset much more thinner, because we could > drop all the configuration restriction related patches. > > FYI: patch 1 has been cherry picked into another series, and wait to be merged > first, so this patchset can't compile in your host. > > v11 changelog: > 1. drop a bunch of code which check the configuration. > 2. modify patch 3 to handle non-fatal error only, fatal error still >results in vm stop. >Doesn't modify as suggestion "add another eventfd do distinguish fatal & >non-fatal error", because 1st, user has the ability to distinguish them >just from the uncorrectable error status; 2nd, for back compatible, e.g. >an old user handle both error, rely on the current error eventfd. > > Test: > Test it with intel 82576 NIC, which has 2 functions, function 1 has cable > connected to gateway, function 0 has no link. Test in 4 scenario. > 1. just assign function 1 to one vm, function 0 has no user > 2. assign 2 function to one vm, totally comply previous configuraton > restrction > 3. assign 2 function to one vm, under different virtual bus > 4, assign functions to 2 different vm > > The test steps are the same as v10: assign ip to function 1, add route info, > and ping the gateway. The results meet expectation. But the unsteady hardware > often emit fatal error, still don't know why. And igb driver in guest seems > has bug: ping gateway for a while, even if I don't do anything, it will show > "Destination Host Unreachable" after many successful ping. But obviously, > neither of these has relationship with this patchset. So something doesn't work right regardless and this doesn't describe what testing was done of the functionality added here. Were AER events injected? Did fatal ones cause a vmstop, did non-fatal ones continue? How can you know if he other function was affected if you don't even have a cable connected? How is testing on something that doesn't seem to work correctly already valid? Thanks, Alex > Chen Fan (3): > vfio: new function to init aer cap for vfio device > vfio-pci: pass the aer error to guest > vfio: add 'aer' property to expose aercap > > Dou Liyang (1): > pcie_aer: support configurable AER capa version > > hw/net/e1000e.c| 3 +- > hw/pci-bridge/ioh3420.c| 2 +- > hw/pci-bridge/xio3130_downstream.c | 2 +- > hw/pci-bridge/xio3130_upstream.c | 2 +- > hw/pci/pcie_aer.c | 5 +- > hw/vfio/pci.c | 139 > + > hw/vfio/pci.h | 3 + > include/hw/pci/pcie_aer.h | 3 +- > 8 files changed, 139 insertions(+), 20 deletions(-) >
Re: [Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest
Hi, Your series seems to have some coding style problems. See output below for more information: Message-id: 1483175588-17006-1-git-send-email-caoj.f...@cn.fujitsu.com Type: series Subject: [Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest === TEST SCRIPT BEGIN === #!/bin/bash BASE=base n=1 total=$(git log --oneline $BASE.. | wc -l) failed=0 # Useful git options git config --local diff.renamelimit 0 git config --local diff.renames True commits="$(git log --format=%H --reverse $BASE..)" for c in $commits; do echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..." if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then failed=1 echo fi n=$((n+1)) done exit $failed === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 From https://github.com/patchew-project/qemu * [new tag] patchew/1483175588-17006-1-git-send-email-caoj.f...@cn.fujitsu.com -> patchew/1483175588-17006-1-git-send-email-caoj.f...@cn.fujitsu.com Switched to a new branch 'test' 7f7bb75 vfio: add 'aer' property to expose aercap a62d474 vfio-pci: pass the aer error to guest 6678558 vfio: new function to init aer cap for vfio device 6e3e08c pcie_aer: support configurable AER capa version === OUTPUT BEGIN === Checking PATCH 1/4: pcie_aer: support configurable AER capa version... Checking PATCH 2/4: vfio: new function to init aer cap for vfio device... Checking PATCH 3/4: vfio-pci: pass the aer error to guest... ERROR: do not use C99 // comments #44: FILE: hw/vfio/pci.c:2486: +return; //Or goto stop? ERROR: code indent should never use tabs #57: FILE: hw/vfio/pci.c:2499: +^Iif (isfatal) {$ ERROR: code indent should never use tabs #58: FILE: hw/vfio/pci.c:2500: +^Igoto stop;$ ERROR: code indent should never use tabs #59: FILE: hw/vfio/pci.c:2501: +^I}$ total: 4 errors, 0 warnings, 63 lines checked Your patch has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. Checking PATCH 4/4: vfio: add 'aer' property to expose aercap... === OUTPUT END === Test command exited with code: 1 --- Email generated automatically by Patchew [http://patchew.org/]. Please send your feedback to patchew-de...@freelists.org
[Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest
As previous discussion suggest, we could take a step back to handle non-fatal error first, this will make this patchset much more thinner, because we could drop all the configuration restriction related patches. FYI: patch 1 has been cherry picked into another series, and wait to be merged first, so this patchset can't compile in your host. v11 changelog: 1. drop a bunch of code which check the configuration. 2. modify patch 3 to handle non-fatal error only, fatal error still results in vm stop. Doesn't modify as suggestion "add another eventfd do distinguish fatal & non-fatal error", because 1st, user has the ability to distinguish them just from the uncorrectable error status; 2nd, for back compatible, e.g. an old user handle both error, rely on the current error eventfd. Test: Test it with intel 82576 NIC, which has 2 functions, function 1 has cable connected to gateway, function 0 has no link. Test in 4 scenario. 1. just assign function 1 to one vm, function 0 has no user 2. assign 2 function to one vm, totally comply previous configuraton restrction 3. assign 2 function to one vm, under different virtual bus 4, assign functions to 2 different vm The test steps are the same as v10: assign ip to function 1, add route info, and ping the gateway. The results meet expectation. But the unsteady hardware often emit fatal error, still don't know why. And igb driver in guest seems has bug: ping gateway for a while, even if I don't do anything, it will show "Destination Host Unreachable" after many successful ping. But obviously, neither of these has relationship with this patchset. Chen Fan (3): vfio: new function to init aer cap for vfio device vfio-pci: pass the aer error to guest vfio: add 'aer' property to expose aercap Dou Liyang (1): pcie_aer: support configurable AER capa version hw/net/e1000e.c| 3 +- hw/pci-bridge/ioh3420.c| 2 +- hw/pci-bridge/xio3130_downstream.c | 2 +- hw/pci-bridge/xio3130_upstream.c | 2 +- hw/pci/pcie_aer.c | 5 +- hw/vfio/pci.c | 139 + hw/vfio/pci.h | 3 + include/hw/pci/pcie_aer.h | 3 +- 8 files changed, 139 insertions(+), 20 deletions(-) -- 1.8.3.1