Re: [PATCH 0/2] hw/block/nvme: handle transient dma errors

2020-07-03 Thread Philippe Mathieu-Daudé
On 7/3/20 9:50 AM, Kevin Wolf wrote:
> Am 01.07.2020 um 14:58 hat Philippe Mathieu-Daudé geschrieben:
>> On 6/29/20 11:34 PM, Klaus Jensen wrote:
>>> On Jun 29 14:07, no-re...@patchew.org wrote:
>>>> Patchew URL:
>>>> https://patchew.org/QEMU/20200629202053.1223342-1-...@irrelevant.dk/
>>
>>>> --- /tmp/qemu-test/src/tests/qemu-iotests/040.out   2020-06-29
>>>> 20:12:10.0 +
>>>> +++ /tmp/qemu-test/build/tests/qemu-iotests/040.out.bad 2020-06-29
>>>> 20:58:48.288790818 +
>>>> @@ -1,3 +1,5 @@
>>>> +WARNING:qemu.machine:qemu received signal 9:
>>>> /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64
>>>>  -display none -vga none -chardev
>>>> socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon
>>>> chardev=mon,mode=control -qtest
>>>> unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest
>>>> -nodefaults -display none -accel qtest
>>>> +WARNING:qemu.machine:qemu received signal 9:
>>>> /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64
>>>>  -display none -vga none -chardev
>>>> socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon
>>>> chardev=mon,mode=control -qtest
>>>> unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest
>>>> -nodefaults -display none -accel qtest
>>
>> Kevin, Max, can iotests/040 be affected by this change?
> 
> The diffstat of this series looks like it doesn't touch anything outside
> of the nvme emulation, which isn't used by this test, so at least I'd say
> it's not the fault of the patch series.
> 
> I think test cases use SIGKILL primarily in timeout handlers, so maybe
> the test host was overloaded and didn't shut down QEMU in time, so it was
> killed. No test case actually failed:
> 
>  ...
>  --
>  Ran 59 tests
> 
> You would have 'F' or 'E' for fail/error instead of '.' otherwise.

TIL how to read that line :)

Thanks for your analysis Kevin!

> 
> Kevin
> 
>>>
>>>
>>> Hmm, I can't seem to reproduce this locally and the test succeeded on
>>> the next series[1] that is based on this.
>>>
>>> Is this a flaky test? Or a bad test runner? I'm of course worried when
>>> a qcow2 test fails and I touch something other than the nvme device ;)
>>>
>>>
>>>   [1]: https://patchew.org/QEMU/20200629203155.1236860-1-...@irrelevant.dk/
>>>
>>
> 




Re: [PATCH 0/2] hw/block/nvme: handle transient dma errors

2020-07-03 Thread Kevin Wolf
Am 01.07.2020 um 14:58 hat Philippe Mathieu-Daudé geschrieben:
> On 6/29/20 11:34 PM, Klaus Jensen wrote:
> > On Jun 29 14:07, no-re...@patchew.org wrote:
> >> Patchew URL: 
> >> https://patchew.org/QEMU/20200629202053.1223342-1-...@irrelevant.dk/
> 
> >> --- /tmp/qemu-test/src/tests/qemu-iotests/040.out   2020-06-29 
> >> 20:12:10.0 +
> >> +++ /tmp/qemu-test/build/tests/qemu-iotests/040.out.bad 2020-06-29 
> >> 20:58:48.288790818 +
> >> @@ -1,3 +1,5 @@
> >> +WARNING:qemu.machine:qemu received signal 9: 
> >> /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64
> >>  -display none -vga none -chardev 
> >> socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon 
> >> chardev=mon,mode=control -qtest 
> >> unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest 
> >> -nodefaults -display none -accel qtest
> >> +WARNING:qemu.machine:qemu received signal 9: 
> >> /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64
> >>  -display none -vga none -chardev 
> >> socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon 
> >> chardev=mon,mode=control -qtest 
> >> unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest 
> >> -nodefaults -display none -accel qtest
> 
> Kevin, Max, can iotests/040 be affected by this change?

The diffstat of this series looks like it doesn't touch anything outside
of the nvme emulation, which isn't used by this test, so at least I'd say
it's not the fault of the patch series.

I think test cases use SIGKILL primarily in timeout handlers, so maybe
the test host was overloaded and didn't shut down QEMU in time, so it was
killed. No test case actually failed:

 ...
 --
 Ran 59 tests

You would have 'F' or 'E' for fail/error instead of '.' otherwise.

Kevin

> > 
> > 
> > Hmm, I can't seem to reproduce this locally and the test succeeded on
> > the next series[1] that is based on this.
> > 
> > Is this a flaky test? Or a bad test runner? I'm of course worried when
> > a qcow2 test fails and I touch something other than the nvme device ;)
> > 
> > 
> >   [1]: https://patchew.org/QEMU/20200629203155.1236860-1-...@irrelevant.dk/
> > 
> 




Re: [PATCH 0/2] hw/block/nvme: handle transient dma errors

2020-07-01 Thread Philippe Mathieu-Daudé
On 6/29/20 11:34 PM, Klaus Jensen wrote:
> On Jun 29 14:07, no-re...@patchew.org wrote:
>> Patchew URL: 
>> https://patchew.org/QEMU/20200629202053.1223342-1-...@irrelevant.dk/

>> --- /tmp/qemu-test/src/tests/qemu-iotests/040.out   2020-06-29 
>> 20:12:10.0 +
>> +++ /tmp/qemu-test/build/tests/qemu-iotests/040.out.bad 2020-06-29 
>> 20:58:48.288790818 +
>> @@ -1,3 +1,5 @@
>> +WARNING:qemu.machine:qemu received signal 9: 
>> /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64
>>  -display none -vga none -chardev 
>> socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon 
>> chardev=mon,mode=control -qtest 
>> unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults 
>> -display none -accel qtest
>> +WARNING:qemu.machine:qemu received signal 9: 
>> /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64
>>  -display none -vga none -chardev 
>> socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon 
>> chardev=mon,mode=control -qtest 
>> unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults 
>> -display none -accel qtest

Kevin, Max, can iotests/040 be affected by this change?

> 
> 
> Hmm, I can't seem to reproduce this locally and the test succeeded on
> the next series[1] that is based on this.
> 
> Is this a flaky test? Or a bad test runner? I'm of course worried when
> a qcow2 test fails and I touch something other than the nvme device ;)
> 
> 
>   [1]: https://patchew.org/QEMU/20200629203155.1236860-1-...@irrelevant.dk/
> 




Re: [PATCH 0/2] hw/block/nvme: handle transient dma errors

2020-06-29 Thread Klaus Jensen
On Jun 29 14:07, no-re...@patchew.org wrote:
> Patchew URL: 
> https://patchew.org/QEMU/20200629202053.1223342-1-...@irrelevant.dk/
> 
> 
> 
> Hi,
> 
> This series failed the docker-quick@centos7 build test. Please find the
> testing commands and their output below. If you have Docker installed, you
> can probably reproduce it locally.
> 
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> make docker-image-centos7 V=1 NETWORK=1
> time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
> === TEST SCRIPT END ===
> 
> --- /tmp/qemu-test/src/tests/qemu-iotests/040.out   2020-06-29 
> 20:12:10.0 +
> +++ /tmp/qemu-test/build/tests/qemu-iotests/040.out.bad 2020-06-29 
> 20:58:48.288790818 +
> @@ -1,3 +1,5 @@
> +WARNING:qemu.machine:qemu received signal 9: 
> /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64
>  -display none -vga none -chardev 
> socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon 
> chardev=mon,mode=control -qtest 
> unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults 
> -display none -accel qtest
> +WARNING:qemu.machine:qemu received signal 9: 
> /tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64
>  -display none -vga none -chardev 
> socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon 
> chardev=mon,mode=control -qtest 
> unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults 
> -display none -accel qtest


Hmm, I can't seem to reproduce this locally and the test succeeded on
the next series[1] that is based on this.

Is this a flaky test? Or a bad test runner? I'm of course worried when
a qcow2 test fails and I touch something other than the nvme device ;)


  [1]: https://patchew.org/QEMU/20200629203155.1236860-1-...@irrelevant.dk/



Re: [PATCH 0/2] hw/block/nvme: handle transient dma errors

2020-06-29 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20200629202053.1223342-1-...@irrelevant.dk/



Hi,

This series failed the docker-quick@centos7 build test. Please find the
testing commands and their output below. If you have Docker installed, you
can probably reproduce it locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

--- /tmp/qemu-test/src/tests/qemu-iotests/040.out   2020-06-29 
20:12:10.0 +
+++ /tmp/qemu-test/build/tests/qemu-iotests/040.out.bad 2020-06-29 
20:58:48.288790818 +
@@ -1,3 +1,5 @@
+WARNING:qemu.machine:qemu received signal 9: 
/tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 
-display none -vga none -chardev 
socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon 
chardev=mon,mode=control -qtest 
unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults 
-display none -accel qtest
+WARNING:qemu.machine:qemu received signal 9: 
/tmp/qemu-test/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 
-display none -vga none -chardev 
socket,id=mon,path=/tmp/tmp.Jdol0fPScQ/qemu-21749-monitor.sock -mon 
chardev=mon,mode=control -qtest 
unix:path=/tmp/tmp.Jdol0fPScQ/qemu-21749-qtest.sock -accel qtest -nodefaults 
-display none -accel qtest
 ...
 --
 Ran 59 tests
---
Not run: 259
Failures: 040
Failed 1 of 119 iotests
make: *** [check-tests/check-block.sh] Error 1
make: *** Waiting for unfinished jobs
  TEST    check-qtest-aarch64: tests/qtest/qos-test
Traceback (most recent call last):
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=da25eaa8bdd04cb783e2c427c6a5aa94', '-u', 
'1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-98l7koy2/src/docker-src.2020-06-29-16.51.46.20742:/var/tmp/qemu:z,ro',
 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=da25eaa8bdd04cb783e2c427c6a5aa94
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-98l7koy2/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    15m57.590s
user    0m9.240s


The full log is available at
http://patchew.org/logs/20200629202053.1223342-1-...@irrelevant.dk/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

[PATCH 0/2] hw/block/nvme: handle transient dma errors

2020-06-29 Thread Klaus Jensen
From: Klaus Jensen 

QEMU honors changes to the Bus Master Enable bit of a PCI device, so in
order to successfully pass the block/011 test ("disable PCI device while
doing I/O") the nvme device needs to know whether a DMA transfer
succeeded or not.

Based-on: <20200629195017.1217056-1-...@irrelevant.dk>
("[PATCH 00/17] hw/block/nvme: AIO and address mapping refactoring")

Klaus Jensen (2):
  pci: pass along the return value of dma_memory_rw
  hw/block/nvme: handle dma errors

 hw/block/nvme.c   | 43 ---
 hw/block/trace-events |  2 ++
 include/block/nvme.h  |  2 +-
 include/hw/pci/pci.h  |  3 +--
 4 files changed, 36 insertions(+), 14 deletions(-)

-- 
2.27.0