[ovirt-devel] Re: [VDSM] all test passed, build failed with "tar: journal/b087148aba6d49b9bbef488e52a48752/system.journal: file changed as we read it"
On Sat, Dec 1, 2018 at 11:10 PM Nir Soffer wrote: > On Thu, Nov 29, 2018 at 11:21 AM Edward Haas wrote: > >> >> >> On Thu, Nov 29, 2018 at 10:41 AM Edward Haas wrote: >> >>> >>> >>> On Wed, Nov 28, 2018 at 8:12 PM Nir Soffer wrote: >>> We have this failure that pops randomly: 1. All tests pass *00:13:13.284* ___ summary *00:13:13.285* tests: commands succeeded*00:13:13.286* storage-py27: commands succeeded*00:13:13.286* storage-py36: commands succeeded*00:13:13.286* lib-py27: commands succeeded*00:13:13.287* lib-py36: commands succeeded*00:13:13.288* network-py27: commands succeeded*00:13:13.290* network-py36: commands succeeded*00:13:13.291* virt-py27: commands succeeded*00:13:13.292* virt-py36: commands succeeded*00:13:13.293* congratulations :) 2. But we fail to collect logs at the end *00:14:35.992* ##*00:14:35.995* ## Wed Nov 28 17:39:50 UTC 2018 Finished env: fc28:fedora-28-x86_64*00:14:35.996* ## took 764 seconds*00:14:35.997* ## rc = 1*00:14:35.997* ##*00:14:36.009* ##! ERROR v*00:14:36.010* ##! Last 20 log entries: /tmp/mock_logs.Lcop4ZOq/script/stdout_stderr.log*00:14:36.011* ##!*00:14:36.012* journal/b087148aba6d49b9bbef488e52a48752/system.journal*00:14:36.013* tar: journal/b087148aba6d49b9bbef488e52a48752/system.journal: file changed as we read it*00:14:36.014* journal/b087148aba6d49b9bbef488e52a48752/user-1000.journal*00:14:36.015* lastlog*00:14:36.015* libvirt/*00:14:36.015* libvirt/lxc/*00:14:36.015* libvirt/libxl/*00:14:36.016* libvirt/qemu/*00:14:36.016* libvirt/qemu/LiveOS-f920001d-be4e-47ea-ac26-72480fd5be87.log*00:14:36.017* libvirt/uml/*00:14:36.017* ovirt-guest-agent/*00:14:36.017* ovirt-guest-agent/ovirt-guest-agent.log*00:14:36.017* README*00:14:36.018* samba/*00:14:36.018* samba/old/*00:14:36.018* sssd/*00:14:36.018* tallylog*00:14:36.018* wtmp*00:14:36.018* Took 678 seconds*00:14:36.018* ===*00:14:36.019* ##!*00:14:36.019* ##! ERROR ^^*00:14:36.019* ##! This looks like an issue with vdsm check-patch.sh: function collect_logs { res=$? [ "$res" -ne 0 ] && echo "*** err: $res" cd /var/log tar -cvzf "$EXPORT_DIR/mock_varlogs.tar.gz" * cd /var/host_log tar -cvzf "$EXPORT_DIR/host_varlogs.tar.gz" * } trap collect_logs EXIT Seems that tar fail to collect log if the log is modified while copied, which makes sense. We can ignore errors in tar, since log collection should not fail the build, but I think a better solution is to avoid collecting any logs since vdsm writes its own logs during tests - all the info must be in vdsm log. Here is the list of collected logs: *00:13:47.280* + tar -cvzf /home/jenkins/workspace/vdsm_master_check-patch-fc28-x86_64/vdsm/exported-artifacts/mock_varlogs.tar.gz btmp dnf.librepo.log dnf.log dnf.rpm.log faillog glusterfs hawkey.log journal lastlog libvirt openvswitch README tallylog vdsm_tests.log wtmp yum.log*00:13:47.285* btmp*00:13:47.285* dnf.librepo.log*00:13:47.299* dnf.log*00:13:47.309* dnf.rpm.log*00:13:47.310* faillog*00:13:47.311* glusterfs/*00:13:47.312* hawkey.log*00:13:47.313* journal/*00:13:47.313* lastlog*00:13:47.315* libvirt/*00:13:47.315* libvirt/qemu/*00:13:47.316* openvswitch/*00:13:47.317* openvswitch/ovs-vswitchd.log*00:13:47.318* openvswitch/ovsdb-server.log*00:13:47.319* README*00:13:47.320* tallylog*00:13:47.321* vdsm_tests.log*00:13:47.342* wtmp*00:13:47.343* yum.log*00:13:47.349* + cd /var/host_log*00:13:47.350* + tar -cvzf /home/jenkins/workspace/vdsm_master_check-patch-fc28-x86_64/vdsm/exported-artifacts/host_varlogs.tar.gz anaconda audit boot.log btmp chrony cloud-init.log cloud-init-output.log cron dnf.librepo.log dnf.log dnf.rpm.log firewalld glusterfs hawkey.log journal lastlog libvirt ovirt-guest-agent README samba sssd tallylog wtmp*00:13:47.356* anaconda/*00:13:47.356* anaconda/ifcfg.log*00:13:47.357* anaconda/ks-script-l5qnynnj.log*00:13:47.358* anaconda/storage.log*00:13:47.359* anaconda/program.log*00:13:47.395* anaconda/ks-script-b5_08tmo.log*00:13:47.396* anaconda/ks-script-6uks8bp3.log*00:13:47.397* anaconda/hawkey.log*00:13:47.398* anaconda/syslog*00:13:47.406* anaconda/journal.log*00:13:47.449*
[ovirt-devel] Re: [VDSM] test_echo(1024, False) (stomp_test.StompTests) fails
On Wed, Nov 7, 2018 at 2:20 PM Piotr Kliczewski wrote: > > > On Wed, Nov 7, 2018 at 1:16 PM Nir Soffer wrote: > >> On Wed, Nov 7, 2018 at 10:13 AM Piotr Kliczewski >> wrote: >> >>> >>> >>> On Wed, Nov 7, 2018 at 5:21 AM Germano Veit Michel >>> wrote: >>> On Sat, Jun 30, 2018 at 12:43 AM Piotr Kliczewski < piotr.kliczew...@gmail.com> wrote: > Here is the patch [1] which should fix it. > > [1] https://gerrit.ovirt.org/#/c/92690/ > On Thu, Jun 28, 2018 at 2:44 PM Piotr Kliczewski > wrote: > > > > Nir, > > > > It looks like we have a race condition where request arrive sooner > than subscribe frame: > > > > 16:19:02 2018-06-27 14:17:54,653 INFO (jsonrpc/0) > [jsonrpc.JsonRpcServer] RPC call echo succeeded in 0.01 seconds > (__init__:311) > > 16:19:02 2018-06-27 14:17:54,661 INFO (Detector thread) > [Broker.StompAdapter] Subscribe command received (stompserver:123) > > > > Thanks for reporting I will push a patch to fix it. > > > > Thanks, > > Piotr > > > > > > On Wed, Jun 27, 2018 at 5:40 PM, Piotr Kliczewski < > pklic...@redhat.com> wrote: > >> > >> Ok, I'll take a look. > >> > >> śr., 27 cze 2018, 17:36 użytkownik Nir Soffer > napisał: > >>> > >>> On Wed, Jun 27, 2018 at 6:13 PM Piotr Kliczewski < > pklic...@redhat.com> wrote: > > On Wed, Jun 27, 2018 at 5:01 PM, Nir Soffer > wrote: > > > > This tests used to fail in the past, but since we fixed it or > the code > > > > it never failed. > > > > > > Maybe the slave was overloaded? > > > Very possible. Can you paste a link to the job which failed? > >>> > >>> > >>> Here: > https://jenkins.ovirt.org/job/vdsm_master_check-patch-el7-x86_64/24130/ > >>> > >>> The next build passed. > >>> > >>> Maybe we need to solve the flakiness of some tests by running a > flaky test > >>> again, and let the build fail only if a test failed twice. I > wonder if there is some > >>> pytest plugin doing this. > >>> > > > > > > > > 14:19:02 > == > > 14:19:02 ERROR: > > > > 14:19:02 > -- > > 14:19:02 Traceback (most recent call last): > > 14:19:02 File > "/home/jenkins/workspace/vdsm_master_check-patch-el7-x86_64/vdsm/tests/testlib.py", > line 143, in wrapper > > 14:19:02 return f(self, *args) > > 14:19:02 File > "/home/jenkins/workspace/vdsm_master_check-patch-el7-x86_64/vdsm/tests/stomp_test.py", > line 95, in test_echo > > 14:19:02 str(uuid4())), > > 14:19:02 File > "/home/jenkins/workspace/vdsm_master_check-patch-el7-x86_64/vdsm/lib/yajsonrpc/jsonrpcclient.py", > line 77, in callMethod > > 14:19:02 raise > exception.JsonRpcNoResponseError(method=methodName) > > 14:19:02 JsonRpcNoResponseError: No response for JSON-RPC > request: {'method': 'echo'} > > 14:19:02 >> begin captured logging << > > > 14:19:02 2018-06-27 14:17:54,524 DEBUG (MainThread) > [vds.MultiProtocolAcceptor] Creating socket (host='::1', port=0, > family=10, > socketype=1, proto=6) (protocoldetector:225) > > 14:19:02 2018-06-27 14:17:54,526 INFO (MainThread) > [vds.MultiProtocolAcceptor] Listening at ::1:36713 (protocoldetector:183) > > 14:19:02 2018-06-27 14:17:54,535 DEBUG (MainThread) [Scheduler] > Starting scheduler test.Scheduler (schedule:98) > > 14:19:02 2018-06-27 14:17:54,537 DEBUG (test.Scheduler) > [Scheduler] START thread 140562082535168)> (func= >, args=(), kwargs={}) > (concurrent:193) > > 14:19:02 2018-06-27 14:17:54,538 DEBUG (test.Scheduler) > [Scheduler] started (schedule:140) > > 14:19:02 2018-06-27 14:17:54,546 DEBUG (JsonRpc (StompReactor)) > [root] START thread 140562629256960)> (func= >, args=(), > kwargs={}) (concurrent:193) > > 14:19:02 2018-06-27 14:17:54,547 DEBUG (MainThread) [Executor] > Starting executor (executor:128) > > 14:19:02 2018-06-27 14:17:54,549 DEBUG (MainThread) [Executor] > Starting worker jsonrpc/0 (executor:286) > > 14:19:02 2018-06-27 14:17:54,553 DEBUG (jsonrpc/0) [Executor] > START thread > (func= at 0x7fd74ca00650>>, args=(), kwargs={}) (concurrent:193) > > 14:19:02 2018-06-27 14:17:54,554 DEBUG (jsonrpc/0) [Executor] > Worker started (executor:298) > > 14:19:02 2018-06-27 14:17:54,557 DEBUG (MainThread) [Executor] > Starting worker jsonrpc/1 (executor:286) > > 14:19:02 2018-06-27 14:17:54,558
[ovirt-devel] Re: [VDSM] all test passed, build failed with "tar: journal/b087148aba6d49b9bbef488e52a48752/system.journal: file changed as we read it"
On Thu, Nov 29, 2018 at 11:21 AM Edward Haas wrote: > > > On Thu, Nov 29, 2018 at 10:41 AM Edward Haas wrote: > >> >> >> On Wed, Nov 28, 2018 at 8:12 PM Nir Soffer wrote: >> >>> We have this failure that pops randomly: >>> >>> 1. All tests pass >>> >>> *00:13:13.284* ___ summary >>> *00:13:13.285* tests: commands >>> succeeded*00:13:13.286* storage-py27: commands succeeded*00:13:13.286* >>> storage-py36: commands succeeded*00:13:13.286* lib-py27: commands >>> succeeded*00:13:13.287* lib-py36: commands succeeded*00:13:13.288* >>> network-py27: commands succeeded*00:13:13.290* network-py36: commands >>> succeeded*00:13:13.291* virt-py27: commands succeeded*00:13:13.292* >>> virt-py36: commands succeeded*00:13:13.293* congratulations :) >>> >>> >>> 2. But we fail to collect logs at the end >>> >>> *00:14:35.992* >>> ##*00:14:35.995* ## >>> Wed Nov 28 17:39:50 UTC 2018 Finished env: >>> fc28:fedora-28-x86_64*00:14:35.996* ## took 764 seconds*00:14:35.997* >>> ## rc = 1*00:14:35.997* >>> ##*00:14:36.009* >>> ##! ERROR v*00:14:36.010* ##! Last >>> 20 log entries: >>> /tmp/mock_logs.Lcop4ZOq/script/stdout_stderr.log*00:14:36.011* >>> ##!*00:14:36.012* >>> journal/b087148aba6d49b9bbef488e52a48752/system.journal*00:14:36.013* tar: >>> journal/b087148aba6d49b9bbef488e52a48752/system.journal: file changed as we >>> read it*00:14:36.014* >>> journal/b087148aba6d49b9bbef488e52a48752/user-1000.journal*00:14:36.015* >>> lastlog*00:14:36.015* libvirt/*00:14:36.015* libvirt/lxc/*00:14:36.015* >>> libvirt/libxl/*00:14:36.016* libvirt/qemu/*00:14:36.016* >>> libvirt/qemu/LiveOS-f920001d-be4e-47ea-ac26-72480fd5be87.log*00:14:36.017* >>> libvirt/uml/*00:14:36.017* ovirt-guest-agent/*00:14:36.017* >>> ovirt-guest-agent/ovirt-guest-agent.log*00:14:36.017* README*00:14:36.018* >>> samba/*00:14:36.018* samba/old/*00:14:36.018* sssd/*00:14:36.018* >>> tallylog*00:14:36.018* wtmp*00:14:36.018* Took 678 seconds*00:14:36.018* >>> ===*00:14:36.019* ##!*00:14:36.019* ##! >>> ERROR ^^*00:14:36.019* >>> ##! >>> >>> >>> This looks like an issue with vdsm check-patch.sh: >>> >>> function collect_logs { >>> res=$? >>> [ "$res" -ne 0 ] && echo "*** err: $res" >>> cd /var/log >>> tar -cvzf "$EXPORT_DIR/mock_varlogs.tar.gz" * >>> cd /var/host_log >>> tar -cvzf "$EXPORT_DIR/host_varlogs.tar.gz" * >>> } >>> >>> trap collect_logs EXIT >>> >>> Seems that tar fail to collect log if the log is modified while copied, >>> which makes sense. >>> >>> We can ignore errors in tar, since log collection should not fail the >>> build, but I think >>> a better solution is to avoid collecting any logs since vdsm writes its >>> own logs during >>> tests - all the info must be in vdsm log. >>> >>> Here is the list of collected logs: >>> >>> *00:13:47.280* + tar -cvzf >>> /home/jenkins/workspace/vdsm_master_check-patch-fc28-x86_64/vdsm/exported-artifacts/mock_varlogs.tar.gz >>> btmp dnf.librepo.log dnf.log dnf.rpm.log faillog glusterfs hawkey.log >>> journal lastlog libvirt openvswitch README tallylog vdsm_tests.log wtmp >>> yum.log*00:13:47.285* btmp*00:13:47.285* dnf.librepo.log*00:13:47.299* >>> dnf.log*00:13:47.309* dnf.rpm.log*00:13:47.310* faillog*00:13:47.311* >>> glusterfs/*00:13:47.312* hawkey.log*00:13:47.313* journal/*00:13:47.313* >>> lastlog*00:13:47.315* libvirt/*00:13:47.315* libvirt/qemu/*00:13:47.316* >>> openvswitch/*00:13:47.317* openvswitch/ovs-vswitchd.log*00:13:47.318* >>> openvswitch/ovsdb-server.log*00:13:47.319* README*00:13:47.320* >>> tallylog*00:13:47.321* vdsm_tests.log*00:13:47.342* wtmp*00:13:47.343* >>> yum.log*00:13:47.349* + cd /var/host_log*00:13:47.350* + tar -cvzf >>> /home/jenkins/workspace/vdsm_master_check-patch-fc28-x86_64/vdsm/exported-artifacts/host_varlogs.tar.gz >>> anaconda audit boot.log btmp chrony cloud-init.log cloud-init-output.log >>> cron dnf.librepo.log dnf.log dnf.rpm.log firewalld glusterfs hawkey.log >>> journal lastlog libvirt ovirt-guest-agent README samba sssd tallylog >>> wtmp*00:13:47.356* anaconda/*00:13:47.356* anaconda/ifcfg.log*00:13:47.357* >>> anaconda/ks-script-l5qnynnj.log*00:13:47.358* >>> anaconda/storage.log*00:13:47.359* anaconda/program.log*00:13:47.395* >>> anaconda/ks-script-b5_08tmo.log*00:13:47.396* >>> anaconda/ks-script-6uks8bp3.log*00:13:47.397* >>> anaconda/hawkey.log*00:13:47.398* anaconda/syslog*00:13:47.406* >>> anaconda/journal.log*00:13:47.449* anaconda/dnf.librepo.log*00:13:47.458* >>> anaconda/packaging.log*00:13:47.465* anaconda/dbus.log*00:13:47.466* >>> anaconda/anaconda.log*00:13:47.467* >>>