On Thu, Nov 29, 2018 at 11:21 AM Edward Haas <[email protected]> wrote:
> > > On Thu, Nov 29, 2018 at 10:41 AM Edward Haas <[email protected]> wrote: > >> >> >> On Wed, Nov 28, 2018 at 8:12 PM Nir Soffer <[email protected]> wrote: >> >>> We have this failure that pops randomly: >>> >>> 1. All tests pass >>> >>> *00:13:13.284* ___________________________________ summary >>> ____________________________________*00:13:13.285* tests: commands >>> succeeded*00:13:13.286* storage-py27: commands succeeded*00:13:13.286* >>> storage-py36: commands succeeded*00:13:13.286* lib-py27: commands >>> succeeded*00:13:13.287* lib-py36: commands succeeded*00:13:13.288* >>> network-py27: commands succeeded*00:13:13.290* network-py36: commands >>> succeeded*00:13:13.291* virt-py27: commands succeeded*00:13:13.292* >>> virt-py36: commands succeeded*00:13:13.293* congratulations :) >>> >>> >>> 2. But we fail to collect logs at the end >>> >>> *00:14:35.992* >>> ##########################################################*00:14:35.995* ## >>> Wed Nov 28 17:39:50 UTC 2018 Finished env: >>> fc28:fedora-28-x86_64*00:14:35.996* ## took 764 seconds*00:14:35.997* >>> ## rc = 1*00:14:35.997* >>> ##########################################################*00:14:36.009* >>> ##! ERROR vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv*00:14:36.010* ##! Last >>> 20 log entries: >>> /tmp/mock_logs.Lcop4ZOq/script/stdout_stderr.log*00:14:36.011* >>> ##!*00:14:36.012* >>> journal/b087148aba6d49b9bbef488e52a48752/system.journal*00:14:36.013* tar: >>> journal/b087148aba6d49b9bbef488e52a48752/system.journal: file changed as we >>> read it*00:14:36.014* >>> journal/b087148aba6d49b9bbef488e52a48752/user-1000.journal*00:14:36.015* >>> lastlog*00:14:36.015* libvirt/*00:14:36.015* libvirt/lxc/*00:14:36.015* >>> libvirt/libxl/*00:14:36.016* libvirt/qemu/*00:14:36.016* >>> libvirt/qemu/LiveOS-f920001d-be4e-47ea-ac26-72480fd5be87.log*00:14:36.017* >>> libvirt/uml/*00:14:36.017* ovirt-guest-agent/*00:14:36.017* >>> ovirt-guest-agent/ovirt-guest-agent.log*00:14:36.017* README*00:14:36.018* >>> samba/*00:14:36.018* samba/old/*00:14:36.018* sssd/*00:14:36.018* >>> tallylog*00:14:36.018* wtmp*00:14:36.018* Took 678 seconds*00:14:36.018* >>> ===================================*00:14:36.019* ##!*00:14:36.019* ##! >>> ERROR ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^*00:14:36.019* >>> ##!######################################################## >>> >>> >>> This looks like an issue with vdsm check-patch.sh: >>> >>> function collect_logs { >>> res=$? >>> [ "$res" -ne 0 ] && echo "*** err: $res" >>> cd /var/log >>> tar -cvzf "$EXPORT_DIR/mock_varlogs.tar.gz" * >>> cd /var/host_log >>> tar -cvzf "$EXPORT_DIR/host_varlogs.tar.gz" * >>> } >>> >>> trap collect_logs EXIT >>> >>> Seems that tar fail to collect log if the log is modified while copied, >>> which makes sense. >>> >>> We can ignore errors in tar, since log collection should not fail the >>> build, but I think >>> a better solution is to avoid collecting any logs since vdsm writes its >>> own logs during >>> tests - all the info must be in vdsm log. >>> >>> Here is the list of collected logs: >>> >>> *00:13:47.280* + tar -cvzf >>> /home/jenkins/workspace/vdsm_master_check-patch-fc28-x86_64/vdsm/exported-artifacts/mock_varlogs.tar.gz >>> btmp dnf.librepo.log dnf.log dnf.rpm.log faillog glusterfs hawkey.log >>> journal lastlog libvirt openvswitch README tallylog vdsm_tests.log wtmp >>> yum.log*00:13:47.285* btmp*00:13:47.285* dnf.librepo.log*00:13:47.299* >>> dnf.log*00:13:47.309* dnf.rpm.log*00:13:47.310* faillog*00:13:47.311* >>> glusterfs/*00:13:47.312* hawkey.log*00:13:47.313* journal/*00:13:47.313* >>> lastlog*00:13:47.315* libvirt/*00:13:47.315* libvirt/qemu/*00:13:47.316* >>> openvswitch/*00:13:47.317* openvswitch/ovs-vswitchd.log*00:13:47.318* >>> openvswitch/ovsdb-server.log*00:13:47.319* README*00:13:47.320* >>> tallylog*00:13:47.321* vdsm_tests.log*00:13:47.342* wtmp*00:13:47.343* >>> yum.log*00:13:47.349* + cd /var/host_log*00:13:47.350* + tar -cvzf >>> /home/jenkins/workspace/vdsm_master_check-patch-fc28-x86_64/vdsm/exported-artifacts/host_varlogs.tar.gz >>> anaconda audit boot.log btmp chrony cloud-init.log cloud-init-output.log >>> cron dnf.librepo.log dnf.log dnf.rpm.log firewalld glusterfs hawkey.log >>> journal lastlog libvirt ovirt-guest-agent README samba sssd tallylog >>> wtmp*00:13:47.356* anaconda/*00:13:47.356* anaconda/ifcfg.log*00:13:47.357* >>> anaconda/ks-script-l5qnynnj.log*00:13:47.358* >>> anaconda/storage.log*00:13:47.359* anaconda/program.log*00:13:47.395* >>> anaconda/ks-script-b5_08tmo.log*00:13:47.396* >>> anaconda/ks-script-6uks8bp3.log*00:13:47.397* >>> anaconda/hawkey.log*00:13:47.398* anaconda/syslog*00:13:47.406* >>> anaconda/journal.log*00:13:47.449* anaconda/dnf.librepo.log*00:13:47.458* >>> anaconda/packaging.log*00:13:47.465* anaconda/dbus.log*00:13:47.466* >>> anaconda/anaconda.log*00:13:47.467* >>> anaconda/ks-script-slrcz39_.log*00:13:47.503* audit/*00:13:47.504* >>> audit/audit.log.3*00:13:47.657* audit/audit.log.2*00:13:47.814* >>> audit/audit.log.1*00:13:47.981* audit/audit.log*00:13:48.008* >>> audit/audit.log.4*00:13:48.155* boot.log*00:13:48.156* btmp*00:13:48.157* >>> chrony/*00:13:48.159* cloud-init.log*00:13:48.159* >>> cloud-init-output.log*00:13:48.161* cron*00:13:48.162* >>> dnf.librepo.log*00:13:49.930* dnf.log*00:13:51.335* >>> dnf.rpm.log*00:13:51.421* firewalld*00:13:51.423* glusterfs/*00:13:51.424* >>> hawkey.log*00:13:51.704* journal/*00:13:51.708* >>> journal/b087148aba6d49b9bbef488e52a48752/*00:13:51.709* >>> journal/b087148aba6d49b9bbef488e52a48752/system.journal*00:13:55.817* tar: >>> journal/b087148aba6d49b9bbef488e52a48752/system.journal: file changed as we >>> read it*00:13:55.819* >>> journal/b087148aba6d49b9bbef488e52a48752/user-1000.journal*00:13:55.915* >>> lastlog*00:13:55.923* libvirt/*00:13:55.924* libvirt/lxc/*00:13:55.926* >>> libvirt/libxl/*00:13:55.927* libvirt/qemu/*00:13:55.928* >>> libvirt/qemu/LiveOS-f920001d-be4e-47ea-ac26-72480fd5be87.log*00:13:55.929* >>> libvirt/uml/*00:13:55.930* ovirt-guest-agent/*00:13:55.930* >>> ovirt-guest-agent/ovirt-guest-agent.log*00:13:55.932* README*00:13:55.933* >>> samba/*00:13:55.933* samba/old/*00:13:55.935* sssd/*00:13:55.935* >>> tallylog*00:13:55.935* wtmp >>> >>> >>> Most if not all are lot relevant to vdsm tests, and should not be >>> collected. >>> >>> This was added in: >>> >>> commit 9c9c17297433e5a5a49aa19cde10b206e7db61e9 >>> Author: Edward Haas <[email protected]> >>> Date: Tue Apr 17 10:53:11 2018 +0300 >>> >>> automation: Collect logs even when check-patch fails >>> >>> Change-Id: Idfe07ce6fc55473b1db1d7f16754f559cc5c345a >>> Signed-off-by: Edward Haas <[email protected]> >>> >>> Reviewed in: >>> https://gerrit.ovirt.org/c/90370 >>> >>> Edward, can you explain why do we need to collect logs during >>> check-patch, >>> and why do we need to collect all the logs in the system? >>> >> >> check-patch are running unit and integrations tests. >> The integration tests are touching the OS and other packages (like >> openvswitch). >> It was added so we can debug why tests failed. >> >> I guess we can now separate the unit and integration tests, but it will >> not solve >> the problem presented here. >> Failing to collect the logs silently sounds a good enough solution to me. >> > > Barak suggested to just exclude the journal: > https://gerrit.ovirt.org/#/c/95850/ > This is fixed now, thanks! Any reason why we exclude journal only for /var/log, and do collect the binary journal from /var/host_log? I guess it can fail in the same way. If we collect logs for integration tests, the most important log is the journal, and now we skip it. We can get the journal in a reliable way like this: journalctl --since build-date > /tmp/journal.log Nir >
_______________________________________________ Devel mailing list -- [email protected] To unsubscribe send an email to [email protected] Privacy Statement: https://www.ovirt.org/site/privacy-policy/ oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/AXWN4QVWNAHGUN2XCFH5YSCFW2OBJWH3/
