On Sat, Dec 1, 2018 at 11:10 PM Nir Soffer <[email protected]> wrote:

> On Thu, Nov 29, 2018 at 11:21 AM Edward Haas <[email protected]> wrote:
>
>>
>>
>> On Thu, Nov 29, 2018 at 10:41 AM Edward Haas <[email protected]> wrote:
>>
>>>
>>>
>>> On Wed, Nov 28, 2018 at 8:12 PM Nir Soffer <[email protected]> wrote:
>>>
>>>> We have this failure that pops randomly:
>>>>
>>>> 1. All tests pass
>>>>
>>>> 00:13:13.284 ___________________________________ summary ____________________________________
>>>> 00:13:13.285   tests: commands succeeded
>>>> 00:13:13.286   storage-py27: commands succeeded
>>>> 00:13:13.286   storage-py36: commands succeeded
>>>> 00:13:13.286   lib-py27: commands succeeded
>>>> 00:13:13.287   lib-py36: commands succeeded
>>>> 00:13:13.288   network-py27: commands succeeded
>>>> 00:13:13.290   network-py36: commands succeeded
>>>> 00:13:13.291   virt-py27: commands succeeded
>>>> 00:13:13.292   virt-py36: commands succeeded
>>>> 00:13:13.293   congratulations :)
>>>>
>>>>
>>>> 2. But we fail to collect logs at the end
>>>>
>>>> 00:14:35.992 ##########################################################
>>>> 00:14:35.995 ## Wed Nov 28 17:39:50 UTC 2018 Finished env: fc28:fedora-28-x86_64
>>>> 00:14:35.996 ##      took 764 seconds
>>>> 00:14:35.997 ##      rc = 1
>>>> 00:14:35.997 ##########################################################
>>>> 00:14:36.009 ##! ERROR vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
>>>> 00:14:36.010 ##! Last 20 log entries: /tmp/mock_logs.Lcop4ZOq/script/stdout_stderr.log
>>>> 00:14:36.011 ##!
>>>> 00:14:36.012 journal/b087148aba6d49b9bbef488e52a48752/system.journal
>>>> 00:14:36.013 tar: journal/b087148aba6d49b9bbef488e52a48752/system.journal: file changed as we read it
>>>> 00:14:36.014 journal/b087148aba6d49b9bbef488e52a48752/user-1000.journal
>>>> 00:14:36.015 lastlog
>>>> 00:14:36.015 libvirt/
>>>> 00:14:36.015 libvirt/lxc/
>>>> 00:14:36.015 libvirt/libxl/
>>>> 00:14:36.016 libvirt/qemu/
>>>> 00:14:36.016 libvirt/qemu/LiveOS-f920001d-be4e-47ea-ac26-72480fd5be87.log
>>>> 00:14:36.017 libvirt/uml/
>>>> 00:14:36.017 ovirt-guest-agent/
>>>> 00:14:36.017 ovirt-guest-agent/ovirt-guest-agent.log
>>>> 00:14:36.017 README
>>>> 00:14:36.018 samba/
>>>> 00:14:36.018 samba/old/
>>>> 00:14:36.018 sssd/
>>>> 00:14:36.018 tallylog
>>>> 00:14:36.018 wtmp
>>>> 00:14:36.018 Took 678 seconds
>>>> 00:14:36.018 ===================================
>>>> 00:14:36.019 ##!
>>>> 00:14:36.019 ##! ERROR ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> 00:14:36.019 ##!########################################################
>>>>
>>>>
>>>> This looks like an issue with vdsm check-patch.sh:
>>>>
>>>> function collect_logs {
>>>>     res=$?
>>>>     [ "$res" -ne 0 ] && echo "*** err: $res"
>>>>     cd /var/log
>>>>     tar -cvzf "$EXPORT_DIR/mock_varlogs.tar.gz" *
>>>>     cd /var/host_log
>>>>     tar -cvzf "$EXPORT_DIR/host_varlogs.tar.gz" *
>>>> }
>>>>
>>>> trap collect_logs EXIT
>>>>
>>>> It seems that tar fails to collect a log if the log is modified while it
>>>> is being read, which makes sense.
>>>>
>>>> We can ignore errors in tar, since log collection should not fail the
>>>> build, but I think a better solution is to avoid collecting any logs,
>>>> since vdsm writes its own logs during tests - all the info must be in
>>>> the vdsm log.
>>>>
>>>> Here is the list of collected logs:
>>>>
>>>> 00:13:47.280 + tar -cvzf /home/jenkins/workspace/vdsm_master_check-patch-fc28-x86_64/vdsm/exported-artifacts/mock_varlogs.tar.gz btmp dnf.librepo.log dnf.log dnf.rpm.log faillog glusterfs hawkey.log journal lastlog libvirt openvswitch README tallylog vdsm_tests.log wtmp yum.log
>>>> 00:13:47.285 btmp
>>>> 00:13:47.285 dnf.librepo.log
>>>> 00:13:47.299 dnf.log
>>>> 00:13:47.309 dnf.rpm.log
>>>> 00:13:47.310 faillog
>>>> 00:13:47.311 glusterfs/
>>>> 00:13:47.312 hawkey.log
>>>> 00:13:47.313 journal/
>>>> 00:13:47.313 lastlog
>>>> 00:13:47.315 libvirt/
>>>> 00:13:47.315 libvirt/qemu/
>>>> 00:13:47.316 openvswitch/
>>>> 00:13:47.317 openvswitch/ovs-vswitchd.log
>>>> 00:13:47.318 openvswitch/ovsdb-server.log
>>>> 00:13:47.319 README
>>>> 00:13:47.320 tallylog
>>>> 00:13:47.321 vdsm_tests.log
>>>> 00:13:47.342 wtmp
>>>> 00:13:47.343 yum.log
>>>> 00:13:47.349 + cd /var/host_log
>>>> 00:13:47.350 + tar -cvzf /home/jenkins/workspace/vdsm_master_check-patch-fc28-x86_64/vdsm/exported-artifacts/host_varlogs.tar.gz anaconda audit boot.log btmp chrony cloud-init.log cloud-init-output.log cron dnf.librepo.log dnf.log dnf.rpm.log firewalld glusterfs hawkey.log journal lastlog libvirt ovirt-guest-agent README samba sssd tallylog wtmp
>>>> 00:13:47.356 anaconda/
>>>> 00:13:47.356 anaconda/ifcfg.log
>>>> 00:13:47.357 anaconda/ks-script-l5qnynnj.log
>>>> 00:13:47.358 anaconda/storage.log
>>>> 00:13:47.359 anaconda/program.log
>>>> 00:13:47.395 anaconda/ks-script-b5_08tmo.log
>>>> 00:13:47.396 anaconda/ks-script-6uks8bp3.log
>>>> 00:13:47.397 anaconda/hawkey.log
>>>> 00:13:47.398 anaconda/syslog
>>>> 00:13:47.406 anaconda/journal.log
>>>> 00:13:47.449 anaconda/dnf.librepo.log
>>>> 00:13:47.458 anaconda/packaging.log
>>>> 00:13:47.465 anaconda/dbus.log
>>>> 00:13:47.466 anaconda/anaconda.log
>>>> 00:13:47.467 anaconda/ks-script-slrcz39_.log
>>>> 00:13:47.503 audit/
>>>> 00:13:47.504 audit/audit.log.3
>>>> 00:13:47.657 audit/audit.log.2
>>>> 00:13:47.814 audit/audit.log.1
>>>> 00:13:47.981 audit/audit.log
>>>> 00:13:48.008 audit/audit.log.4
>>>> 00:13:48.155 boot.log
>>>> 00:13:48.156 btmp
>>>> 00:13:48.157 chrony/
>>>> 00:13:48.159 cloud-init.log
>>>> 00:13:48.159 cloud-init-output.log
>>>> 00:13:48.161 cron
>>>> 00:13:48.162 dnf.librepo.log
>>>> 00:13:49.930 dnf.log
>>>> 00:13:51.335 dnf.rpm.log
>>>> 00:13:51.421 firewalld
>>>> 00:13:51.423 glusterfs/
>>>> 00:13:51.424 hawkey.log
>>>> 00:13:51.704 journal/
>>>> 00:13:51.708 journal/b087148aba6d49b9bbef488e52a48752/
>>>> 00:13:51.709 journal/b087148aba6d49b9bbef488e52a48752/system.journal
>>>> 00:13:55.817 tar: journal/b087148aba6d49b9bbef488e52a48752/system.journal: file changed as we read it
>>>> 00:13:55.819 journal/b087148aba6d49b9bbef488e52a48752/user-1000.journal
>>>> 00:13:55.915 lastlog
>>>> 00:13:55.923 libvirt/
>>>> 00:13:55.924 libvirt/lxc/
>>>> 00:13:55.926 libvirt/libxl/
>>>> 00:13:55.927 libvirt/qemu/
>>>> 00:13:55.928 libvirt/qemu/LiveOS-f920001d-be4e-47ea-ac26-72480fd5be87.log
>>>> 00:13:55.929 libvirt/uml/
>>>> 00:13:55.930 ovirt-guest-agent/
>>>> 00:13:55.930 ovirt-guest-agent/ovirt-guest-agent.log
>>>> 00:13:55.932 README
>>>> 00:13:55.933 samba/
>>>> 00:13:55.933 samba/old/
>>>> 00:13:55.935 sssd/
>>>> 00:13:55.935 tallylog
>>>> 00:13:55.935 wtmp
>>>>
>>>>
>>>> Most if not all of these are not relevant to vdsm tests, and should not
>>>> be collected.
>>>>
>>>> This was added in:
>>>>
>>>> commit 9c9c17297433e5a5a49aa19cde10b206e7db61e9
>>>> Author: Edward Haas <[email protected]>
>>>> Date:   Tue Apr 17 10:53:11 2018 +0300
>>>>
>>>>     automation: Collect logs even when check-patch fails
>>>>
>>>>     Change-Id: Idfe07ce6fc55473b1db1d7f16754f559cc5c345a
>>>>     Signed-off-by: Edward Haas <[email protected]>
>>>>
>>>> Reviewed in:
>>>> https://gerrit.ovirt.org/c/90370
>>>>
>>>> Edward, can you explain why we need to collect logs during check-patch,
>>>> and why we need to collect all the logs in the system?
>>>>
>>>
>>> check-patch runs unit and integration tests.
>>> The integration tests touch the OS and other packages (like
>>> openvswitch).
>>> Log collection was added so we can debug why tests failed.
>>>
>>> I guess we can now separate the unit and integration tests, but that
>>> will not solve the problem presented here.
>>> Silently ignoring a failure to collect the logs sounds like a good
>>> enough solution to me.
>>>
>>
>> Barak suggested just excluding the journal:
>> https://gerrit.ovirt.org/#/c/95850/
>>
>
> This is fixed now, thanks!
>
> Any reason why we exclude the journal only for /var/log, but still collect
> the binary journal from /var/host_log? I guess it can fail in the same way.
>

As far as I know, /var/host_log is collected by the CI, which places a copy
of the host logs there.
The CI does not collect everything, therefore we added the second one.
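
If tar on /var/host_log can fail in the same way, one option raised earlier
in this thread is to ignore tar errors, since log collection should not fail
the build. A minimal sketch, assuming GNU tar (which exits with status 1 when
a file changed while being read, and 2 on fatal errors); archive_logs is a
hypothetical helper, not something that exists in check-patch.sh:

```shell
# Hypothetical helper: archive everything under $1 into the tarball $2,
# treating "file changed as we read it" as a warning instead of a failure.
archive_logs() {
    src_dir=$1
    archive=$2
    rc=0
    # Assumption: GNU tar. Exit status 1 means some files changed while
    # being archived; status 2 means a fatal error.
    tar -czf "$archive" -C "$src_dir" . || rc=$?
    if [ "$rc" -eq 1 ]; then
        echo "warning: some files changed while being archived" >&2
        return 0
    fi
    return "$rc"
}
```

Inside collect_logs this could be called as
archive_logs /var/host_log "$EXPORT_DIR/host_varlogs.tar.gz", so a journal
that is still being written does not turn a passing build into rc = 1.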


> If we collect logs for integration tests, the most important log is the
> journal, and now we skip it.
>

I think it appears in /var/host_log.


>
> We can get the journal in a reliable way like this:
>
>     journalctl --since build-date > /tmp/journal.log
>
> Nir
>
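
A rough sketch of the journalctl idea above, assuming the build start time is
recorded when the script begins. BUILD_START and dump_journal are hypothetical
names, and the output location is only an example:

```shell
# Hypothetical: remember when the build started, in a timestamp format
# that journalctl --since accepts (YYYY-MM-DD HH:MM:SS).
BUILD_START=$(date '+%Y-%m-%d %H:%M:%S')

# Hypothetical helper: write journal entries logged since $1 to the text
# file $2. A failure is reported but does not fail the build.
dump_journal() {
    since=$1
    out=$2
    # --no-pager keeps journalctl non-interactive when run from a script.
    journalctl --no-pager --since "$since" > "$out" \
        || echo "warning: could not dump the journal" >&2
}
```

From collect_logs this could be used as
dump_journal "$BUILD_START" "$EXPORT_DIR/journal.log", which exports the
journal as text instead of reading the binary journal file while it is still
being written.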
>>
_______________________________________________
Devel mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/[email protected]/message/OXVMG2FCQFB7ENGISOPNY23WJMI4FTMI/
