On 11/22/19 4:54 PM, Martin Perina wrote:


On Fri, Nov 22, 2019 at 4:43 PM Dominik Holler <[email protected] <mailto:[email protected]>> wrote:


    On Fri, Nov 22, 2019 at 12:17 PM Dominik Holler
    <[email protected] <mailto:[email protected]>> wrote:



        On Fri, Nov 22, 2019 at 12:00 PM Miguel Duarte de Mora Barroso
        <[email protected] <mailto:[email protected]>> wrote:

            On Fri, Nov 22, 2019 at 11:54 AM Vojtech Juranek
            <[email protected] <mailto:[email protected]>> wrote:
            >
            > On Friday, November 22, 2019 9:56:56 CET Miguel Duarte de Mora Barroso wrote:
            > > On Fri, Nov 22, 2019 at 9:49 AM Vojtech Juranek <[email protected] <mailto:[email protected]>> wrote:
            > > >
            > > > On Friday, November 22, 2019 9:41:26 CET Dominik Holler wrote:
            > > >
            > > > > On Fri, Nov 22, 2019 at 8:40 AM Dominik Holler <[email protected] <mailto:[email protected]>> wrote:
            > > > >
            > > > > > On Thu, Nov 21, 2019 at 10:54 PM Nir Soffer <[email protected] <mailto:[email protected]>> wrote:
            > > > > >
            > > > > >> On Thu, Nov 21, 2019 at 11:24 PM Vojtech Juranek <[email protected] <mailto:[email protected]>> wrote:
            > > > > >>
            > > > > >> > Hi,
            > > > > >> > OST fails (see e.g. [1]) in 002_bootstrap.check_update_host. It fails with
            > > > > >> >
            > > > > >> >  FAILED! => {"changed": false, "failures": [], "msg": "Depsolve Error occured: \n Problem 1: cannot install the best update candidate for package vdsm-network-4.40.0-1236.git63ea8cb8b.el8.x86_64\n  - nothing provides nmstate needed by vdsm-network-4.40.0-1271.git524e08c8a.el8.x86_64\n Problem 2: package vdsm-python-4.40.0-1271.git524e08c8a.el8.noarch requires vdsm-network = 4.40.0-1271.git524e08c8a.el8, but none of the providers can be installed\n  - cannot install the best update candidate for package vdsm-python-4.40.0-1236.git63ea8cb8b.el8.noarch\n  - nothing provides nmstate needed by vdsm-network-4.40.0-1271.git524e08c8a.el8.x86_64\n
            > > > > >>
            > > > > >> nmstate should be provided by copr repo enabled by
            > > > > >> ovirt-release-master.
            > > > > >
            > > > > >
            > > > > >
            > > > > > I re-triggered as
            > > > > > https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6131
            > > > > > maybe
            > > > > > https://gerrit.ovirt.org/#/c/104825/
            > > > > > was missing
            > > > >
            > > > >
            > > > >
            > > > > Looks like
            > > > > https://gerrit.ovirt.org/#/c/104825/ is ignored by OST.
            > > >
            > > >
            > > >
            > > > Maybe not. You re-triggered with [1], which indeed lacked this patch.
            > > > I did a rebase and am now running with this patch in build #6132 [2].
            > > > Let's wait for it to see if gerrit #104825 helps.
            > > >
            > > > [1] https://jenkins.ovirt.org/job/standard-manual-runner/909/
            > > > [2] https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6132/
            > > >
            > > >
            > > >
            > > > > Miguel, do you think merging
            > > > >
            > > > > https://gerrit.ovirt.org/#/c/104495/15/common/yum-repos/ovirt-master-host-cq.repo.in
            > > > >
            > > > > would solve this?
            > >
            > >
            > > I've split the patch Dominik mentions above in two, one of them adding
            > > the nmstate / networkmanager copr repos - [3].
            > >
            > > Let's see if it fixes it.
            >
            > It fixes the original issue, but OST still fails in
            > 098_ovirt_provider_ovn.use_ovn_provider:
            >
            > https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6134

            I think Dominik was looking into this issue; +Dominik
            Holler please confirm.

            Let me know if you need any help Dominik.



        Thanks.
        The problem is that the hosts lost connection to storage:
        https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6134/artifact/exported-artifacts/test_logs/basic-suite-master/post-098_ovirt_provider_ovn.py/lago-basic-suite-master-host-0/_var_log/vdsm/vdsm.log

        2019-11-22 05:39:12,326-0500 DEBUG (jsonrpc/5) [common.commands] /usr/bin/taskset --cpu-list 0-1 /usr/bin/sudo -n /sbin/lvm vgs --config 'devices {  preferred_names=["^/dev/mapper/"]  ignore_suspended_devices=1  write_cache_state=0  disable_after_error_count=3  filter=["a|^/dev/mapper/36001405107ea8b4e3ac4ddeb3e19890f$|^/dev/mapper/360014054924c91df75e41178e4b8a80c$|^/dev/mapper/3600140561c0d02829924b77ab7323f17$|^/dev/mapper/3600140582feebc04ca5409a99660dbbc$|^/dev/mapper/36001405c3c53755c13c474dada6be354$|", "r|.*|"] } global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1  use_lvmetad=0 } backup {  retain_min=50  retain_days=0 }' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name (cwd None) (commands:153)
        2019-11-22 05:39:12,415-0500 ERROR (check/loop) [storage.Monitor] Error checking path /rhev/data-center/mnt/192.168.200.4:_exports_nfs_share2/d10879c6-8de1-40ba-87fa-f447844eed2a/dom_md/metadata (monitor:501)
        Traceback (most recent call last):
          File "/usr/lib/python3.6/site-packages/vdsm/storage/monitor.py", line 499, in _pathChecked
            delay = result.delay()
          File "/usr/lib/python3.6/site-packages/vdsm/storage/check.py", line 391, in delay
            raise exception.MiscFileReadException(self.path, self.rc, self.err)
        vdsm.storage.exception.MiscFileReadException: Internal file read failure: ('/rhev/data-center/mnt/192.168.200.4:_exports_nfs_share2/d10879c6-8de1-40ba-87fa-f447844eed2a/dom_md/metadata', 1, 'Read timeout')
        2019-11-22 05:39:12,416-0500 INFO  (check/loop) [storage.Monitor] Domain d10879c6-8de1-40ba-87fa-f447844eed2a became INVALID (monitor:472)


        I failed to reproduce this locally to analyze it. I will try again;
        any hints are welcome.



    https://gerrit.ovirt.org/#/c/104925/1/ shows that
    008_basic_ui_sanity.py triggers the problem.
    Is anyone with knowledge of basic_ui_sanity around?

Why do you think it's related? Because OST finished successfully once the UI sanity tests were commented out?

Looking at run 6134, which you were discussing:

- timing of the UI sanity set-up [1]:

11:40:20 @ Run test: 008_basic_ui_sanity.py:

- timing of the first encountered storage error [2]:

2019-11-22 05:39:12,415-0500 ERROR (check/loop) [storage.Monitor] Error checking path /rhev/data-center/mnt/192.168.200.4:_exports_nfs_share2/d10879c6-8de1-40ba-87fa-f447844eed2a/dom_md/metadata (monitor:501)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/monitor.py", line 499, in _pathChecked
    delay = result.delay()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/check.py", line 391, in delay
    raise exception.MiscFileReadException(self.path, self.rc, self.err)
vdsm.storage.exception.MiscFileReadException: Internal file read failure: ('/rhev/data-center/mnt/192.168.200.4:_exports_nfs_share2/d10879c6-8de1-40ba-87fa-f447844eed2a/dom_md/metadata', 1, 'Read timeout')

Timezone difference aside, it seems to me that these storage errors occurred before anything UI-related was done. I remember talking with Steven Rosenberg on IRC a couple of days ago about some storage metadata issues, and he said he got a response from Nir that "it's a known issue".
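
To take the timezone difference out of the picture, both timestamps can be normalized to UTC before comparing them. The snippet below is only a sketch: the vdsm.log timestamp carries its own offset (-0500), but the Jenkins console line does not, so the console's UTC offset and its date are assumptions passed in explicitly, and the helper names (vdsm_to_utc, console_to_utc, error_precedes) are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

VDSM_FORMAT = "%Y-%m-%d %H:%M:%S,%f%z"  # e.g. 2019-11-22 05:39:12,415-0500
CONSOLE_FORMAT = "%Y-%m-%d %H:%M:%S"    # console shows only HH:MM:SS; the date is assumed


def vdsm_to_utc(stamp):
    """Parse a vdsm.log timestamp (which includes its UTC offset) into UTC."""
    return datetime.strptime(stamp, VDSM_FORMAT).astimezone(timezone.utc)


def console_to_utc(stamp, utc_offset_hours=0):
    """Interpret a naive console timestamp in the ASSUMED console timezone."""
    assumed_zone = timezone(timedelta(hours=utc_offset_hours))
    naive = datetime.strptime(stamp, CONSOLE_FORMAT)
    return naive.replace(tzinfo=assumed_zone).astimezone(timezone.utc)


def error_precedes(vdsm_stamp, console_stamp, console_utc_offset_hours=0):
    """Return True if the vdsm.log event happened before the console event."""
    console_utc = console_to_utc(console_stamp, console_utc_offset_hours)
    return vdsm_to_utc(vdsm_stamp) < console_utc


if __name__ == "__main__":
    first_storage_error = "2019-11-22 05:39:12,415-0500"  # from vdsm.log [2]
    ui_sanity_start = "2019-11-22 11:40:20"               # from console [1], date assumed
    print(vdsm_to_utc(first_storage_error).isoformat())
    print(error_precedes(first_storage_error, ui_sanity_start))
```

If the console is assumed to log in UTC, the storage error lands about an hour before the UI sanity test started. The result depends on that assumed offset, so it is worth checking which timezone the Jenkins console actually uses.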

Nir, Amit, can you comment on this?

[1] https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6134/console
[2] https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6134/artifact/exported-artifacts/test_logs/basic-suite-master/post-098_ovirt_provider_ovn.py/lago-basic-suite-master-host-0/_var_log/vdsm/vdsm.log



Marcin, could you please take a look?


            >
            > > [3] - https://gerrit.ovirt.org/#/c/104897/
            > >
            > >
            > > > >
            > > > >
            > > > > >> Who installs this rpm in OST?
            > > > > >
            > > > > >
            > > > > >
            > > > > > I do not understand the question.
            > > > > >
            > > > > >
            > > > > >
            > > > > >> > [...]
            > > > > >> >
            > > > > >> >
            > > > > >> >
            > > > > >> > See [2] for full error.
            > > > > >> >
            > > > > >> >
            > > > > >> >
            > > > > >> > Can someone please take a look?
            > > > > >> > Thanks
            > > > > >> > Vojta
            > > > > >> >
            > > > > >> >
            > > > > >> >
            > > > > >> > [1] https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6128/
            > > > > >> > [2] https://jenkins.ovirt.org/job/ovirt-system-tests_manual/6128/artifact/exported-artifacts/test_logs/basic-suite-master/post-002_bootstrap.py/lago-basic-suite-master-engine/_var_log/ovirt-engine/engine.log
            > > >
            > > >
            > >
            >



--
Martin Perina
Manager, Software Engineering
Red Hat Czech s.r.o.

_______________________________________________
Devel mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/[email protected]/message/MDUKRDMXJ3UWKRLX5Y7TYIPQKGC43TDU/
