Greg,

that's a great finding and a very good starting point.

If we want to stick with docker images and Firefox/Chrome testing, I still have some ideas that would shorten the running time even more:

 * we do something like this:

    log("waiting %s sec for grid to initialize..." % GRID_STARTUP_DELAY)
    time.sleep(GRID_STARTUP_DELAY)

  this is very inefficient. We can change that to something like I wrote here (_wait_for_selenium_hub):

https://gerrit.ovirt.org/#/c/98135/2/common/test-scenarios-files/selenium/navigation/selenium_on_engine.py

  This function probably needs some improvement (e.g. urllib3 spits out warnings on an unsuccessful connection attempt, so they would need to be silenced), but that's a far better approach than a simple sleep.

 * parallelize running Firefox and Chrome tests - there's no reason not
   to run them both at the same time. There's something called
   VectorThread in lago.utils. A simple example of usage can be found
   in '004_basic_sanity.py:955' (disk_operations function). This would
   have a nice side effect of getting rid of the ugly global
   ovirt_driver - each thread would have its own.

 * maybe not a running-time improvement, but I think
   https://gerrit.ovirt.org/#/c/98127/ is still relevant - the way we
   call save_screenshot is ugly and much too verbose
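
A rough sketch of the first two ideas, to make them concrete. This is not the _wait_for_selenium_hub function from the Gerrit change above; it only illustrates polling instead of a fixed sleep, and it uses the standard library's urllib rather than urllib3. Likewise, concurrent.futures stands in for lago.utils.VectorThread here so that the example runs on its own. The names hub_is_up, wait_for_hub and run_in_parallel are made up for illustration.

```python
import http.client
import time
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def hub_is_up(url, timeout=2.0):
    """Return True once an HTTP request to `url` gets any response."""
    try:
        urllib.request.urlopen(url, timeout=timeout).close()
        return True
    except urllib.error.HTTPError:
        # The server answered, just with an error status.
        return True
    except (OSError, http.client.HTTPException):
        # Connection refused, timed out, or garbled reply: not ready yet.
        return False


def wait_for_hub(url, timeout=120.0, interval=1.0, probe=hub_is_up):
    """Poll `probe(url)` until it succeeds; raise after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while not probe(url):
        if time.monotonic() >= deadline:
            raise RuntimeError("hub at %s not ready after %s sec" % (url, timeout))
        time.sleep(interval)


def run_in_parallel(tasks):
    """Run zero-argument callables concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        futures = [pool.submit(task) for task in tasks]
        # result() re-raises any exception from the worker thread.
        return [future.result() for future in futures]
```

With something like this, the test would return as soon as the hub answers instead of always waiting the full GRID_STARTUP_DELAY, and the Chrome and Firefox runs could be passed to run_in_parallel as two callables, each creating its own driver.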

Right now, I have to switch my focus to some important stuff in VDSM - the OST patches were a continuation of a hackathon effort and something like a "side-project" ;) Still, I don't want the thread to die. I think there's a lot of room for improvement. I can rebase/improve some of my patches if you find them useful. Please keep me posted with your efforts!

Regards, Marcin

On 3/7/19 11:10 PM, Greg Sheremeta wrote:
Marcin,

It just dawned on me that the main reason 008's start_grid takes so long is that the docker images are fresh pulled every time. Several hundred MB, every time (ugh, sorry). We can and should cache them. What do you think about trying this before doing anything else? [it would also be a good time to update from actinium to the latest, iron.]

@Barak Korren you once mentioned to me we should cache these if they are ok to cache (they are). How do we do that?

docker.io/selenium/node-chrome-debug   3.9.1-actinium  327adc897d23  13 months ago  *904 MB*
docker.io/selenium/node-firefox-debug  3.9.1-actinium  88649b420bd5  13 months ago  *814 MB*

Greg


On Tue, Mar 5, 2019 at 6:15 AM Greg Sheremeta <[email protected]> wrote:


    On Tue, Mar 5, 2019 at 4:55 AM Marcin Sobczyk <[email protected]> wrote:

        Hi,

        On 3/4/19 7:07 PM, Greg Sheremeta wrote:
        Hi,

        Thanks for trying to improve the tests!

        I'm reluctant to give up Firefox sanity tests on every
        commit, though. In fact, I wanted to add Edge and Safari,
        because those are also supported browsers. Just today a
        Firefox-only issue was reported, so they are valuable.

        Was the Firefox-only issue detected by basic suite or some
        other tests?

    It was reported by a developer. Because GWT compiles permutations
    per browser, and each browser therefore loads completely separate
    JavaScript payloads, it's just too easy for it to break in one
    browser and be fine in the other, so I'm really not ok to remove
    Firefox.

    If Admin Portal was React where there is a single JavaScript
    payload that's shared among all browsers, then I'd consider it.


        Did you consider either leaving a grid up permanently or
        perhaps using a third party like saucelabs?
        I did consider simply having our own grid for the OST.
        There's even a thread somewhere on ovirt-devel, where someone
        found OST trying to connect to one of my VMs in Tel Aviv,
        where my own grid was running :D
        I couldn't make a public demo though - OST executors couldn't
        see my VM in tlv.

        This approach has 2 big flaws:

          * it requires quite a lot of resources for the grid to
            always be there for us

    What about Saucelabs or another third party free tool?

          * it makes OST running times somewhat nondeterministic -
            situations where WebDriver has to wait for Selenium
            hub/nodes to be free will probably occur

        The way I see the basic suite's UI sanity tests is that they're
        exactly what they're called - sanity tests.
        We do trivial checks like "can we log in to the webadmin
        site", "can we go to 'virtual machines' sub-page".
        I'm not in favor of dropping these completely - I think they
        make sense, but I also think we can live with a trimmed-down
        version that saves a lot of time.
        As I said - AFAIK QE have their own Selenium grid, where they
        run more complex tests on the UI.


    Yes, OST basic_ui_sanity tests aren't "compatibility" tests. We're
    not checking pixels or look. They are super simple "does the app
    load" tests, are very valuable, and we're not dropping them.

    Greg

        Regards, Marcin



        Best wishes,
        Greg

        On Mon, Mar 4, 2019, 11:39 AM Marcin Sobczyk <[email protected]> wrote:

            Hi,

            _TL; DR_ Let's cut the running time of
            '008_basic_ui_sanity.py' by more than 3 minutes by
            sacrificing Firefox and Chrome screenshots in favor of
            Chromium.

            During the OST hackathon in Brno this year, I saw an
            opportunity to optimize basic UI sanity tests from basic
            suite.
            The way we currently run them is by setting up a
            Selenium grid using 3 docker containers, with a dedicated
            network... that's insanity! (pun intended).
            Let's take a look at the running time of the
            '008_basic_ui_sanity.py' scenario
            (https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/4197/):


            01:31:50 @ Run test: 008_basic_ui_sanity.py:
            01:31:50 nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
            01:31:50   # init:
            01:31:50   # init: Success (in 0:00:00)
            01:31:50   # start_grid:
            01:34:05   # start_grid: Success (in 0:02:15)
            01:34:05   # initialize_chrome:
            01:34:18   # initialize_chrome: Success (in 0:00:13)
            01:34:18   # login:
            01:34:27   # login: Success (in 0:00:08)
            01:34:27   # left_nav:
            01:34:45   # left_nav: Success (in 0:00:18)
            01:34:45   # close_driver:
            01:34:46   # close_driver: Success (in 0:00:00)
            01:34:46   # initialize_firefox:
            01:35:02   # initialize_firefox: Success (in 0:00:16)
            01:35:02   # login:
            01:35:11   # login: Success (in 0:00:08)
            01:35:11   # left_nav:
            01:35:29   # left_nav: Success (in 0:00:18)
            01:35:29   # cleanup:
            01:35:36   # cleanup: Success (in 0:00:06)
            01:35:36   # Results located at /dev/shm/ost/deployment-basic-suite-master/008_basic_ui_sanity.py.junit.xml
            01:35:36 @ Run test: 008_basic_ui_sanity.py: Success (in 0:03:45)

            Starting the Selenium grid takes 2:15 out of 3:45 of
            total running time!

            I've investigated a lot of approaches and came up with
            something like this:

              * install 'chromium-headless' package on engine VM
              * download 'chromedriver' and 'selenium hub' jar and
                deploy them in '/var/opt/' on engine's VM
              * run 'selenium.jar' on engine VM from
                '008_basic_ui_sanity.py' by using Lago's ssh
              * connect to the Selenium instance running on the
                engine in '008_basic_ui_sanity.py'
              * make screenshots

            This series of patches represents the changes:
            https://gerrit.ovirt.org/#/q/topic:selenium-on-engine+(status:open+OR+status:merged)
            This is the new running time
            (https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/4195/):

            20:13:26 @ Run test: 008_basic_ui_sanity.py:
            20:13:26 nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
            20:13:26   # init:
            20:13:26   # init: Success (in 0:00:00)
            20:13:26   # make_screenshots:
            20:13:27     * Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fdb6004f8d0>: Failed to establish a new connection: [Errno 111] Connection refused',)': /wd/hub
            20:13:27     * Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fdb6004fa10>: Failed to establish a new connection: [Errno 111] Connection refused',)': /wd/hub
            20:13:27     * Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fdb6004fb50>: Failed to establish a new connection: [Errno 111] Connection refused',)': /wd/hub
            20:13:28     * Redirecting http://192.168.201.4:4444/wd/hub -> http://192.168.201.4:4444/wd/hub/static/resource/hub.html
            20:14:02   # make_screenshots: Success (in 0:00:35)
            20:14:02   # Results located at /dev/shm/ost/deployment-basic-suite-master/008_basic_ui_sanity.py.junit.xml
            20:14:02 @ Run test: 008_basic_ui_sanity.py: Success (in 0:00:35)

            (The 'NewConnectionError' retries are the test waiting for the
            Selenium hub to be up and running; I can silence these later.)
            And the screenshots are here:
            https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/4195/artifact/exported-artifacts/screenshots/
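
One possible way to silence the "Retrying ..." lines shown in the log above, as a sketch only. It assumes they are emitted through Python's logging module by the logger named "urllib3.connectionpool"; if the environment uses a vendored copy of urllib3, the logger name would differ. The helper name quiet_logger is made up for illustration.

```python
import logging
from contextlib import contextmanager


@contextmanager
def quiet_logger(name="urllib3.connectionpool", level=logging.ERROR):
    """Temporarily raise a logger's level, restoring the old one on exit."""
    logger = logging.getLogger(name)
    previous = logger.level
    logger.setLevel(level)
    try:
        yield logger
    finally:
        # Restore even if the wrapped code raised.
        logger.setLevel(previous)
```

Wrapping only the "wait for the hub" step in `with quiet_logger():` would keep the warnings visible for the rest of the test, where a refused connection is a real problem.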

            _The pros:_

              * we cut the running time by more than 3 minutes

            _The cons:_

              * we don't get Firefox or Chrome screenshots - we get
                Chromium screenshots (although AFAIK, QE has much
                more Selenium tests which cover both Firefox and Chrome)
              * we pollute the engine VM with 'chromium-headless'
                package and deps (in total: 'chromium-headless',
                'chromium-common', 'flac-libs' and 'minizip'),
                although we can remove these after the tests

            _Some design choices explained:_

            Q: Why engine VM?

            A: Because the engine VM already has 'X11' libs. We could
            install 'chromium-headless' (and even other browsers) on
            our Jenkins executors, but that would mess them up a lot.

            Q: Why Chromium?

            A: Because it has a separate 'headless' package.

            Q: Why not use the 'chromedriver' RPM instead of the
            https://chromedriver.storage.googleapis.com Chromedriver
            builds?

            A: Because the RPM version pulls a lot of extra
            dependencies even on the engine VM ('gtk3', 'cairo'
            etc.). Builds from the URL are the official Google
            Chromedriver builds, they contain a single binary, and
            they work for us.

            _What still needs to be polished with the patches:_

              * Currently the 'setup_engine_selenium.sh' script downloads
                'selenium.jar' and 'chromedriver.zip' on every run (even
                with these downloads we get much faster set-up times)
                - we should bake these into the engine VM image template.
              * 'selenium_hub_running' function in
                'selenium_on_engine.py' is hackish - an ability to
                run an ssh command with a context manager (and
                auto-terminate it on exit) should be part of Lago.
                Can be refactored.
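
The context-manager idea in the last item could look roughly like the sketch below. It is not the 'selenium_hub_running' function from the patches and not a Lago API: a local subprocess stands in for the ssh command, and the name running is made up for illustration.

```python
import subprocess
import sys
from contextlib import contextmanager


@contextmanager
def running(cmd, stop_timeout=5.0):
    """Start `cmd` in the background and stop it when the block exits."""
    proc = subprocess.Popen(cmd)
    try:
        yield proc
    finally:
        if proc.poll() is None:
            # Ask politely first, then force-kill if it does not exit in time.
            proc.terminate()
            try:
                proc.wait(timeout=stop_timeout)
            except subprocess.TimeoutExpired:
                proc.kill()
                proc.wait()


if __name__ == "__main__":
    # Stand-in for a long-running command such as the Selenium jar over ssh.
    sleeper = [sys.executable, "-c", "import time; time.sleep(60)"]
    with running(sleeper) as proc:
        print("running:", proc.poll() is None)
    print("stopped:", proc.poll() is not None)
```

When the command is an ssh client, terminating the local process does not necessarily stop the remote one; that depends on how ssh is invoked, so a real implementation would need to handle it.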

            Questions, comments, reviews are welcome.

            Regards, Marcin



            _______________________________________________
            Devel mailing list -- [email protected]
            To unsubscribe send an email to [email protected]
            Privacy Statement: https://www.ovirt.org/site/privacy-policy/
            oVirt Code of Conduct:
            https://www.ovirt.org/community/about/community-guidelines/
            List Archives:
            https://lists.ovirt.org/archives/list/[email protected]/message/RLB2KSNJS4YKVMCDUUHOZJWBQDGJCXGZ/



--
    GREG SHEREMETA

    SENIOR SOFTWARE ENGINEER - TEAM LEAD - RHV UX

    Red Hat NA

    <https://www.redhat.com/>

    [email protected] IRC: gshereme

    <https://red.ht/sig>




_______________________________________________
Devel mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/[email protected]/message/VH52GVG6V5SO34L56D7KETFX2HGIEPYR/
