On Mon, Jul 16, 2018 at 2:19 PM, Steven Rosenberg <[email protected]> wrote:
> Hi,
>
> I reran the test on: https://gerrit.ovirt.org/#/c/92734/
>
> 3 tests passed and two failed on:
>
> 004_basic_sanity.vm_run
>
> The error is:
>
> detail: Cannot run VM. There is no host that satisfies current scheduling
> constraints. See below for details:, The host
> lago-he-basic-ansible-suite-master-host-0 did not satisfy internal filter
> Memory because its available memory is too low (656 MB) to run the VM.
>
>
> It looks like there is not enough memory on the test environment and that
> may be causing intermittent errors on the other tests as well.
>
> Is it possible to increase the memory for the CI environment so that we
> can ensure the errors are not due to lack of resources?
>

We already increased the RAM in https://gerrit.ovirt.org/#/c/92902/.
Can you please rebase your patch?

>
> With Best Regards.
>
> Steven Rosenberg.
>
>
> On Mon, Jul 16, 2018 at 12:03 PM Andrej Krejcir <[email protected]>
> wrote:
>
>> Hi,
>>
>> The jenkins job is no longer available, I will run the test manually and
>> see why it fails.
>>
>> Maybe this patch will fix the problem with the test:
>> https://gerrit.ovirt.org/#/c/92734/
>> It ensures that all services are stopped before starting them again.
>>
>> The purpose of the test is to try restarting the HE VM.
>> When the HE VM is run for the first time, the engine generates an OVF
>> file describing the HE VM. Then when the VM is started again, the OVF is
>> used. We wanted to test that the generated OVF is correct and the VM can be
>> started. The test intended to cleanly shutdown and restart the vdsm.
>>
>>
>> Andrej
>>
>>
>> On Fri, 6 Jul 2018 at 11:38, Sandro Bonazzola <[email protected]>
>> wrote:
>>
>>> Failing job is: https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-iscsi-suite-master/308
>>> My findings:
>>>
>>> 2018-07-06 03:28:05,461::008_restart_he_vm.py::_shutdown_he_vm::123::root::INFO:: * VM is down.
>>> 2018-07-06 03:28:05,461::008_restart_he_vm.py::_restart_services::127::root::INFO:: * Restarting services...
>>>
>>> 2018-07-06 03:28:05,729::log_utils.py::__exit__::611::lago.ssh::DEBUG::end task:f7db7960-b541-4d18-9a73-45b3d0677f03:Get ssh client for lago-he-basic-iscsi-suite-master-host-0:
>>> 2018-07-06 03:28:06,031::ssh.py::ssh::58::lago.ssh::DEBUG::Running 98d0de9e on lago-he-basic-iscsi-suite-master-host-0: systemctl restart vdsmd ovirt-ha-broker ovirt-ha-agent
>>> 2018-07-06 03:28:22,887::ssh.py::ssh::81::lago.ssh::DEBUG::Command 98d0de9e on lago-he-basic-iscsi-suite-master-host-0 returned with 0
>>>
>>>
>>> it then waits for engine to be up again till it gives up 10 minutes
>>> later:
>>>
>>> 2018-07-06 03:38:20,979::log_utils.py::__enter__::600::lago.ssh::DEBUG::start task:f44bb656-dfe3-4f57-a6c8-e7e208863054:Get ssh client for lago-he-basic-iscsi-suite-master-host-0:
>>> 2018-07-06 03:38:21,290::log_utils.py::__exit__::611::lago.ssh::DEBUG::end task:f44bb656-dfe3-4f57-a6c8-e7e208863054:Get ssh client for lago-he-basic-iscsi-suite-master-host-0:
>>> 2018-07-06 03:38:21,591::ssh.py::ssh::58::lago.ssh::DEBUG::Running 07b7f08a on lago-he-basic-iscsi-suite-master-host-0: hosted-engine --vm-status
>>> 2018-07-06 03:38:21,637::ssh.py::ssh::81::lago.ssh::DEBUG::Command 07b7f08a on lago-he-basic-iscsi-suite-master-host-0 returned with 1
>>> 2018-07-06 03:38:21,637::ssh.py::ssh::89::lago.ssh::DEBUG::Command 07b7f08a on lago-he-basic-iscsi-suite-master-host-0 output:
>>> The hosted engine configuration has not been retrieved from shared
>>> storage. Please ensure that ovirt-ha-agent is running and the storage
>>> server is reachable.
>>>
>>>
>>> On host 0, vdsm.log shows it's restarting:
>>>
>>> 2018-07-05 23:28:19,646-0400 INFO (MainThread) [vds] Exiting (vdsmd:171)
>>> 2018-07-05 23:28:23,514-0400 INFO (MainThread) [vds] (PID: 19942) I am the actual vdsm 4.30.0-465.git1ad18aa.el7 lago-he-basic-iscsi-suite-master-host-0 (3.10.0-862.2.3.el7.x86_64) (vdsmd:149)
>>>
>>> vdsm then stays for the 10 minutes above mentioned waiting for the
>>> storage pool to go up:
>>>
>>> 2018-07-05 23:38:20,739-0400 INFO (vmrecovery) [vds] recovery: waiting for storage pool to go up (clientIF:704)
>>>
>>> while the ha agent try to get the hosted engine stats
>>>
>>> 2018-07-05 23:38:30,363-0400 WARN (vdsm.Scheduler) [Executor] Worker blocked: <Worker name=jsonrpc/7 running <Task <JsonRpcTask {'params': {}, 'jsonrpc': '2.0', 'method': u'Host.getStats', 'id': u'cddde340-37a8-4f72-a471-b5bc40c06a16'} at 0x7f262814ae90> timeout=60, duration=600.00 at 0x7f262815c050> task#=0 at 0x7f262834b810>, traceback:
>>> File: "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap
>>>   self.__bootstrap_inner()
>>> File: "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
>>>   self.run()
>>> File: "/usr/lib64/python2.7/threading.py", line 765, in run
>>>   self.__target(*self.__args, **self.__kwargs)
>>> File: "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 195, in run
>>>   ret = func(*args, **kwargs)
>>> File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in _run
>>>   self._execute_task()
>>> File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task
>>>   task()
>>> File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__
>>>   self._callable()
>>> File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 261, in __call__
>>>   self._handler(self._ctx, self._req)
>>> File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 304, in _serveRequest
>>>   response = self._handle_request(req, ctx)
>>> File: "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 344, in _handle_request
>>>   res = method(**params)
>>> File: "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 202, in _dynamicMethod
>>>   result = fn(*methodArgs)
>>> File: "<string>", line 2, in getStats
>>> File: "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 49, in method
>>>   ret = func(*args, **kwargs)
>>> File: "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1409, in getStats
>>>   multipath=True)}
>>> File: "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 78, in get_stats
>>>   ret['haStats'] = _getHaInfo()
>>> File: "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 176, in _getHaInfo
>>>   stats = instance.get_all_stats()
>>> File: "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 94, in get_all_stats
>>>   stats = broker.get_stats_from_storage()
>>> File: "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 135, in get_stats_from_storage
>>>   result = self._proxy.get_stats()
>>> File: "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
>>>   return self.__send(self.__name, args)
>>> File: "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
>>>   verbose=self.__verbose
>>> File: "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
>>>   return self.single_request(host, handler, request_body, verbose)
>>> File: "/usr/lib64/python2.7/xmlrpclib.py", line 1303, in single_request
>>>   response = h.getresponse(buffering=True)
>>> File: "/usr/lib64/python2.7/httplib.py", line 1113, in getresponse
>>>   response.begin()
>>> File: "/usr/lib64/python2.7/httplib.py", line 444, in begin
>>>   version, status, reason = self._read_status()
>>> File: "/usr/lib64/python2.7/httplib.py", line 400, in _read_status
>>>   line = self.fp.readline(_MAXLINE + 1)
>>> File: "/usr/lib64/python2.7/socket.py", line 476, in readline
>>>   data = self._sock.recv(self._rbufsize) (executor:363)
>>>
>>>
>>> Andrej, I see the test has been added by you in commit
>>> e8d32f7375f2033b73544f47c1e1ca67abe8d35a
>>> I'm not sure about the purpose of this test but I don't understand why
>>> we are restarting the services on the host.
>>>
>>> Nir, Tal, any idea on why the storage pool is not getting up?
>>>
>>> I see vdsm is in recovery mode, I'm not sure if this was what the test was
>>> supposed to do or if the intention was to cleanly shutdown vdsm and cleanly
>>> restart it.
>>>
>>>
>>>
>>>
>>> --
>>>
>>> SANDRO BONAZZOLA
>>>
>>> MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
>>>
>>> Red Hat EMEA <https://www.redhat.com/>
>>>
>>> [email protected]
>>> <https://red.ht/sig>
>>>
>>
> _______________________________________________
> Devel mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
> List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/QONCFDY7AEZH2SV2ZCWFGGYIBZIBFJWP/
>
>

--
*GAL bEN HAIM*
RHV DEVOPS
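
For reference, a minimal sketch of the "stop all services before starting them again, then wait for the engine" idea discussed in the thread. This is not the content of the gerrit patch. The service names are taken from the systemctl line in the log above; the stop order, the helper names restart_cleanly and wait_for, and the injectable run/check callables are assumptions made only for illustration.

```python
import time
from typing import Callable, Sequence

# Service names as they appear in the test log; the ordering is an assumption.
SERVICES = ("vdsmd", "ovirt-ha-broker", "ovirt-ha-agent")


def restart_cleanly(run: Callable[[Sequence[str]], int],
                    services: Sequence[str] = SERVICES) -> None:
    """Stop every service (in reverse order) before starting any of them."""
    for name in reversed(services):
        run(["systemctl", "stop", name])
    for name in services:
        run(["systemctl", "start", name])


def wait_for(check: Callable[[], bool], timeout: float, interval: float,
             clock: Callable[[], float] = time.monotonic,
             sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll check() until it returns True or timeout seconds have elapsed."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

On a real host, run could be subprocess.call, and check could wrap a command such as hosted-engine --vm-status and report whether it exited with 0. Both callables are injected so the logic can be exercised without touching systemd.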
