Piotr, How do you identify the connection between the 2 message, why do you think that the timeout is related to the first message you've mentioned?
Regards, Yevgeny On Mon, May 15, 2017 at 12:40 PM, Piotr Kliczewski < [email protected]> wrote: > I agree with Nir that the above issue is not related to the job failure. > > I see that there is a timeout for: > > 2017-05-14 04:10:04,215-04 DEBUG > [org.ovirt.vdsm.jsonrpc.client.reactors.stomp.StompCommonClient] > (org.ovirt.thread.pool-7-thread-12) [] Message sent: SEND > destination:jms.topic.vdsm_requests > content-length:377 > reply-to:jms.topic.vdsm_responses > > <JsonRpcRequest id: "846f4741-5c7d-4396-8ed7-ce995d426728", method: > Host.setupNetworks, params: {networks={Migration_Net={vlan=200, > ipv6autoconf=false, nic=eth0, bridged=false, dhcpv6=false, mtu=1500, > ipv6addr=2001:0db8:85a3:0000:0000:574c:14ea:0a02/64, switch=legacy}}, > bondings={}, options={connectivityTimeout=120, > connectivityCheck=true}}> > > > and the timeout occurred at: > > 2017-05-14 04:10:07,535-04 DEBUG > [org.ovirt.vdsm.jsonrpc.client.internal.ResponseWorker] > (ResponseWorker) [] Message received: > {"jsonrpc":"2.0","error":{"code":"lago-basic-suite- > master-host1:286640862","message":"Vds > timeout occured"},"id":null} > > I see in supervdsm log: > > MainProcess|jsonrpc/1::INFO::2017-05-14 > 04:10:05,566::netconfpersistence::52::root::(setNetwork) Adding > network Migration_Net({u'ipv6autoconf': False, 'nameservers': [], > u'nic': u'eth0', u'vlan': 200, u'mtu': 1500, u'switch': u'legacy', > u'dhcpv6': False, u'bridged': False, u'ipv6addr': > u'2001:0db8:85a3:0000:0000:574c:14ea:0a02/64', 'defaultRoute': False, > 'bootproto': 'none'}) > ... > MainProcess|jsonrpc/1::WARNING::2017-05-14 > 04:10:05,580::ifcfg::247::root::(_addSourceRoute) Invalid input for > source routing: name=eth0.200, addr=None, netmask=None, gateway=None > netlink/events::DEBUG::2017-05-14 > 04:10:05,581::concurrent::184::root::(run) START thread > <Thread(netlink/events, started daemon 139829765465856)> (func=<bound > method Monitor._scan of <vdsm.network.netlink.monitor.Monitor object > at 0x7f2c8c01ebd0>>, args=(), kwargs={}) > MainProcess|jsonrpc/1::WARNING::2017-05-14 > 04:10:05,581::ifcfg::894::root::(_ignore_missing_device) eth0.200 > didn't exist (yet) and so IPv6 wasn't enabled. > > My question is why only 3 seconds to timeout setupNetworks. > > > > I found one more exception not related to the above issue which we should > fix: > > 2017-05-14 04:08:08,924-04 DEBUG > [org.ovirt.engine.core.utils.timer.SchedulerUtilQuartzImpl] > (DefaultQuartzScheduler10) [58fd90c1] Exception: > java.lang.reflect.InvocationTargetException > at sun.reflect.GeneratedMethodAccessor178.invoke(Unknown Source) > [:1.8.0_131] > at sun.reflect.DelegatingMethodAccessorImpl.invoke( > DelegatingMethodAccessorImpl.java:43) > [rt.jar:1.8.0_131] > at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_131] > at org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod( > JobWrapper.java:83) > [scheduler.jar:] > at org.ovirt.engine.core.utils.timer.JobWrapper.execute( > JobWrapper.java:55) > [scheduler.jar:] > at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:] > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [rt.jar:1.8.0_131] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > [rt.jar:1.8.0_131] > at java.util.concurrent.ThreadPoolExecutor.runWorker( > ThreadPoolExecutor.java:1142) > [rt.jar:1.8.0_131] > at java.util.concurrent.ThreadPoolExecutor$Worker.run( > ThreadPoolExecutor.java:617) > [rt.jar:1.8.0_131] > at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_131] > Caused by: java.lang.NullPointerException > at org.ovirt.engine.core.vdsbroker.monitoring.VmStatsRefresher.lambda$ > processDevices$0(VmStatsRefresher.java:44) > [vdsbroker.jar:] > at java.util.stream.ReferencePipeline$2$1.accept( > ReferencePipeline.java:174) > [rt.jar:1.8.0_131] > at java.util.stream.ReferencePipeline$3$1.accept( > ReferencePipeline.java:193) > [rt.jar:1.8.0_131] > at java.util.stream.ReferencePipeline$2$1.accept( > ReferencePipeline.java:175) > [rt.jar:1.8.0_131] > at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList. > java:1374) > [rt.jar:1.8.0_131] > at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) > [rt.jar:1.8.0_131] > at java.util.stream.AbstractPipeline.wrapAndCopyInto( > AbstractPipeline.java:471) > [rt.jar:1.8.0_131] > at java.util.stream.ForEachOps$ForEachOp.evaluateSequential( > ForEachOps.java:151) > [rt.jar:1.8.0_131] > at java.util.stream.ForEachOps$ForEachOp$OfRef. > evaluateSequential(ForEachOps.java:174) > [rt.jar:1.8.0_131] > at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) > [rt.jar:1.8.0_131] > at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418) > [rt.jar:1.8.0_131] > at org.ovirt.engine.core.vdsbroker.monitoring.VmStatsRefresher. > processDevices(VmStatsRefresher.java:46) > [vdsbroker.jar:] > at org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher.poll( > PollVmStatsRefresher.java:44) > [vdsbroker.jar:] > ... 11 more > > Thanks, > Piotr > > > On Sun, May 14, 2017 at 6:12 PM, Nir Soffer <[email protected]> wrote: > > On Sun, May 14, 2017 at 5:14 PM Shlomo Ben David <[email protected]> > > wrote: > >> > >> Link to suspected patches: N/A > >> Link to Job: > >> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_master/6665/ > >> Link to all logs: > >> http://jenkins.ovirt.org/job/test-repo_ovirt_experimental_ > master/6665/artifact/exported-artifacts/basic-suit-master- > el7/test_logs/basic-suite-master/post-006_migrations.py/ > >> > >> Error snippet from the log: > > > > > >> Traceback (most recent call last): > >> > >> File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line > 878, > >> in _run > >> return fn(*args, **kargs) > >> File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 52, in > >> wrapper > >> res = f(*args, **kwargs) > >> File "/usr/share/vdsm/storage/hsm.py", line 2132, in > getAllTasksStatuses > >> allTasksStatus = self._pool.getAllTasksStatuses() > >> File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", > line > >> 77, in wrapper > >> raise SecureError("Secured object is not in safe state") > >> SecureError: Secured object is not in safe state > > > > > > This means that getAllTasksStatuses (SPM verb) was called on a host > > which is not the SPM. This is an error in the caller, not in vdsm. > > > > We already have a bug abut this: > > https://bugzilla.redhat.com/1426429 > > > > However this build logs do not contain enough info. These > > patches adds more info that can help to identify the caller: > > https://gerrit.ovirt.org/76771 > > > > This error is not related to the real error in this build. > > > > Nir > > > > _______________________________________________ > > Devel mailing list > > [email protected] > > http://lists.ovirt.org/mailman/listinfo/devel > _______________________________________________ > Devel mailing list > [email protected] > http://lists.ovirt.org/mailman/listinfo/devel >
_______________________________________________ Devel mailing list [email protected] http://lists.ovirt.org/mailman/listinfo/devel
