On Thu, Jan 12, 2017 at 4:58 PM, Piotr Kliczewski < [email protected]> wrote:
> On Thu, Jan 12, 2017 at 8:52 AM, Roy Golan <[email protected]> wrote: > > > > > > On 11 January 2017 at 21:50, Piotr Kliczewski < > [email protected]> > > wrote: > >> > >> On Wed, Jan 11, 2017 at 9:47 AM, Daniel Belenky <[email protected]> > >> wrote: > >> > Hi all, > >> > > >> > The following job: test-repo_ovirt_experimental_master fails to pass > the > >> > basic_suite. > >> > The job was triggered by this merge: https://gerrit.ovirt.org/#/c/ > 69936/ > >> > to > >> > vdsm project. > >> > > >> > The error I suspect cause this issue: > >> > > >> > 2017-01-11 03:32:26,061-05 DEBUG > >> > [org.ovirt.vdsm.jsonrpc.client.internal.ResponseWorker] > (ResponseWorker) > >> > [] > >> > Message received: > >> > > >> > {"jsonrpc":"2.0","error":{"code":"192.168.201.2:990178830 > ","message":"Vds > >> > timeout occured"},"id":null} > >> > 2017-01-11 03:32:26,067-05 ERROR > >> > [org.ovirt.engine.core.dal.dbbroker.auditloghandling. > AuditLogDirector] > >> > (DefaultQuartzScheduler7) [57bc898] Correlation ID: null, Call Stack: > >> > null, > >> > Custom Event ID: -1, Message: VDSM command failed: Message timeout > which > >> > can > >> > be caused by communication issues > >> > 2017-01-11 03:32:26,069-05 ERROR > >> > [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] > >> > (DefaultQuartzScheduler7) [57bc898] ERROR, > GetStoragePoolInfoVDSCommand( > >> > GetStoragePoolInfoVDSCommandParameters:{runAsync='true', > >> > storagePoolId='f92af272-934f-4327-9db0-afe353e6f61c', > >> > ignoreFailoverLimit='true'}), exception: VDSGenericException: > >> > VDSNetworkException: Message timeout which can be caused by > >> > communication > >> > issues, log id: 2f12b94a > >> > 2017-01-11 03:32:26,069-05 ERROR > >> > [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] > >> > (DefaultQuartzScheduler7) [57bc898] Exception: > >> > org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: > >> > VDSGenericException: VDSNetworkException: Message timeout which can be > >> > caused by communication issues > >> > at > >> > > >> > org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase. > proceedProxyReturnValue(BrokerCommandBase.java:188) > >> > [vdsbroker.jar:] > >> > at > >> > > >> > org.ovirt.engine.core.vdsbroker.irsbroker. > GetStoragePoolInfoVDSCommand.executeIrsBrokerCommand( > GetStoragePoolInfoVDSCommand.java:32) > >> > [vdsbroker.jar:] > >> > at > >> > > >> > org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand.lambda$ > executeVDSCommand$0(IrsBrokerCommand.java:95) > >> > [vdsbroker.jar:] > >> > at > >> > > >> > org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy. > runInControlledConcurrency(IrsProxy.java:262) > >> > [vdsbroker.jar:] > >> > at > >> > > >> > org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand. > executeVDSCommand(IrsBrokerCommand.java:92) > >> > [vdsbroker.jar:] > >> > at > >> > > >> > org.ovirt.engine.core.vdsbroker.VDSCommandBase. > executeCommand(VDSCommandBase.java:73) > >> > [vdsbroker.jar:] > >> > at > >> > org.ovirt.engine.core.dal.VdcCommandBase.execute( > VdcCommandBase.java:33) > >> > [dal.jar:] > >> > at > >> > > >> > org.ovirt.engine.core.vdsbroker.vdsbroker.DefaultVdsCommandExecutor. > execute(DefaultVdsCommandExecutor.java:14) > >> > [vdsbroker.jar:] > >> > at > >> > > >> > org.ovirt.engine.core.vdsbroker.ResourceManager. > runVdsCommand(ResourceManager.java:408) > >> > [vdsbroker.jar:] > >> > at > >> > > >> > org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy. > proceedStoragePoolStats(IrsProxy.java:348) > >> > [vdsbroker.jar:] > >> > at > >> > > >> > org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy. > lambda$updatingTimerElapsed$0(IrsProxy.java:246) > >> > [vdsbroker.jar:] > >> > at > >> > > >> > org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy. > runInControlledConcurrency(IrsProxy.java:262) > >> > [vdsbroker.jar:] > >> > at > >> > > >> > org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy. > updatingTimerElapsed(IrsProxy.java:227) > >> > [vdsbroker.jar:] > >> > at sun.reflect.GeneratedMethodAccessor191.invoke(Unknown Source) > >> > [:1.8.0_111] > >> > at > >> > > >> > sun.reflect.DelegatingMethodAccessorImpl.invoke( > DelegatingMethodAccessorImpl.java:43) > >> > [rt.jar:1.8.0_111] > >> > at java.lang.reflect.Method.invoke(Method.java:498) > [rt.jar:1.8.0_111] > >> > at > >> > > >> > org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod( > JobWrapper.java:77) > >> > [scheduler.jar:] > >> > at > >> > org.ovirt.engine.core.utils.timer.JobWrapper.execute( > JobWrapper.java:51) > >> > [scheduler.jar:] > >> > at org.quartz.core.JobRunShell.run(JobRunShell.java:213) > [quartz.jar:] > >> > at > >> > java.util.concurrent.Executors$RunnableAdapter. > call(Executors.java:511) > >> > [rt.jar:1.8.0_111] > >> > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > >> > [rt.jar:1.8.0_111] > >> > at > >> > > >> > java.util.concurrent.ThreadPoolExecutor.runWorker( > ThreadPoolExecutor.java:1142) > >> > [rt.jar:1.8.0_111] > >> > at > >> > > >> > java.util.concurrent.ThreadPoolExecutor$Worker.run( > ThreadPoolExecutor.java:617) > >> > [rt.jar:1.8.0_111] > >> > at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_111] > >> > > >> > >> This exception occurs when a response do not arrive within specified > >> period of time. > >> > >> Here we can have potential reason for it: > >> > >> 2017-01-11 03:23:17,944 ERROR (jsonrpc/5) [storage.TaskManager.Task] > >> (Task='db7a84ba-d89c-4ac7-ab7a-b9f409ea7365') Unexpected error > >> (task:870) > >> Traceback (most recent call last): > >> File "/usr/share/vdsm/storage/task.py", line 877, in _run > >> return fn(*args, **kargs) > >> File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in > >> wrapper > >> res = f(*args, **kwargs) > >> File "/usr/share/vdsm/storage/hsm.py", line 3054, in getVolumeInfo > >> volUUID=volUUID).getInfo() > >> File "/usr/share/vdsm/storage/sd.py", line 748, in produceVolume > >> volUUID) > >> File "/usr/share/vdsm/storage/blockVolume.py", line 415, in __init__ > >> manifest = self.manifestClass(repoPath, sdUUID, imgUUID, volUUID) > >> File "/usr/share/vdsm/storage/blockVolume.py", line 69, in __init__ > >> volUUID) > >> File "/usr/share/vdsm/storage/volume.py", line 84, in __init__ > >> self.validate() > >> File "/usr/share/vdsm/storage/blockVolume.py", line 159, in validate > >> raise se.VolumeDoesNotExist(self.volUUID) > >> VolumeDoesNotExist: Volume does not exist: > >> (u'5a464296-ebea-4d1f-a299-cec45e82f9f3',) > > > > > > > > If that is the case then we should reply with an error on this api call > and > > not ignore and let the time out trigger. Do you think this what happen? > > Reply to what? Let me say how I see it. > > We send to vdsm a request which takes a bit of time to process. During > processing > there was an issue that we had to reset the connection which reset > tracking and buffers. > What issue? I'm concerned about those unexplained disconnections. MOM from VDSM and Engine from VDSM. Y. > Once response arrive we have no information about it. > > When the connection is reset we let the command code know that the > issue occurred > so it is up to the code to handle the issue - retry or fail. > > >> > >> > >> > > >> > Attached is a zip file with all artifacts from Jenkins. > >> > > >> > The error I've mentioned above is found in: > >> > > >> > > >> > exported-artifacts/test_logs/basic-suite-master/post-004_ > basic_sanity.py/lago-basic-suite-master-engine/_var_log_ > ovirt-engine/engine.log > >> > > >> > Can some one advise? > >> > > >> > Thanks, > >> > -- > >> > Daniel Belenky > >> > RHV DevOps > >> > Red Hat Israel > >> > > >> > _______________________________________________ > >> > Devel mailing list > >> > [email protected] > >> > http://lists.ovirt.org/mailman/listinfo/devel > >> _______________________________________________ > >> Devel mailing list > >> [email protected] > >> http://lists.ovirt.org/mailman/listinfo/devel > > > > > _______________________________________________ > Devel mailing list > [email protected] > http://lists.ovirt.org/mailman/listinfo/devel >
_______________________________________________ Devel mailing list [email protected] http://lists.ovirt.org/mailman/listinfo/devel
