On Tuesday, 21 September 2021 11:29:06 CEST manoj.sharma99...@gmail.com wrote:
> Thanks Vojtech Juranek for quick response please find logs below
> We got vm non responsive and at the same time vdsm log shows "monitor became
> unresponsive (command timeout, age=61.88649625200196) (vm:5944)" 
 
This means that thread which collects statistics about VM didn't get a 
response in expected time. Does VM response? You can also investigate VM using 
virsh. Or maybe overloaded host which makes all operations very slow and thus 
time out?

> Please refer below logs for more details.
> 
> [root@host vdsm]# cat vdsm.log | grep WARN
> 2021-09-21 14:30:19,487+0530 WARN  (vdsm.Scheduler) [Executor] executor
> state: count=5 workers={<Worker name=periodic/492 running <Task discardable
> <Operation action=<vdsm.virt.sampling.VMBulkstatsMonitor object at
> 0x7f025051dc18> at 0x7f02502ff5f8> timeout=7.5, duration=7.50 at
> 0x7f023860d6d8> discarded task#=2334 at 0x7f02709edc88>, <Worker
> name=periodic/493 waiting task#=0 at 0x7f02381016a0>, <Worker
> name=periodic/489 waiting task#=2701 at 0x7f02386faef0>, <Worker
> name=periodic/491 waiting task#=1807 at 0x7f0270903908>, <Worker
> name=periodic/483 waiting task#=2144 at 0x7f0270b12940>} (executor:213)
> 2021-09-21 14:30:34,487+0530 WARN  (vdsm.Scheduler) [Executor] executor
> state: count=5 workers={<Worker name=periodic/493 waiting task#=3 at
> 0x7f02381016a0>, <Worker name=periodic/489 waiting task#=2704 at
> 0x7f02386faef0>, <Worker name=periodic/491 waiting task#=1809 at
> 0x7f0270903908>, <Worker name=periodic/483 running <Task discardable
> <Operation action=<vdsm.virt.sampling.VMBulkstatsMonitor object at
> 0x7f025051dc18> at 0x7f02502ff5f8> timeout=7.5, duration=7.50 at
> 0x7f02384a7780> discarded task#=2146 at 0x7f0270b12940>, <Worker
> name=periodic/494 waiting task#=0 at 0x7f02386d2f60>} (executor:213)
> 2021-09-21 14:30:49,844+0530 WARN  (vdsm.Scheduler) [Executor] executor
> state: count=5 workers={<Worker name=periodic/493 waiting task#=15 at
> 0x7f02381016a0>, <Worker name=periodic/489 waiting task#=2717 at
> 0x7f02386faef0>, <Worker name=periodic/491 running <Task discardable
> <Operation action=<vdsm.virt.sampling.VMBulkstatsMonitor object at
> 0x7f025051dc18> at 0x7f02502ff5f8> timeout=7.5, duration=7.50 at
> 0x7f02147e0ac8> discarded task#=1897 at 0x7f0270903908>, <Worker
> name=periodic/495 waiting task#=0 at 0x7f023847d550>, <Worker
> name=periodic/494 waiting task#=12 at 0x7f02386d2f60>} (executor:213)
> 2021-09-21 14:31:04,491+0530 WARN  (vdsm.Scheduler) [Executor] executor
> state: count=5 workers={<Worker name=periodic/496 waiting task#=0 at
> 0x7f0270c70c18>, <Worker name=periodic/493 waiting task#=17 at
> 0x7f02381016a0>, <Worker name=periodic/489 running <Task discardable
> <Operation action=<vdsm.virt.sampling.VMBulkstatsMonitor object at
> 0x7f025051dc18> at 0x7f02502ff5f8> timeout=7.5, duration=7.50 at
> 0x7f0238494240> discarded task#=2719 at 0x7f02386faef0>, <Worker
> name=periodic/495 waiting task#=2 at 0x7f023847d550>, <Worker
> name=periodic/494 waiting task#=15 at 0x7f02386d2f60>} (executor:213)
> 2021-09-21 14:31:19,491+0530 WARN  (vdsm.Scheduler) [Executor] executor
> state: count=5 workers={<Worker name=periodic/496 waiting task#=2 at
> 0x7f0270c70c18>, <Worker name=periodic/497 waiting task#=0 at
> 0x7f0238756828>, <Worker name=periodic/493 waiting task#=20 at
> 0x7f02381016a0>, <Worker name=periodic/495 waiting task#=6 at
> 0x7f023847d550>, <Worker name=periodic/494 running <Task discardable
> <Operation action=<vdsm.virt.sampling.VMBulkstatsMonitor object at
> 0x7f025051dc18> at 0x7f02502ff5f8> timeout=7.5, duration=7.50 at
> 0x7f02386182e8> discarded task#=17 at 0x7f02386d2f60>} (executor:213)
> 2021-09-21 14:31:34,492+0530 WARN  (vdsm.Scheduler) [Executor] executor
> state: count=5 workers={<Worker name=periodic/496 waiting task#=4 at
> 0x7f0270c70c18>, <Worker name=periodic/497 waiting task#=3 at
> 0x7f0238756828>, <Worker name=periodic/498 waiting task#=0 at
> 0x7f02384aca58>, <Worker name=periodic/493 running <Task discardable
> <Operation action=<vdsm.virt.sampling.VMBulkstatsMonitor object at
> 0x7f025051dc18> at 0x7f02502ff5f8> timeout=7.5, duration=7.50 at
> 0x7f02380652e8> discarded task#=22 at 0x7f02381016a0>, <Worker
> name=periodic/495 waiting task#=8 at 0x7f023847d550>} (executor:213)
> 2021-09-21 14:31:42,418+0530 WARN  (jsonrpc/5) [virt.vm]
> (vmId='b994fb6c-4a7d-469a-82fb-1d776e9f7ae9') monitor became unresponsive
> (command timeout, age=60.07553722700686) (vm:5944) 2021-09-21
> 14:31:49,492+0530 WARN  (vdsm.Scheduler) [Executor] executor state: count=5
> workers={<Worker name=periodic/496 waiting task#=93 at 0x7f0270c70c18>,
> <Worker name=periodic/497 waiting task#=14 at 0x7f0238756828>, <Worker
> name=periodic/498 waiting task#=12 at 0x7f02384aca58>, <Worker
> name=periodic/499 waiting task#=0 at 0x7f0270d3d710>, <Worker
> name=periodic/495 running <Task discardable <Operation
> action=<vdsm.virt.sampling.VMBulkstatsMonitor object at 0x7f025051dc18> at
> 0x7f02502ff5f8> timeout=7.5, duration=7.50 at 0x7f02384768d0> discarded
> task#=22 at 0x7f023847d550>} (executor:213) 2021-09-21 14:32:19,492+0530
> WARN  (vdsm.Scheduler) [Executor] executor state: count=5 workers={<Worker
> name=periodic/496 running <Task discardable <Operation
> action=<vdsm.virt.sampling.VMBulkstatsMonitor object at 0x7f025051dc18> at
> 0x7f02502ff5f8> timeout=7.5, duration=7.50 at 0x7f027098a400> discarded
> task#=97 at 0x7f0270c70c18>, <Worker name=periodic/497 waiting task#=20 at
> 0x7f0238756828>, <Worker name=periodic/498 waiting task#=17 at
> 0x7f02384aca58>, <Worker name=periodic/500 waiting task#=0 at
> 0x7f0270ff02b0>, <Worker name=periodic/499 waiting task#=5 at
> 0x7f0270d3d710>} (executor:213) 2021-09-21 14:33:19,494+0530 WARN 
> (vdsm.Scheduler) [Executor] executor state: count=5 workers={<Worker
> name=periodic/497 waiting task#=115 at 0x7f0238756828>, <Worker
> name=periodic/498 waiting task#=40 at 0x7f02384aca58>, <Worker
> name=periodic/500 waiting task#=22 at 0x7f0270ff02b0>, <Worker
> name=periodic/499 running <Task discardable <Operation
> action=<vdsm.virt.sampling.VMBulkstatsMonitor object at 0x7f025051dc18> at
> 0x7f02502ff5f8> timeout=7.5, duration=7.50 at 0x7f0270d25be0> discarded
> task#=23 at 0x7f0270d3d710>, <Worker name=periodic/501 waiting task#=0 at
> 0x7f02709ebdd8>} (executor:213) 2021-09-21 14:33:34,494+0530 WARN 
> (vdsm.Scheduler) [Executor] executor state: count=5 workers={<Worker
> name=periodic/497 waiting task#=117 at 0x7f0238756828>, <Worker
> name=periodic/498 running <Task discardable <Operation
> action=<vdsm.virt.sampling.VMBulkstatsMonitor object at 0x7f025051dc18> at
> 0x7f02502ff5f8> timeout=7.5, duration=7.50 at 0x7f0270c4b198> discarded
> task#=42 at 0x7f02384aca58>, <Worker name=periodic/500 waiting task#=25 at
> 0x7f0270ff02b0>, <Worker name=periodic/502 waiting task#=0 at
> 0x7f0270d17978>, <Worker name=periodic/501 waiting task#=2 at
> 0x7f02709ebdd8>} (executor:213) 2021-09-21 14:33:40,755+0530 WARN 
> (periodic/502) [virt.periodic.VmDispatcher] could not run <class
> 'vdsm.virt.periodic.UpdateVolumes'> on
> ['606a03d5-4a1a-40bc-94dc-d455bc2fe950'] (periodic:290) 2021-09-21
> 14:33:50,813+0530 WARN  (vdsm.Scheduler) [Executor] executor state: count=5
> workers={<Worker name=periodic/497 running <Task discardable <Operation
> action=<vdsm.virt.sampling.VMBulkstatsMonitor object at 0x7f025051dc18> at
> 0x7f02502ff5f8> timeout=7.5, duration=7.50 at 0x7f0270a87dd8> discarded
> task#=129 at 0x7f0238756828>, <Worker name=periodic/500 waiting task#=37 at
> 0x7f0270ff02b0>, <Worker name=periodic/503 waiting task#=0 at
> 0x7f02709f3ef0>, <Worker name=periodic/502 waiting task#=13 at
> 0x7f0270d17978>, <Worker name=periodic/501 waiting task#=90 at
> 0x7f02709ebdd8>} (executor:213) 2021-09-21 14:34:28,880+0530 WARN 
> (jsonrpc/2) [virt.vm] (vmId='968aae88-8bca-4800-be9f-38067b73f271') monitor
> became unresponsive (command timeout, age=61.88649625200196) (vm:5944)
> 2021-09-21 14:40:34,503+0530 WARN  (vdsm.Scheduler) [Executor] executor
> state: count=5 workers={<Worker name=periodic/504 waiting task#=0 at
> 0x7f0270faaa58>, <Worker name=periodic/500 waiting task#=245 at
> 0x7f0270ff02b0>, <Worker name=periodic/503 running <Task discardable
> <Operation action=<vdsm.virt.sampling.VMBulkstatsMonitor object at
> 0x7f025051dc18> at 0x7f02502ff5f8> timeout=7.5, duration=7.50 at
> 0x7f02386ada20> discarded task#=217 at 0x7f02709f3ef0>, <Worker
> name=periodic/502 waiting task#=359 at 0x7f0270d17978>, <Worker
> name=periodic/501 waiting task#=301 at 0x7f02709ebdd8>} (executor:213) 
> 
> Not able to understand the cause and required fix please help
> _______________________________________________
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/ List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/AJWT7KOCVJP6C
> M42MG7Y7W5ZPOSSZRDZ/

Attachment: signature.asc
Description: This is a digitally signed message part.

_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZJRRZZHHE33DOVVJW3JB4BYJ5TOU3A3A/

Reply via email to