Re: [ovirt-users] Host Issue
Hi,

VDSM error 99 means RecoveryInProgress. The recovery might take some time, depending on how many VMs there are, so I suggest you wait a bit longer for now and see what happens.

Best regards

--
Martin Sivak
oVirt / SLA

On Thu, Feb 2, 2017 at 4:18 PM, Bryan Sockel wrote:
> Hi,
>
> I came into the office this morning to an issue with my oVirt setup. On one
> of my hosts the / partition was completely full, causing the host to go into
> an unknown state. I was able to clear out some space for the time being and
> am attempting to recover that host. VMs are still running and responding
> on the host.
>
> I am using Gluster volumes in my configuration, and had to restart the gluster
> service on that host. I also restarted the ovirt-ha-agent service.
>
> I am seeing this entry in my agent.log every two seconds:
>
> MainThread::INFO::2017-02-02 09:11:19,606::util::214::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(connect_vdsm_json_rpc) Waiting for VDSM hardware info
>
> In my vdsm.log I am seeing this:
>
> jsonrpc.Executor/4::INFO::2017-02-02 09:13:42,088::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In recovery, ignoring 'Host.getAllVmStats' in bridge with {}
> jsonrpc.Executor/4::INFO::2017-02-02 09:13:42,088::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getAllVmStats failed (error 99) in 0.00 seconds
> jsonrpc.Executor/5::INFO::2017-02-02 09:13:42,114::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
> jsonrpc.Executor/5::INFO::2017-02-02 09:13:42,115::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getHardwareInfo failed (error 99) in 0.00 seconds
> jsonrpc.Executor/6::INFO::2017-02-02 09:13:44,121::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
> jsonrpc.Executor/6::INFO::2017-02-02 09:13:44,122::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getHardwareInfo failed (error 99) in 0.00 seconds
> jsonrpc.Executor/7::INFO::2017-02-02 09:13:46,127::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
> jsonrpc.Executor/7::INFO::2017-02-02 09:13:46,127::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getHardwareInfo failed (error 99) in 0.00 seconds
> clientIFinit::DEBUG::2017-02-02 09:13:46,257::task::597::Storage.TaskManager.Task::(_updateState) Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::moving from state init -> state preparing
> clientIFinit::INFO::2017-02-02 09:13:46,258::logUtils::49::dispatcher::(wrapper) Run and protect: getConnectedStoragePoolsList(options=None)
> clientIFinit::INFO::2017-02-02 09:13:46,258::logUtils::52::dispatcher::(wrapper) Run and protect: getConnectedStoragePoolsList, Return response: {'poollist': []}
> clientIFinit::DEBUG::2017-02-02 09:13:46,258::task::1193::Storage.TaskManager.Task::(prepare) Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::finished: {'poollist': []}
> clientIFinit::DEBUG::2017-02-02 09:13:46,258::task::597::Storage.TaskManager.Task::(_updateState) Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::moving from state preparing -> state finished
> clientIFinit::DEBUG::2017-02-02 09:13:46,258::resourceManager::952::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
> clientIFinit::DEBUG::2017-02-02 09:13:46,259::resourceManager::989::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
> clientIFinit::DEBUG::2017-02-02 09:13:46,259::task::995::Storage.TaskManager.Task::(_decref) Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::ref 0 aborting False
> clientIFinit::INFO::2017-02-02 09:13:46,259::clientIF::558::vds::(_waitForStoragePool) recovery: waiting for storage pool to go up
> jsonrpc.Executor/0::INFO::2017-02-02 09:13:48,133::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
> jsonrpc.Executor/0::INFO::2017-02-02 09:13:48,134::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getHardwareInfo failed (error 99) in 0.00 seconds
> jsonrpc.Executor/1::INFO::2017-02-02 09:13:50,140::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
> jsonrpc.Executor/1::INFO::2017-02-02 09:13:50,140::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getHardwareInfo failed (error 99) in 0.00 seconds
> clientIFinit::DEBUG::2017-02-02 09:13:51,265::task::597::Storage.TaskManager.Task::(_updateState) Task=`e8c558b1-f5d0-49ea-ac92-51660d03636e`::moving from state init -> state preparing
> clientIFinit::INFO::2017-02-02 09:13:51,265::logUtils::49::dispatcher::(wrapper) Run and protect: getConnectedStoragePoolsList(options=None)
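
To judge whether the host is still in recovery or has moved on, it can help to summarize the "failed (error 99)" entries in vdsm.log instead of reading them one by one. The following is only a minimal sketch. It assumes each log entry sits on a single line in the layout quoted above (thread::LEVEL::timestamp::module::line::logger::(function) message). The name summarize_recovery_failures and the SAMPLE list are hypothetical helpers for illustration, not part of VDSM.

```python
import re
from collections import Counter

# Matches entries such as (one line in the real log file):
#   jsonrpc.Executor/4::INFO::2017-02-02 09:13:42,088::__init__::513::
#   jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getAllVmStats
#   failed (error 99) in 0.00 seconds
FAILED_CALL = re.compile(
    r"::(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+::"
    r".*RPC call (?P<method>\S+) failed \(error (?P<code>\d+)\)"
)


def summarize_recovery_failures(lines, code=99):
    """Count RPC calls that failed with `code`; return (counts, first_ts, last_ts).

    Hypothetical helper: `lines` is any iterable of log lines, for
    example an open vdsm.log file object.
    """
    counts = Counter()
    first = last = None
    for line in lines:
        match = FAILED_CALL.search(line)
        if match is None or int(match.group("code")) != code:
            continue
        counts[match.group("method")] += 1
        timestamp = match.group("ts")
        if first is None:
            first = timestamp
        last = timestamp
    return counts, first, last


# Three entries copied from the quoted vdsm.log, rejoined onto single lines.
SAMPLE = [
    "jsonrpc.Executor/4::INFO::2017-02-02 09:13:42,088::__init__::513::"
    "jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getAllVmStats "
    "failed (error 99) in 0.00 seconds",
    "jsonrpc.Executor/5::INFO::2017-02-02 09:13:42,114::__init__::525::"
    "jsonrpc.JsonRpcServer::(_handle_request) In recovery, ignoring "
    "'Host.getHardwareInfo' in bridge with {}",
    "jsonrpc.Executor/5::INFO::2017-02-02 09:13:42,115::__init__::513::"
    "jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getHardwareInfo "
    "failed (error 99) in 0.00 seconds",
]

if __name__ == "__main__":
    counts, first, last = summarize_recovery_failures(SAMPLE)
    for method, n in counts.most_common():
        print(f"{method}: {n} call(s) rejected with error 99")
    print(f"first seen {first}, last seen {last}")
```

If the last-seen timestamp stops advancing while the agent keeps polling, recovery has probably finished. If it keeps advancing for a long time alongside "waiting for storage pool to go up", the storage side is the more likely place to look.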
[ovirt-users] Host Issue
Hi,

I came into the office this morning to an issue with my oVirt setup. On one of my hosts the / partition was completely full, causing the host to go into an unknown state. I was able to clear out some space for the time being and am attempting to recover that host. VMs are still running and responding on the host.

I am using Gluster volumes in my configuration, and had to restart the gluster service on that host. I also restarted the ovirt-ha-agent service.

I am seeing this entry in my agent.log every two seconds:

MainThread::INFO::2017-02-02 09:11:19,606::util::214::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(connect_vdsm_json_rpc) Waiting for VDSM hardware info

In my vdsm.log I am seeing this:

jsonrpc.Executor/4::INFO::2017-02-02 09:13:42,088::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In recovery, ignoring 'Host.getAllVmStats' in bridge with {}
jsonrpc.Executor/4::INFO::2017-02-02 09:13:42,088::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getAllVmStats failed (error 99) in 0.00 seconds
jsonrpc.Executor/5::INFO::2017-02-02 09:13:42,114::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/5::INFO::2017-02-02 09:13:42,115::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getHardwareInfo failed (error 99) in 0.00 seconds
jsonrpc.Executor/6::INFO::2017-02-02 09:13:44,121::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/6::INFO::2017-02-02 09:13:44,122::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getHardwareInfo failed (error 99) in 0.00 seconds
jsonrpc.Executor/7::INFO::2017-02-02 09:13:46,127::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/7::INFO::2017-02-02 09:13:46,127::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getHardwareInfo failed (error 99) in 0.00 seconds
clientIFinit::DEBUG::2017-02-02 09:13:46,257::task::597::Storage.TaskManager.Task::(_updateState) Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::moving from state init -> state preparing
clientIFinit::INFO::2017-02-02 09:13:46,258::logUtils::49::dispatcher::(wrapper) Run and protect: getConnectedStoragePoolsList(options=None)
clientIFinit::INFO::2017-02-02 09:13:46,258::logUtils::52::dispatcher::(wrapper) Run and protect: getConnectedStoragePoolsList, Return response: {'poollist': []}
clientIFinit::DEBUG::2017-02-02 09:13:46,258::task::1193::Storage.TaskManager.Task::(prepare) Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::finished: {'poollist': []}
clientIFinit::DEBUG::2017-02-02 09:13:46,258::task::597::Storage.TaskManager.Task::(_updateState) Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::moving from state preparing -> state finished
clientIFinit::DEBUG::2017-02-02 09:13:46,258::resourceManager::952::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
clientIFinit::DEBUG::2017-02-02 09:13:46,259::resourceManager::989::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
clientIFinit::DEBUG::2017-02-02 09:13:46,259::task::995::Storage.TaskManager.Task::(_decref) Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::ref 0 aborting False
clientIFinit::INFO::2017-02-02 09:13:46,259::clientIF::558::vds::(_waitForStoragePool) recovery: waiting for storage pool to go up
jsonrpc.Executor/0::INFO::2017-02-02 09:13:48,133::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/0::INFO::2017-02-02 09:13:48,134::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getHardwareInfo failed (error 99) in 0.00 seconds
jsonrpc.Executor/1::INFO::2017-02-02 09:13:50,140::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/1::INFO::2017-02-02 09:13:50,140::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call Host.getHardwareInfo failed (error 99) in 0.00 seconds
clientIFinit::DEBUG::2017-02-02 09:13:51,265::task::597::Storage.TaskManager.Task::(_updateState) Task=`e8c558b1-f5d0-49ea-ac92-51660d03636e`::moving from state init -> state preparing
clientIFinit::INFO::2017-02-02 09:13:51,265::logUtils::49::dispatcher::(wrapper) Run and protect: getConnectedStoragePoolsList(options=None)
clientIFinit::INFO::2017-02-02 09:13:51,265::logUtils::52::dispatcher::(wrapper) Run and protect: getConnectedStoragePoolsList, Return response: {'poollist': []}
clientIFinit::DEBUG::2017-02-02 09:13:51,265::task::1193::Storage.TaskManager.Task::(prepare) Task=`e8c558b1-f5d0-49ea-ac92-51660d03636e`::finished: {'poollist': []}
clientIFinit::DEBUG::2017-02-02 09:13:51,266::task::597::Storage.TaskManager.Task::(_updateState)