Re: [ovirt-users] Host Issue

2017-02-02 Thread Martin Sivak
Hi,

VDSM error 99 means RecoveryInProgress, and the recovery might take some
time depending on how many VMs there are.

So I suggest you wait a bit more for now and see what happens.
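
If you would rather watch the recovery than just wait, here is a minimal Python sketch that counts how many RPC calls were rejected with error 99, grouped by method. It is only an illustration: the function name summarize_recovery is made up, and the regular expression is inferred solely from the vdsm.log lines quoted in this thread, so it may need adjusting for other VDSM versions. The sample text is copied from that excerpt.

```python
import re

# Pattern inferred from the log excerpt in this thread (an assumption,
# not an official description of the VDSM log format).
FAILED_RE = re.compile(r"RPC call\s+(\S+)\s+failed\s+\(error\s+(\d+)\)")
# Leading "> " markers added by mail clients when quoting.
QUOTE_RE = re.compile(r"^\s*>+\s?")
RECOVERY_ERROR = 99  # RecoveryInProgress, as explained above


def summarize_recovery(log_text):
    """Count RPC calls rejected with error 99, grouped by method name."""
    # Drop quote markers, then collapse whitespace so that log lines
    # wrapped by a mail client are matched as a single line.
    unquoted = (QUOTE_RE.sub("", line) for line in log_text.splitlines())
    flat = " ".join(" ".join(unquoted).split())
    counts = {}
    for method, code in FAILED_RE.findall(flat):
        if int(code) == RECOVERY_ERROR:
            counts[method] = counts.get(method, 0) + 1
    return counts


sample = """
jsonrpc.Executor/4::INFO::2017-02-02
09:13:42,088::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call
Host.getAllVmStats failed (error 99) in 0.00 seconds
jsonrpc.Executor/5::INFO::2017-02-02
09:13:42,115::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call
Host.getHardwareInfo failed (error 99) in 0.00 seconds
jsonrpc.Executor/6::INFO::2017-02-02
09:13:44,122::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call
Host.getHardwareInfo failed (error 99) in 0.00 seconds
"""

print(summarize_recovery(sample))
```

To use it on a real host, read the contents of your vdsm.log into a string and pass that in place of sample. Once new error 99 entries stop appearing in the tail of the log, the recovery has most likely finished.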

Best regards

--
Martin Sivak
oVirt / SLA

On Thu, Feb 2, 2017 at 4:18 PM, Bryan Sockel wrote:
> Hi,
>
> I came into the office this morning to an issue with my oVirt setup.  On
> one of my hosts the / partition was completely full, causing the host to go
> into an unknown state.  I was able to clear out some space for the time
> being and am attempting to recover that host.  VMs are still running and
> responding on the host.
>
> I am using Gluster volumes in my configuration, and had to restart the
> gluster service on that host.  I also restarted the ovirt-ha-agent service.
>
> I am seeing this entry in my agent.log every two seconds:
>
> MainThread::INFO::2017-02-02
> 09:11:19,606::util::214::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(connect_vdsm_json_rpc)
> Waiting for VDSM hardware info
>
> In my vdsm.log I am seeing this:
> jsonrpc.Executor/4::INFO::2017-02-02
> 09:13:42,088::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In
> recovery, ignoring 'Host.getAllVmStats' in bridge with {}
> jsonrpc.Executor/4::INFO::2017-02-02
> 09:13:42,088::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call
> Host.getAllVmStats failed (error 99) in 0.00 seconds
> jsonrpc.Executor/5::INFO::2017-02-02
> 09:13:42,114::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In
> recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
> jsonrpc.Executor/5::INFO::2017-02-02
> 09:13:42,115::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call
> Host.getHardwareInfo failed (error 99) in 0.00 seconds
> jsonrpc.Executor/6::INFO::2017-02-02
> 09:13:44,121::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In
> recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
> jsonrpc.Executor/6::INFO::2017-02-02
> 09:13:44,122::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call
> Host.getHardwareInfo failed (error 99) in 0.00 seconds
> jsonrpc.Executor/7::INFO::2017-02-02
> 09:13:46,127::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In
> recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
> jsonrpc.Executor/7::INFO::2017-02-02
> 09:13:46,127::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call
> Host.getHardwareInfo failed (error 99) in 0.00 seconds
> clientIFinit::DEBUG::2017-02-02
> 09:13:46,257::task::597::Storage.TaskManager.Task::(_updateState)
> Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::moving from state init -> state
> preparing
> clientIFinit::INFO::2017-02-02
> 09:13:46,258::logUtils::49::dispatcher::(wrapper) Run and protect:
> getConnectedStoragePoolsList(options=None)
> clientIFinit::INFO::2017-02-02
> 09:13:46,258::logUtils::52::dispatcher::(wrapper) Run and protect:
> getConnectedStoragePoolsList, Return response: {'poollist': []}
> clientIFinit::DEBUG::2017-02-02
> 09:13:46,258::task::1193::Storage.TaskManager.Task::(prepare)
> Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::finished: {'poollist': []}
> clientIFinit::DEBUG::2017-02-02
> 09:13:46,258::task::597::Storage.TaskManager.Task::(_updateState)
> Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::moving from state preparing ->
> state finished
> clientIFinit::DEBUG::2017-02-02
> 09:13:46,258::resourceManager::952::Storage.ResourceManager.Owner::(releaseAll)
> Owner.releaseAll requests {} resources {}
> clientIFinit::DEBUG::2017-02-02
> 09:13:46,259::resourceManager::989::Storage.ResourceManager.Owner::(cancelAll)
> Owner.cancelAll requests {}
> clientIFinit::DEBUG::2017-02-02
> 09:13:46,259::task::995::Storage.TaskManager.Task::(_decref)
> Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::ref 0 aborting False
> clientIFinit::INFO::2017-02-02
> 09:13:46,259::clientIF::558::vds::(_waitForStoragePool) recovery: waiting
> for storage pool to go up
> jsonrpc.Executor/0::INFO::2017-02-02
> 09:13:48,133::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In
> recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
> jsonrpc.Executor/0::INFO::2017-02-02
> 09:13:48,134::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call
> Host.getHardwareInfo failed (error 99) in 0.00 seconds
> jsonrpc.Executor/1::INFO::2017-02-02
> 09:13:50,140::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In
> recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
> jsonrpc.Executor/1::INFO::2017-02-02
> 09:13:50,140::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call
> Host.getHardwareInfo failed (error 99) in 0.00 seconds
> clientIFinit::DEBUG::2017-02-02
> 09:13:51,265::task::597::Storage.TaskManager.Task::(_updateState)
> Task=`e8c558b1-f5d0-49ea-ac92-51660d03636e`::moving from state init -> state
> preparing
> clientIFinit::INFO::2017-02-02
> 09:13:51,265::logUtils::49::dispatcher::(wrapper) Run and protect:
> getConnectedStoragePoolsList(options=None)
> clientIFinit::INFO::2017-02-02
> 09:13:

[ovirt-users] Host Issue

2017-02-02 Thread Bryan Sockel
Hi,

I came into the office this morning to an issue with my oVirt setup.  On
one of my hosts the / partition was completely full, causing the host to go
into an unknown state.  I was able to clear out some space for the time
being and am attempting to recover that host.  VMs are still running and
responding on the host.

I am using Gluster volumes in my configuration, and had to restart the
gluster service on that host.  I also restarted the ovirt-ha-agent service.

I am seeing this entry in my agent.log every two seconds:

MainThread::INFO::2017-02-02 
09:11:19,606::util::214::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(connect_vdsm_json_rpc)
Waiting for VDSM hardware info

In my vdsm.log I am seeing this:
jsonrpc.Executor/4::INFO::2017-02-02 
09:13:42,088::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In 
recovery, ignoring 'Host.getAllVmStats' in bridge with {}
jsonrpc.Executor/4::INFO::2017-02-02 
09:13:42,088::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call 
Host.getAllVmStats failed (error 99) in 0.00 seconds
jsonrpc.Executor/5::INFO::2017-02-02 
09:13:42,114::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In 
recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/5::INFO::2017-02-02 
09:13:42,115::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call 
Host.getHardwareInfo failed (error 99) in 0.00 seconds
jsonrpc.Executor/6::INFO::2017-02-02 
09:13:44,121::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In 
recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/6::INFO::2017-02-02 
09:13:44,122::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call 
Host.getHardwareInfo failed (error 99) in 0.00 seconds
jsonrpc.Executor/7::INFO::2017-02-02 
09:13:46,127::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In 
recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/7::INFO::2017-02-02 
09:13:46,127::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call 
Host.getHardwareInfo failed (error 99) in 0.00 seconds
clientIFinit::DEBUG::2017-02-02 
09:13:46,257::task::597::Storage.TaskManager.Task::(_updateState) 
Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::moving from state init -> state 
preparing
clientIFinit::INFO::2017-02-02 
09:13:46,258::logUtils::49::dispatcher::(wrapper) Run and protect: 
getConnectedStoragePoolsList(options=None)
clientIFinit::INFO::2017-02-02 
09:13:46,258::logUtils::52::dispatcher::(wrapper) Run and protect: 
getConnectedStoragePoolsList, Return response: {'poollist': []}
clientIFinit::DEBUG::2017-02-02 
09:13:46,258::task::1193::Storage.TaskManager.Task::(prepare) 
Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::finished: {'poollist': []}
clientIFinit::DEBUG::2017-02-02 
09:13:46,258::task::597::Storage.TaskManager.Task::(_updateState) 
Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::moving from state preparing -> 
state finished
clientIFinit::DEBUG::2017-02-02 
09:13:46,258::resourceManager::952::Storage.ResourceManager.Owner::(releaseAll) 
Owner.releaseAll requests {} resources {}
clientIFinit::DEBUG::2017-02-02 
09:13:46,259::resourceManager::989::Storage.ResourceManager.Owner::(cancelAll) 
Owner.cancelAll requests {}
clientIFinit::DEBUG::2017-02-02 
09:13:46,259::task::995::Storage.TaskManager.Task::(_decref) 
Task=`ba88b701-9f7a-488e-9c80-3d61cec38053`::ref 0 aborting False
clientIFinit::INFO::2017-02-02 
09:13:46,259::clientIF::558::vds::(_waitForStoragePool) recovery: waiting 
for storage pool to go up
jsonrpc.Executor/0::INFO::2017-02-02 
09:13:48,133::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In 
recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/0::INFO::2017-02-02 
09:13:48,134::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call 
Host.getHardwareInfo failed (error 99) in 0.00 seconds
jsonrpc.Executor/1::INFO::2017-02-02 
09:13:50,140::__init__::525::jsonrpc.JsonRpcServer::(_handle_request) In 
recovery, ignoring 'Host.getHardwareInfo' in bridge with {}
jsonrpc.Executor/1::INFO::2017-02-02 
09:13:50,140::__init__::513::jsonrpc.JsonRpcServer::(_serveRequest) RPC call 
Host.getHardwareInfo failed (error 99) in 0.00 seconds
clientIFinit::DEBUG::2017-02-02 
09:13:51,265::task::597::Storage.TaskManager.Task::(_updateState) 
Task=`e8c558b1-f5d0-49ea-ac92-51660d03636e`::moving from state init -> state 
preparing
clientIFinit::INFO::2017-02-02 
09:13:51,265::logUtils::49::dispatcher::(wrapper) Run and protect: 
getConnectedStoragePoolsList(options=None)
clientIFinit::INFO::2017-02-02 
09:13:51,265::logUtils::52::dispatcher::(wrapper) Run and protect: 
getConnectedStoragePoolsList, Return response: {'poollist': []}
clientIFinit::DEBUG::2017-02-02 
09:13:51,265::task::1193::Storage.TaskManager.Task::(prepare) 
Task=`e8c558b1-f5d0-49ea-ac92-51660d03636e`::finished: {'poollist': []}
clientIFinit::DEBUG::2017-02-02 
09:13:51,266::task::597::Storage.TaskManager.Task::(_updateState) 
Ta