Re: [ovirt-users] 3.5 to 3.6 upgrade stuck

2016-07-21 Thread Robert Story
On Thu, 21 Jul 2016 14:43:50 -0400 Robert wrote:
RS> So after some debugging with Simone on irc, we've determined that the issue
RS> is the agent timing out trying to communicate with the broker. The problem
RS> is that we have no idea why.

So more detail attached. The agent is sending:

   
MainThread::hosted_engine::436::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
  ::(start_monitoring) Processing engine state 
MainThread::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink
  ::(notify) Trying: notify time=1469129518.85 type=state_transition
  detail=StartState-ReinitializeFSM hostname='poseidon.netsec'
MainThread::brokerlink::273::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink
  ::(_communicate) Sending request: notify time=1469129518.85
  type=state_transition detail=StartState-ReinitializeFSM
  hostname='poseidon.netsec'

Which the broker sees:

Thread-1::util::69::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler
 ::(socket_readline) socket_readline in blocking mode
Thread-1::listener::163::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler
 ::(handle) Input: notify time=1469129518.85 type=state_transition 
detail=StartState-ReinitializeFSM hostname='poseidon.netsec'

It then refreshes the local config file:

Thread-1::config::251::ovirt_hosted_engine_ha.broker.notifications.Notifications.config
 ::(refresh_local_conf_file) Reading 'broker.conf' from 
'/rhev/data-center/mnt/ovirt-nfs.netsec:_ovirt_hosted-engine/2daba0ab-2b3d-4026-bcfc-1cd071c30038/images/a04a45b9-e780-4104-ad4b-d5901a5490c4/34a7

Which succeeds:

Thread-1::config::271::ovirt_hosted_engine_ha.broker.notifications.Notifications.config
 ::(refresh_local_conf_file) Writing to 
'/var/lib/ovirt-hosted-engine-ha/broker.conf'
Thread-1::config::278::ovirt_hosted_engine_ha.broker.notifications.Notifications.config
 ::(refresh_local_conf_file) local conf file was correctly written

And then nothing. It just hangs. Nothing more is logged by Thread-1.
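
For anyone following along, the symptom described here (the broker reads the request, never answers, and the agent's read times out) can be reproduced in miniature. This is only a sketch, not the ovirt-hosted-engine-ha code: `silent_broker` and `send_request` are hypothetical names, a socketpair stands in for the broker's socket, and the 0.5 second timeout stands in for the agent's 30 second one.

```python
import socket
import threading


def silent_broker(conn, received, release):
    """Read one request line, then never reply (simulates a stuck handler)."""
    with conn, conn.makefile("r") as reader:
        received.append(reader.readline().strip())
        release.wait()  # keep the connection open without answering


def send_request(sock, request, timeout):
    """Send one line and wait up to `timeout` seconds for a reply."""
    sock.settimeout(timeout)
    sock.sendall((request + "\n").encode())
    try:
        return sock.recv(4096).decode().strip()
    except socket.timeout:
        return None  # the peer stayed silent for the whole timeout


agent_side, broker_side = socket.socketpair()
received = []
release = threading.Event()
worker = threading.Thread(
    target=silent_broker, args=(broker_side, received, release)
)
worker.start()

reply = send_request(agent_side, "notify type=state_transition", timeout=0.5)

release.set()  # let the simulated broker thread exit
worker.join()
agent_side.close()

print("broker received:", received)
print("agent got reply:", reply)
```

The request arrives intact on the broker side while the agent gets nothing back, which matches the logs: the connection itself is fine, and whatever the handler does after reading the request is where it stalls.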



Robert

-- 
Senior Software Engineer @ Parsons




MainThread::DEBUG::2016-07-21 15:31:58,847::hosted_engine::436::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Processing engine state 
MainThread::INFO::2016-07-21 15:31:58,847::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Trying: notify time=1469129518.85 type=state_transition detail=StartState-ReinitializeFSM hostname='poseidon.netsec'
MainThread::DEBUG::2016-07-21 15:31:58,847::brokerlink::273::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(_communicate) Sending request: notify time=1469129518.85 type=state_transition detail=StartState-ReinitializeFSM hostname='poseidon.netsec'
MainThread::DEBUG::2016-07-21 15:31:58,848::util::77::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(socket_readline) socket_readline with 30.0 seconds timeout
MainThread::DEBUG::2016-07-21 15:32:28,866::util::88::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(socket_readline) Connection timeout while reading from socket
MainThread::ERROR::2016-07-21 15:32:28,867::brokerlink::279::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(_communicate) Connection closed: Connection timed out
MainThread::DEBUG::2016-07-21 15:32:28,867::brokerlink::86::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(disconnect) Closing connection to ha-broker
MainThread::ERROR::2016-07-21 15:32:28,867::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: 'Failed to start monitor state_transition, options {'hostname': 'poseidon.netsec'}: Connection timed out' - trying to restart agent
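
A quick way to confirm where the time goes in a log like the one above is to measure the gaps between consecutive entries. The sketch below assumes the line layout seen in this agent log (thread::LEVEL::timestamp::...); `parse_stamp` and `largest_gap` are hypothetical helper names, not part of any oVirt tool.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the agent log above:
# thread::LEVEL::YYYY-MM-DD HH:MM:SS,mmm::module::lineno::logger::(func) message
LOG_LINE = re.compile(
    r"^(?P<thread>[^:]+)::(?P<level>[A-Z]+)::"
    r"(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
)


def parse_stamp(line):
    """Return the entry's timestamp, or None if the line does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    return datetime.strptime(match.group("stamp"), "%Y-%m-%d %H:%M:%S,%f")


def largest_gap(lines):
    """Return (seconds, line_before, line_after) for the longest silence."""
    stamped = [(parse_stamp(line), line) for line in lines]
    stamped = [(stamp, line) for stamp, line in stamped if stamp is not None]
    best = None
    for (t0, before), (t1, after) in zip(stamped, stamped[1:]):
        gap = (t1 - t0).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, before, after)
    return best


# Three consecutive entries copied from the agent log above.
SAMPLE = [
    "MainThread::DEBUG::2016-07-21 15:31:58,848::util::77::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(socket_readline) socket_readline with 30.0 seconds timeout",
    "MainThread::DEBUG::2016-07-21 15:32:28,866::util::88::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(socket_readline) Connection timeout while reading from socket",
    "MainThread::ERROR::2016-07-21 15:32:28,867::brokerlink::279::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(_communicate) Connection closed: Connection timed out",
]

gap, before, after = largest_gap(SAMPLE)
print(f"longest silence: {gap:.3f} s")
print("before:", before.split("::")[-1])
print("after: ", after.split("::")[-1])
```

On these lines the longest silence is the 30 second wait inside socket_readline. Running the same check over broker.log, ideally one thread at a time, would show which broker call is the last one logged before the stall.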


Thread-1::util::69::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler
 ::(socket_readline) socket_readline in blocking mode
Thread-1::listener::163::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler
 ::(handle) Input: notify time=1469129518.85 type=state_transition detail=StartState-ReinitializeFSM hostname='poseidon.netsec'
Thread-1::listener::238::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler
 ::(_dispatch) Request type notify from 139793244509952
Thread-1::notifications::46::ovirt_hosted_engine_ha.broker.notifications.Notifications
 ::(notify) nofity: {'hostname': 'poseidon.netsec', 'type': 'state_transition', 'detail': 'StartState-ReinitializeFSM', 'time': '1469129518.85'}
Thread-1::config::251::ovirt_hosted_engine_ha.broker.notifications.Notifications.config
 ::(refresh_local_conf_file) Reading 'broker.conf' from 

Re: [ovirt-users] 3.5 to 3.6 upgrade stuck

2016-07-21 Thread Robert Story
So after some debugging with Simone on irc, we've determined that the issue
is the agent timing out trying to communicate with the broker. The problem
is that we have no idea why.

Thread-942::INFO::2016-07-21 09:19:51,934::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) Connection established
Thread-942::INFO::2016-07-21 09:19:51,936::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Connection closed
Thread-943::INFO::2016-07-21 09:19:53,099::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) Connection established
Thread-943::INFO::2016-07-21 09:19:53,554::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Connection closed

Thread-135::DEBUG::2016-07-21 09:47:34,941::util::69::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(socket_readline) socket_readline in blocking mode
Thread-135::DEBUG::2016-07-21 09:47:34,941::util::99::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(socket_readline) Connection closed while reading from socket


So I tried to reinstall instead:

- host -> maintenance
- host removed from cluster
- yum remove ovirt\*
- yum install ovirt-hosted-engine-setup
- hosted-engine --deploy
  - chose new node id
  - reused same name/hostname

Once the host activated, it went right back to the same state.

I'm open to any suggestions to get me back on track. The engine is at
3.6.7, but functioning hosts are still at 3.5.x. Should I try to upgrade
the engine and a host to 4.0.x? I had planned on having a stable 3.6 system
for a few days before trying to jump to 4.0. Or is there some way to go
back to 3.5?


Robert

-- 
Senior Software Engineer @ Parsons


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] 3.5 to 3.6 upgrade stuck

2016-07-21 Thread Simone Tiraboschi
On Thu, Jul 21, 2016 at 7:14 AM, Robert Story wrote:
> I have a 3.5 hosted-engine with 5 el7 nodes. Today I tried upgrading to 3.6.
> The engine upgrade went great, no problems.
>
> I had a host in maintenance mode, so I added the 3.6 repos and ran yum
> update. I waited for the upgrade successful message. I checked the score
> for the node, and it was still 2400, not 3400. Tried rebooting, but no
> luck. So I put another host in maintenance mode, and had the same result.

MainThread::INFO::2016-07-20 23:44:30,352::upgrade::1031::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(upgrade_35_36) Successfully upgraded

Everything seems OK on the upgrade path.


> Both nodes are getting this error:
>
> MainThread::ERROR::2016-07-21 
> 01:05:04,187::brokerlink::279::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(_communicate)
>  Connection closed: Connection timed out
> MainThread::ERROR::2016-07-21 
> 01:05:04,188::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent)
>  Error: 'Failed to start monitor , options {'hostname': 
> 'poseidon.netsec'}: Connection timed out' - trying to restart agent

Can you please also attach broker.log? Maybe the issue is somewhere else.

> I've attached logs from the second host coming up after a reboot, along
> with engine log from the same timeframe.
>
> Any suggestions on a way forward would be greatly appreciated.
>
>
> Robert
>
> --
> Senior Software Engineer @ Parsons
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>


Re: [ovirt-users] 3.5 to 3.6 upgrade

2016-07-02 Thread gregor
Can you please add your solution?

cheers
gregor

On 16/06/16 01:14, Fernando Fuentes wrote:
> 
> I think I fixed my own issue.
> 
> TIA! :D
> 


Re: [ovirt-users] 3.5 to 3.6 upgrade

2016-06-15 Thread Fernando Fuentes

I think I fixed my own issue.

TIA! :D

-- 
Fernando Fuentes
ffuen...@txweather.org
http://www.txweather.org

On Wed, Jun 15, 2016, at 04:34 PM, Fernando Fuentes wrote:
> Hello All!
> 
> I am upgrading from 3.5 to 3.6 and I am running into some issues with
> yum...
> 
> I am running CentOS 6.8 x86_64
> 
> Any ideas?
> 
> Resolving Dependencies
> --> Running transaction check
> ---> Package ovirt-engine-setup.noarch 0:3.5.6.2-1.el6 will be updated
> ---> Package ovirt-engine-setup.noarch 0:3.6.6.2-1.el6 will be an update
> ---> Package ovirt-engine-setup-base.noarch 0:3.5.6.2-1.el6 will be
> updated
> ---> Package ovirt-engine-setup-base.noarch 0:3.6.6.2-1.el6 will be an
> update
> --> Processing Dependency: ovirt-engine-lib >= 3.6.6.2-1.el6 for
> package: ovirt-engine-setup-base-3.6.6.2-1.el6.noarch
> --> Processing Dependency: otopi >= 1.4.1 for package:
> ovirt-engine-setup-base-3.6.6.2-1.el6.noarch
> ---> Package ovirt-engine-setup-plugin-ovirt-engine.noarch
> 0:3.5.6.2-1.el6 will be updated
> --> Processing Dependency: ovirt-engine-setup-plugin-ovirt-engine =
> 3.5.6.2-1.el6 for package:
> ovirt-engine-setup-plugin-allinone-3.5.6.2-1.el6.noarch
> ---> Package ovirt-engine-setup-plugin-ovirt-engine.noarch
> 0:3.6.6.2-1.el6 will be an update
> --> Processing Dependency:
> ovirt-engine-setup-plugin-vmconsole-proxy-helper = 3.6.6.2-1.el6 for
> package: ovirt-engine-setup-plugin-ovirt-engine-3.6.6.2-1.el6.noarch
> --> Processing Dependency: ovirt-engine-extension-aaa-jdbc for package:
> ovirt-engine-setup-plugin-ovirt-engine-3.6.6.2-1.el6.noarch
> ---> Package ovirt-engine-setup-plugin-ovirt-engine-common.noarch
> 0:3.5.6.2-1.el6 will be updated
> ---> Package ovirt-engine-setup-plugin-ovirt-engine-common.noarch
> 0:3.6.6.2-1.el6 will be an update
> --> Processing Dependency: ovirt-setup-lib for package:
> ovirt-engine-setup-plugin-ovirt-engine-common-3.6.6.2-1.el6.noarch
> ---> Package ovirt-engine-setup-plugin-websocket-proxy.noarch
> 0:3.5.6.2-1.el6 will be updated
> ---> Package ovirt-engine-setup-plugin-websocket-proxy.noarch
> 0:3.6.6.2-1.el6 will be an update
> --> Running transaction check
> ---> Package otopi.noarch 0:1.3.2-1.el6 will be updated
> --> Processing Dependency: otopi = 1.3.2-1.el6 for package:
> otopi-java-1.3.2-1.el6.noarch
> ---> Package otopi.noarch 0:1.4.1-1.el6 will be an update
> ---> Package ovirt-engine-extension-aaa-jdbc.noarch 0:1.0.7-1.el6 will
> be installed
> ---> Package ovirt-engine-lib.noarch 0:3.5.6.2-1.el6 will be updated
> ---> Package ovirt-engine-lib.noarch 0:3.6.6.2-1.el6 will be an update
> ---> Package ovirt-engine-setup-plugin-ovirt-engine.noarch
> 0:3.5.6.2-1.el6 will be updated
> --> Processing Dependency: ovirt-engine-setup-plugin-ovirt-engine =
> 3.5.6.2-1.el6 for package:
> ovirt-engine-setup-plugin-allinone-3.5.6.2-1.el6.noarch
> ---> Package ovirt-engine-setup-plugin-vmconsole-proxy-helper.noarch
> 0:3.6.6.2-1.el6 will be installed
> ---> Package ovirt-setup-lib.noarch 0:1.0.1-1.el6 will be installed
> --> Running transaction check
> ---> Package otopi-java.noarch 0:1.3.2-1.el6 will be updated
> ---> Package otopi-java.noarch 0:1.4.1-1.el6 will be an update
> ---> Package ovirt-engine-setup-plugin-ovirt-engine.noarch
> 0:3.5.6.2-1.el6 will be updated
> --> Processing Dependency: ovirt-engine-setup-plugin-ovirt-engine =
> 3.5.6.2-1.el6 for package:
> ovirt-engine-setup-plugin-allinone-3.5.6.2-1.el6.noarch
> --> Processing Conflict: ovirt-engine-setup-base-3.6.6.2-1.el6.noarch
> conflicts ovirt-engine-reports-setup < 3.6.0
> --> Restarting Dependency Resolution with new changes.
> --> Running transaction check
> ---> Package ovirt-engine-reports-setup.noarch 0:3.5.5-2.el6 will be
> updated
> ---> Package ovirt-engine-reports-setup.noarch 0:3.6.5.1-1.el6 will be
> an update
> ---> Package ovirt-engine-setup-plugin-ovirt-engine.noarch
> 0:3.5.6.2-1.el6 will be updated
> --> Processing Dependency: ovirt-engine-setup-plugin-ovirt-engine =
> 3.5.6.2-1.el6 for package:
> ovirt-engine-setup-plugin-allinone-3.5.6.2-1.el6.noarch
> --> Processing Conflict: ovirt-engine-setup-base-3.6.6.2-1.el6.noarch
> conflicts ovirt-engine-dwh-setup < 3.6.0
> --> Restarting Dependency Resolution with new changes.
> --> Running transaction check
> ---> Package ovirt-engine-dwh-setup.noarch 0:3.5.5-1.el6 will be updated
> ---> Package ovirt-engine-dwh-setup.noarch 0:3.6.6-1.el6 will be an
> update
> ---> Package ovirt-engine-setup-plugin-ovirt-engine.noarch
> 0:3.5.6.2-1.el6 will be updated
> --> Processing Dependency: ovirt-engine-setup-plugin-ovirt-engine =
> 3.5.6.2-1.el6 for package:
> ovirt-engine-setup-plugin-allinone-3.5.6.2-1.el6.noarch
> --> Finished Dependency Resolution
> Error: Package: ovirt-engine-setup-plugin-allinone-3.5.6.2-1.el6.noarch (@ovirt-3.5)
>            Requires: ovirt-engine-setup-plugin-ovirt-engine = 3.5.6.2-1.el6
>            Removing: ovirt-engine-setup-plugin-ovirt-engine-3.5.6.2-1.el6.noarch (@ovirt-3.5)
>