On Fri, Feb 3, 2017 at 10:54 AM, Ralf Schenk <r...@databay.de> wrote:

> Hello,
>
> I upgraded my cluster of 8 hosts with gluster storage and
> hosted-engine-ha. They were already on CentOS 7.3 and using oVirt 4.0.6
> and gluster 3.7.x packages from storage-sig testing.
>
> The storage is missing from the Storage tab, but a bug is already filed
> for this. Increasing the cluster and storage compatibility level, and also
> "reset emulated machine", after upgrading one host after another without
> needing to shut down VMs works well. (VMs get a sign that there will be
> changes after reboot.)
>
> Important: you also have to issue a yum update on the host to upgrade
> additional components, e.g. gluster to 3.8.x. I was frightened of this
> step, but it worked well except for a configuration issue I was
> responsible for in gluster.vol (I had "transport socket, rdma").
>
> Bugs/Quirks so far:
>
> 1. After restarting a single VM that used an RNG device I got an error (it
> was in German) along the lines of "RNG Device not supported by cluster".
> I had to disable the RNG device and save the settings, then open the
> settings again and enable the RNG device. Then the machine boots up.
> I think there is a migration step missing from /dev/random to /dev/urandom
> for existing VMs.
>

Tomas, Francesco, Michal, can you please follow up on this?



> 2. I'm missing any gluster-specific management features, as my gluster is
> not manageable in any way from the GUI. I expected to see my gluster now in
> the dashboard and be able to add volumes etc. What do I need to do to
> "import" my existing gluster (only one volume so far) to be manageable?
>

Sahina, can you please follow up on this?


> 3. Three of my hosts have the hosted engine deployed for HA. At first all
> three were marked by a crown (running was gold and the others were silver).
> After upgrading the 3 Host, deployed hosted engine HA is not active anymore.
>
> I can't get this host back with a working ovirt-ha-agent/broker. I already
> rebooted and manually restarted the services, but it isn't able to get the
> cluster state according to
> "hosted-engine --vm-status". The other hosts report this host's status as
> "unknown stale-data".
>
> I already shut down all agents on all hosts and issued a "hosted-engine
> --reinitialize-lockspace" but that didn't help.
>
> The agent stops working after a timeout error according to the log:
>
> MainThread::INFO::2017-02-02 19:24:52,040::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
> MainThread::INFO::2017-02-02 19:24:59,185::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
> MainThread::INFO::2017-02-02 19:25:06,333::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
> MainThread::INFO::2017-02-02 19:25:13,554::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
> MainThread::INFO::2017-02-02 19:25:20,710::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
> MainThread::INFO::2017-02-02 19:25:27,865::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
> MainThread::ERROR::2017-02-02 19:25:27,866::hosted_engine::815::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_domain_monitor) Failed to start monitoring domain (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96, host_id=3): timeout during domain acquisition
> MainThread::WARNING::2017-02-02 19:25:27,866::hosted_engine::469::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Error while monitoring engine: Failed to start monitoring domain (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96, host_id=3): timeout during domain acquisition
> MainThread::WARNING::2017-02-02 19:25:27,866::hosted_engine::472::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Unexpected error
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 443, in start_monitoring
>     self._initialize_domain_monitor()
>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 816, in _initialize_domain_monitor
>     raise Exception(msg)
> Exception: Failed to start monitoring domain (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96, host_id=3): timeout during domain acquisition
> MainThread::ERROR::2017-02-02 19:25:27,866::hosted_engine::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Shutting down the agent because of 3 failures in a row!
> MainThread::INFO::2017-02-02 19:25:32,087::hosted_engine::841::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
> MainThread::INFO::2017-02-02 19:25:34,250::hosted_engine::769::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_domain_monitor) Failed to stop monitoring domain (sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96): Storage domain is member of pool: u'domain=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96'
> MainThread::INFO::2017-02-02 19:25:34,254::agent::143::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down
>
Simone, Martin, can you please follow up on this?
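
For anyone triaging a similar agent log, a minimal Python sketch for
counting entries per level and pulling out the ERROR records might look
like the following. The field layout in the regex is inferred only from the
excerpt quoted above, and parse_line / summarize are hypothetical helper
names, not part of ovirt-hosted-engine-ha. It expects each log entry on a
single line, i.e. with the mail client's line wrapping and "> " quoting
undone.

```python
import re
from collections import Counter
from datetime import datetime

# Layout assumed from the quoted excerpt:
# thread::LEVEL::timestamp::module::lineno::logger::(function) message
LINE_RE = re.compile(
    r"^(?P<thread>[^:]+)::(?P<level>[A-Z]+)::"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>[^:]+)::(?P<lineno>\d+)::"
    r"(?P<logger>[^:]+)::\((?P<func>[^)]+)\) (?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log entry, or None if the line
    does not match the assumed layout (e.g. a traceback line)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["ts"] = datetime.strptime(record["ts"], "%Y-%m-%d %H:%M:%S,%f")
    return record


def summarize(lines):
    """Count entries per log level and collect the ERROR records."""
    records = [r for r in map(parse_line, lines) if r is not None]
    levels = Counter(r["level"] for r in records)
    errors = [r for r in records if r["level"] == "ERROR"]
    return levels, errors


# Sample entries copied from the log quoted in this thread.
SAMPLE = [
    "MainThread::INFO::2017-02-02 19:24:52,040::hosted_engine::841::"
    "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::"
    "(_get_domain_monitor_status) VDSM domain monitor status: PENDING",
    "MainThread::ERROR::2017-02-02 19:25:27,866::hosted_engine::815::"
    "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::"
    "(_initialize_domain_monitor) Failed to start monitoring domain "
    "(sd_uuid=7c8deaa8-be02-4aaf-b9b4-ddc8da99ad96, host_id=3): "
    "timeout during domain acquisition",
    "Traceback (most recent call last):",
]
```

Calling summarize(SAMPLE) returns the per-level counts and the list of
ERROR records; lines that do not fit the layout, such as the traceback,
are skipped rather than guessed at.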




>
>
> The gluster volume of the engine is mounted correctly on the host and
> accessible. Files are also readable etc. No clue what to do.
>
> 4. Last but not least: oVirt is still using fuse to access VM disks on
> Gluster. I know it is scheduled for 4.1.1, but it was already there in
> 3.5.x and was scheduled for every release since then. I had this feature
> with opennebula already two years ago and performance is so much better.
> So please GET IT IN!
>
We're aware of the performance increase; the storage and gluster teams are
working on it. Maybe Sahina or Allon can follow up with the current status
of the feature.




> Bye
>
>
>
> Am 02.02.2017 um 13:19 schrieb Sandro Bonazzola:
>
> Hi,
> did you install/update to 4.1.0? Let us know your experience!
> We end up knowing only when things don't work well, so let us know if it
> works fine for you :-)
>
>
> --
>
>
> *Ralf Schenk*
> fon +49 (0) 24 05 / 40 83 70
> fax +49 (0) 24 05 / 40 83 759
> mail r...@databay.de
>
> *Databay AG*
> Jens-Otto-Krag-Straße 11
> D-52146 Würselen
> *www.databay.de* <http://www.databay.de>
>
> Sitz/Amtsgericht Aachen • HRB:8437 • USt-IdNr.: DE 210844202
> Vorstand: Ralf Schenk, Dipl.-Ing. Jens Conze, Aresch Yavari, Dipl.-Kfm.
> Philipp Hermanns
> Aufsichtsratsvorsitzender: Wilhelm Dohmen
> ------------------------------
>
> _______________________________________________
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>


-- 
Sandro Bonazzola
Better technology. Faster innovation. Powered by community collaboration.
See how it works at redhat.com
