Re: [ovirt-users] First oVirt engine deploy: missing gateway on hosts
Hi Mauro,

Creating distributed disperse volumes is not supported from the oVirt UI yet, but you should be able to sync them if the cluster is imported into the UI. The same holds true for add / remove brick operations on disperse and distributed disperse volumes.

You won't be able to see the bricks you created because oVirt expects them to be mounted under /rhgs/. What you could simply do is uncheck the checkbox in the 'Add brick' dialog and type in the path, or you could mount your bricks at the location above, which would show all the available bricks on the host.

Hope this helps.

Thanks
kasturi

On Sat, Sep 2, 2017 at 7:16 PM, Mauro Tridici wrote:

> Hi all,
>
> I just started my first oVirt Engine deploy using a dedicated (and
> separated) virtual machine.
> I’m trying to create and manage a test Gluster cluster using 3 “virtual”
> hosts (hostnames are glu01, glu02, glu03).
> 2 different networks have been defined on the hosts (192.168.213.0/24 for
> the management network and 192.168.152.0/24 for the gluster network).
> The oVirt Engine deploy completed without any problem; the hosts have been
> added easily using the ovirtmgmt network (bridgeless mgmt network) and
> ovirtgluster (bridgeless gluster network).
>
> Everything seems to be ok for this first deploy, but I just noticed that
> the gateway is missing on the target hosts:
>
> [root@glu01 ~]# route
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> link-local      0.0.0.0         255.255.0.0     U     1002   0        0 ens33
> link-local      0.0.0.0         255.255.0.0     U     1003   0        0 ens34
> 192.168.152.0   0.0.0.0         255.255.255.0   U     0      0        0 ens34
> 192.168.213.0   0.0.0.0         255.255.255.0   U     0      0        0 ens33
>
> [root@glu02 ~]# route
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> link-local      0.0.0.0         255.255.0.0     U     1002   0        0 ens33
> link-local      0.0.0.0         255.255.0.0     U     1003   0        0 ens34
> 192.168.152.0   0.0.0.0         255.255.255.0   U     0      0        0 ens34
> 192.168.213.0   0.0.0.0         255.255.255.0   U     0      0        0 ens33
>
> [root@glu03 ~]# route
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
> link-local      0.0.0.0         255.255.0.0     U     1002   0        0 ens33
> link-local      0.0.0.0         255.255.0.0     U     1003   0        0 ens34
> 192.168.152.0   0.0.0.0         255.255.255.0   U     0      0        0 ens34
> 192.168.213.0   0.0.0.0         255.255.255.0   U     0      0        0 ens33
>
> Due to this problem I cannot reach the internet from the ens33 NIC
> (management network).
> I tried to add the gateway in the ifcfg-ens33 configuration file, but the
> gateway disappears after a host reboot.
>
> [root@glu01 ~]# cat /etc/sysconfig/network-scripts/ifcfg-ens33
> # Generated by VDSM version 4.19.28-1.el7.centos
> DEVICE=ens33
> ONBOOT=yes
> IPADDR=192.168.213.151
> NETMASK=255.255.255.0
> BOOTPROTO=none
> MTU=1500
> DEFROUTE=no
> NM_CONTROLLED=no
> IPV6INIT=yes
> IPV6_AUTOCONF=yes
>
> The oVirt Engine network configuration is the following one:
>
> [host glu01]
> ens33 -> ovirtmgmt (192.168.213.151, 255.255.255.0, 192.168.213.2)
> ens34 -> ovirtgluster (192.168.152.151, 255.255.255.0)
>
> [host glu02]
> ens33 -> ovirtmgmt (192.168.213.152, 255.255.255.0, 192.168.213.2)
> ens34 -> ovirtgluster (192.168.152.152, 255.255.255.0)
>
> [host glu03]
> ens33 -> ovirtmgmt (192.168.213.153, 255.255.255.0, 192.168.213.2)
> ens34 -> ovirtgluster (192.168.152.153, 255.255.255.0)
>
> Do you know the right way to set the gateway IP on all hosts?
>
> Just two last questions: I was able to import an existing gluster cluster
> using oVirt Engine, but I’m not able to create a new volume because:
>
> - I can’t select a distributed disperse volume configuration from the
>   oVirt Engine volume creation window
> - I can’t see the bricks to be used to create a new volume (but I can
>   import an existing volume without problem).
>
> Is there something that I can do to resolve the issues and complete my
> first experience with oVirt?
>
> Thank you very much,
> Mauro T.
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
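A quick way to confirm what Mauro's `route` output shows - no default route at all - is to scan the table for a 0.0.0.0 (or "default") destination. The sketch below is illustrative, not an oVirt tool; the sample table mirrors glu01's output:

```shell
#!/bin/sh
# Sketch: check whether a kernel routing table contains a default route.
# In real use, pipe `route -n` (or `ip route`) from the host into the
# function; here a sample table like glu01's is fed in for illustration.
check_default_route() {
    # A default route appears with destination 0.0.0.0 (route -n)
    # or the literal word "default" (ip route).
    if awk '$1 == "0.0.0.0" || $1 == "default" { found = 1 } END { exit !found }'; then
        echo "default route present"
    else
        echo "NO default route - host cannot reach outside networks"
    fi
}

check_default_route <<'EOF'
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
link-local      0.0.0.0         255.255.0.0     U     1002   0        0 ens33
192.168.152.0   0.0.0.0         255.255.255.0   U     0      0        0 ens34
192.168.213.0   0.0.0.0         255.255.255.0   U     0      0        0 ens33
EOF
```

A non-persistent workaround would be `ip route add default via 192.168.213.2 dev ens33`. For a persistent fix, the gateway normally has to be entered through the engine's Setup Host Networks dialog for the ovirtmgmt network, since VDSM owns and rewrites the ifcfg files (note the "Generated by VDSM" header and `DEFROUTE=no` above - that is why a hand-edited gateway disappears after reboot). Treat the exact dialog wording as an assumption about this oVirt version.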
Re: [ovirt-users] hyperconverged question
Hi Charles,

The right option is 'backup-volfile-servers' and not 'backupvolfile-server'. Can you please use the first one and test?

Thanks
kasturi

On Sat, Sep 2, 2017 at 5:23 AM, Charles Kozler wrote:

> Jim -
>
> result of this test... engine crashed, but all VMs on the gluster domain
> (backed by the same physical nodes/hardware/gluster process/etc) stayed up
> fine
>
> I guess there is some functional difference between 'backupvolfile-server'
> and 'backup-volfile-servers'?
>
> Perhaps try the latter and see what happens. My next test is going to be to
> configure hosted-engine.conf with backupvolfile-server=node2:node3 and
> see if the engine VM still shuts down. Seems odd the engine VM would shut
> itself down (or vdsm would shut it down) but not other VMs. Perhaps built-in
> HA functionality of sorts
>
> On Fri, Sep 1, 2017 at 7:38 PM, Charles Kozler wrote:
>
>> Jim -
>>
>> One thing I noticed is that, by accident, I used
>> 'backupvolfile-server=node2:node3', which is apparently a supported
>> setting. It would appear, by reading the man page of mount.glusterfs, that
>> the syntax is slightly different. Not sure if my setting being different
>> has different impacts
>>
>> hosted-engine.conf:
>>
>> # cat /etc/ovirt-hosted-engine/hosted-engine.conf | grep -i option
>> mnt_options=backup-volfile-servers=node2:node3
>>
>> And for my datatest gluster domain I have:
>>
>> backupvolfile-server=node2:node3
>>
>> I am now curious what happens when I move everything to node1 and drop
>> node2
>>
>> To that end, will follow up with that test
>>
>> On Fri, Sep 1, 2017 at 7:20 PM, Charles Kozler wrote:
>>
>>> Jim -
>>>
>>> here is my test:
>>>
>>> - All VMs on node2: hosted engine and 1 test VM
>>> - Test VM on gluster storage domain (with mount options set)
>>> - hosted engine is on gluster as well, with settings persisted to
>>>   hosted-engine.conf for backupvol
>>>
>>> All VMs stayed up.
>>> Nothing in dmesg of the test VM indicating a pause
>>> or an issue or anything
>>>
>>> However, what I did notice during this is that my /datatest volume
>>> doesn't have quorum set. So I will set that now and report back what
>>> happens
>>>
>>> # gluster volume info datatest
>>>
>>> Volume Name: datatest
>>> Type: Replicate
>>> Volume ID: 229c25f9-405e-4fe7-b008-1d3aea065069
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 1 x 3 = 3
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: node1:/gluster/data/datatest/brick1
>>> Brick2: node2:/gluster/data/datatest/brick1
>>> Brick3: node3:/gluster/data/datatest/brick1
>>> Options Reconfigured:
>>> transport.address-family: inet
>>> nfs.disable: on
>>>
>>> Perhaps quorum may be more trouble than it's worth when you have 3 nodes
>>> and/or 2 nodes + arbiter?
>>>
>>> Since I am keeping my 3rd node out of ovirt, I am more content on
>>> keeping it as a warm spare if I **had** to swap it in to the ovirt
>>> cluster, but it keeps my storage 100% quorum
>>>
>>> On Fri, Sep 1, 2017 at 5:18 PM, Jim Kusznir wrote:

I can confirm that I did set it up manually, and I did specify backupvol,
and in the "manage domain" storage settings, I do have under mount options,
backup-volfile-servers=192.168.8.12:192.168.8.13 (and this was done at
initial install time).

The "used managed gluster" checkbox is NOT checked, and if I check it and
save settings, next time I go in it is not checked.
--Jim

On Fri, Sep 1, 2017 at 2:08 PM, Charles Kozler wrote:

> @Jim - here is my setup, which I will test in a few (brand new
> cluster) and report back what I found in my tests
>
> - 3x servers direct connected via 10Gb
> - 2 of those 3 set up in ovirt as hosts
> - Hosted engine
> - Gluster replica 3 (no arbiter) for all volumes
> - 1x engine volume gluster replica 3 manually configured (not using
>   ovirt managed gluster)
> - 1x datatest volume (20gb) replica 3 manually configured (not using
>   ovirt managed gluster)
> - 1x nfstest domain served from some other server in my infrastructure
>   which, at the time of my original testing, was master domain
>
> I tested this earlier and all VMs stayed online. However, the ovirt
> cluster reported DC/cluster down; all VMs stayed up
>
> As I am now typing this, can you confirm you set up your gluster
> storage domain with backupvol? Also, confirm you updated
> hosted-engine.conf with the backupvol mount option as well?
>
> On Fri, Sep 1, 2017 at 4:22 PM, Jim Kusznir
> wrote:
>
>> So, after reading the first document twice and the 2nd link
>> thoroughly once, I believe that the arbitrator volume should be
>> sufficient and count for replica / split brain. E.g., if any one full
>> replica is down, and the arbitrator and the other replica are up, then
>> it should have quorum and all should be good.
>>
>> I think my underlying problem has to do more with conf
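Since much of this thread turns on a one-character difference in the option name - `backup-volfile-servers` (the list form Kasturi recommends) versus the older `backupvolfile-server` spelling Charles used by accident - a small sanity check against hosted-engine.conf can catch the mix-up. The option names come from the thread itself; the checker script is just an illustrative sketch:

```shell
#!/bin/sh
# Sketch: flag which backup volfile spelling a hosted-engine.conf style
# file uses. Illustrative only; the sample config mirrors the mnt_options
# line quoted in the thread.
check_mnt_options() {
    if grep -q 'backup-volfile-servers=' "$1"; then
        echo "ok: backup-volfile-servers (list form) in use"
    elif grep -q 'backupvolfile-server=' "$1"; then
        echo "warning: old backupvolfile-server spelling found"
    else
        echo "no backup volfile option set"
    fi
}

conf=$(mktemp)
cat > "$conf" <<'EOF'
mnt_options=backup-volfile-servers=node2:node3
EOF
check_mnt_options "$conf"
rm -f "$conf"
```

In real use the file would be /etc/ovirt-hosted-engine/hosted-engine.conf, as shown in Charles's grep above; consult the mount.glusterfs man page for the authoritative option syntax.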
Re: [ovirt-users] hyperconverged question
Hi Jim,

I looked at the gluster volume info, and that looks fine to me. The recommended config is arbiter for data and vmstore; for engine it should be replica 3, since we would want HE to be available always.

If I understand right, the problem you are facing is that when you shut down one of the nodes, all the HE VMs and app VMs go into a paused state, right?

For debugging further, and to ensure that the volume has been mounted using the backup-volfile-servers option, you can move the storage domain to maintenance (which will unmount the volume), then activate it back (which will mount it again). During this time you can check the mount command passed in the vdsm logs, and it should have the backup-volfile-servers option.

Can you please confirm that you have ovirt-guest-agent installed on the app VMs and power management enabled? ovirt-guest-agent is required on the app VMs to ensure HA functionality.

Thanks
kasturi

On Sat, Sep 2, 2017 at 2:48 AM, Jim Kusznir wrote:

> I can confirm that I did set it up manually, and I did specify backupvol,
> and in the "manage domain" storage settings, I do have under mount
> options, backup-volfile-servers=192.168.8.12:192.168.8.13 (and this was
> done at initial install time).
>
> The "used managed gluster" checkbox is NOT checked, and if I check it and
> save settings, next time I go in it is not checked.
>
> --Jim
>
> On Fri, Sep 1, 2017 at 2:08 PM, Charles Kozler
> wrote:
>
>> @Jim - here is my setup, which I will test in a few (brand new cluster)
>> and report back what I found in my tests
>>
>> - 3x servers direct connected via 10Gb
>> - 2 of those 3 set up in ovirt as hosts
>> - Hosted engine
>> - Gluster replica 3 (no arbiter) for all volumes
>> - 1x engine volume gluster replica 3 manually configured (not using ovirt
>>   managed gluster)
>> - 1x datatest volume (20gb) replica 3 manually configured (not using
>>   ovirt managed gluster)
>> - 1x nfstest domain served from some other server in my infrastructure
>>   which, at the time of my original testing, was master domain
>>
>> I tested this earlier and all VMs stayed online. However, the ovirt
>> cluster reported DC/cluster down; all VMs stayed up
>>
>> As I am now typing this, can you confirm you set up your gluster storage
>> domain with backupvol? Also, confirm you updated hosted-engine.conf with
>> the backupvol mount option as well?
>>
>> On Fri, Sep 1, 2017 at 4:22 PM, Jim Kusznir wrote:
>>
>>> So, after reading the first document twice and the 2nd link thoroughly
>>> once, I believe that the arbitrator volume should be sufficient and count
>>> for replica / split brain. E.g., if any one full replica is down, and the
>>> arbitrator and the other replica are up, then it should have quorum and
>>> all should be good.
>>>
>>> I think my underlying problem has to do more with config than the
>>> replica state. That said, I did size the drive on my 3rd node planning to
>>> have an identical copy of all data on it, so I'm still not opposed to
>>> making it a full replica.
>>>
>>> Did I miss something here?
>>>
>>> Thanks!
>>>
>>> On Fri, Sep 1, 2017 at 11:59 AM, Charles Kozler
>>> wrote:

These can get a little confusing, but this explains it best:
https://gluster.readthedocs.io/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/#replica-2-and-replica-3-volumes

Basically, in the first paragraph they are explaining why you can't have HA with quorum for 2 nodes. Here is another overview doc that explains some more:
http://openmymind.net/Does-My-Replica-Set-Need-An-Arbiter/

From my understanding, arbiter is good for resolving split brains. Quorum and arbiter are two different things, though: quorum is a mechanism to help you **avoid** split brain, and the arbiter is to help gluster resolve split brain by voting and other internal mechanics (as outlined in link 1).

How did you create the volume exactly - what command? It looks to me like you created it with 'gluster volume create replica 2 arbiter 1 {}' per your earlier mention of "replica 2 arbiter 1". That being said, if you did that and then set up quorum in the volume configuration, this would cause your gluster to halt since quorum was lost (as you saw, until you recovered node 1).

As you can see from the docs, there is still a corner case for getting into split brain with replica 3, which again, is where arbiter would help gluster resolve it.

I need to amend my previous statement: I was told that the arbiter volume does not store data, only metadata. I cannot find anything in the docs backing this up; however, it would make sense for it to be so. That being said, in my setup, I would not include my arbiter or my third node in my ovirt VM cluster component. I would keep it completely separate.

On Fri, Sep 1, 2017 at 2:46 PM, Jim Kusznir wrote:

> I'm now also confused as to what the point of an arb
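Charles's point about why 2 nodes can't give HA with quorum reduces to majority arithmetic: with a strict-majority quorum ratio (>50%, which is my understanding of gluster's server-quorum default - verify against the linked docs), any 2 of 3 nodes keep quorum, but with only 2 nodes the loss of either one drops below a majority. A sketch of that arithmetic:

```shell
#!/bin/sh
# Sketch: server-quorum arithmetic - quorum holds while the alive nodes
# form a strict majority of all peers (>50%).
quorum_ok() {   # usage: quorum_ok <alive> <total>
    alive=$1; total=$2
    # strict majority: 2*alive > total
    if [ $((2 * alive)) -gt "$total" ]; then
        echo "quorum held ($alive/$total up)"
    else
        echo "quorum LOST ($alive/$total up) - writes are refused"
    fi
}

quorum_ok 2 3   # replica 3 (or replica 2 + arbiter): one node down, still ok
quorum_ok 1 2   # plain 2-node setup: losing either node loses quorum
```

This is why the thread keeps recommending replica 3 or replica 2 + arbiter: the third vote (even a metadata-only arbiter) is what lets the cluster survive one node failure without halting writes.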
Re: [ovirt-users] ERROR [org.ovirt.vdsm.jsonrpc.client.reactors.Reactor] (SSL Stomp Reactor) [] Unable to process messages General SSLEngine problem
Gary,

Looking at your engine log I see this: "Unable to process messages General SSLEngine problem". It means that you have an issue with establishing a secure connection. In order to understand more details about your failure, please set the log level to debug by doing [1]. Once you enable it, please provide more information on why the engine fails to talk to vdsm.

Thanks,
Piotr

[1] http://www.ovirt.org/develop/developer-guide/engine/engine-development-environment/#enable-debug-log---restart-required

On Fri, Sep 1, 2017 at 10:40 PM, Gary Balliet wrote:

> Good day all.
>
> Just playing with ovirt. New to it, but it seems quite good.
>
> Single instance / nfs share / centos7 / ovirt 4.1
>
> Had a power outage, and this error message is in my logs whilst trying to
> activate a downed host. The snippet below is from engine.log.
>
> 2017-09-01 13:32:03,092-07 INFO
> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
> [] Connecting to /192.168.1.147
> 2017-09-01 13:32:03,097-07 ERROR
> [org.ovirt.vdsm.jsonrpc.client.reactors.Reactor] (SSL Stomp Reactor) []
> Unable to process messages General SSLEngine problem
> 2017-09-01 13:32:04,547-07 ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
> (DefaultQuartzScheduler5) [77a871f9-4947-46c9-977f-db5f76cac358] Command
> 'GetAllVmStatsVDSCommand(HostName = DellServer,
> VdsIdVDSCommandParametersBase:{runAsync='true',
> hostId='b8ceb86f-c4e1-4bbd-afad-5044ebe9eddd'})' execution failed:
> VDSGenericException: VDSNetworkException: General SSLEngine problem
> 2017-09-01 13:32:04,547-07 INFO
> [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher]
> (DefaultQuartzScheduler5) [77a871f9-4947-46c9-977f-db5f76cac358] Failed to
> fetch vms info for host 'DellServer' - skipping VMs monitoring.
> 2017-09-01 13:32:19,548-07 INFO
> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
> [] Connecting to /192.168.1.147
> 2017-09-01 13:32:19,552-07 ERROR
> [org.ovirt.vdsm.jsonrpc.client.reactors.Reactor] (SSL Stomp Reactor) []
> Unable to process messages General SSLEngine problem
> 2017-09-01 13:32:23,115-07 ERROR
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (DefaultQuartzScheduler4) [77a871f9-4947-46c9-977f-db5f76cac358] EVENT_ID:
> VDS_BROKER_COMMAND_FAILURE(10,802), Correlation ID: null, Call Stack: null,
> Custom Event ID: -1, Message: VDSM DellServer command GetCapabilitiesVDS
> failed: General SSLEngine problem
> 2017-09-01 13:32:23,115-07 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
> (DefaultQuartzScheduler4) [77a871f9-4947-46c9-977f-db5f76cac358] Command
> 'org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand' return
> value 'org.ovirt.engine.core.vdsbroker.vdsbroker.VDSInfoReturn@65b16430'
> 2017-09-01 13:32:23,115-07 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
> (DefaultQuartzScheduler4) [77a871f9-4947-46c9-977f-db5f76cac358] HostName =
> DellServer
> 2017-09-01 13:32:23,116-07 ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand]
> (DefaultQuartzScheduler4) [77a871f9-4947-46c9-977f-db5f76cac358] Command
> 'GetCapabilitiesVDSCommand(HostName = DellServer,
> VdsIdAndVdsVDSCommandParametersBase:{runAsync='true',
> hostId='b8ceb86f-c4e1-4bbd-afad-5044ebe9eddd',
> vds='Host[DellServer,b8ceb86f-c4e1-4bbd-afad-5044ebe9eddd]'})' execution
> failed: VDSGenericException: VDSNetworkException: General SSLEngine problem
> 2017-09-01 13:32:23,116-07 ERROR
> [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring]
> (DefaultQuartzScheduler4) [77a871f9-4947-46c9-977f-db5f76cac358] Failure to
> refresh host 'DellServer' runtime info: VDSGenericException:
> VDSNetworkException: General SSLEngine problem
> 2017-09-01 13:32:26,118-07 INFO
> [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor)
> [] Connecting to /192.168.1.147
> 2017-09-01 13:32:26,122-07 ERROR
> [org.ovirt.vdsm.jsonrpc.client.reactors.Reactor] (SSL Stomp Reactor) []
> Unable to process messages General SSLEngine problem
> 2017-09-01 13:32:39,550-07 ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand]
> (DefaultQuartzScheduler1) [77a871f9-4947-46c9-977f-db5f76cac358] Command
> 'GetAllVmStatsVDSCommand(HostName = DellServer,
> VdsIdVDSCommandParametersBase:{runAsync='true',
> hostId='b8ceb86f-c4e1-4bbd-afad-5044ebe9eddd'})' execution failed:
> VDSGenericException: VDSNetworkException: General SSLEngine problem
> 2017-09-01 13:32:39,551-07 INFO
> [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher]
> (DefaultQuartzScheduler1) [77a871f9-4947-46c9-977f-db5f76cac358] Failed to
> fetch vms info for host 'DellServer' - skipping VMs monitoring.
> 2017-09-01 13:32:46,125-07 ERROR
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (DefaultQuartzScheduler7) [77a871f9-4947-46c9-977f-db5f76cac358] EVENT_ID:
> VDS_BROKER_COMMAND_FAILURE(10,802), Correlation ID
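Beyond the debug logging Piotr suggests, one plausible thing to rule out after a power outage is an expired VDSM certificate or a badly skewed host clock - either can surface as exactly this generic "General SSLEngine problem". The cert path below is the usual VDSM location but may differ per install, and the expiry-check helper is an illustration, not an official oVirt procedure:

```shell
#!/bin/sh
# Sketch: decide whether a certificate's notAfter date is already in the
# past. In real use the date string would come from something like:
#   openssl x509 -noout -enddate -in /etc/pki/vdsm/certs/vdsmcert.pem
# (path is the usual VDSM location - verify on your install).
cert_status() {   # usage: cert_status "notAfter=<date>"
    end=${1#notAfter=}
    end_epoch=$(date -d "$end" +%s)   # GNU date required for -d parsing
    now_epoch=$(date +%s)
    if [ "$end_epoch" -le "$now_epoch" ]; then
        echo "EXPIRED"
    else
        echo "valid"
    fi
}

cert_status "notAfter=Jan  1 00:00:00 2020 GMT"
```

If the certificate turns out to be expired, re-enrolling the host's certificates from the engine (or via host reinstall) is the usual remedy; also compare `date` on engine and host, since TLS validation fails when clocks disagree badly.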
Re: [ovirt-users] failed upgrade oVirt node 4.1.3 -> 4.1.5
Hi,

Seems to be a bug that was resolved here: https://gerrit.ovirt.org/c/80716/

Thanks,
Yuval.

On Fri, Sep 1, 2017 at 3:55 PM, Matthias Leopold <
matthias.leop...@meduniwien.ac.at> wrote:

> Hi,
>
> I'm sorry to write to this list again, but I failed to upgrade a freshly
> installed oVirt Node from version 4.1.3 to 4.1.5. It seems to be a SELinux
> related problem. I'm attaching imgbased.log + relevant lines from
> engine.log.
>
> Is the skipped version (4.1.4) the problem?
> Can I force an upgrade to version 4.1.4?
>
> Thx
> Matthias
Re: [ovirt-users] Install failed when adding host in Ovirt
On Fri, Sep 1, 2017 at 12:08 PM, Khoi Thinh wrote:

> Hi everyone,
> I have a question regarding hosts in oVirt. Is it possible to add a
> host which is registered in a different data-center?

No. What exactly is the use case? A host can be in one data-center under one Engine, one cluster, etc.

Y.

> --
> *Khoi Thinh*
[ovirt-users] Cannot delete the snapshot
Dear all,

This is my first time asking a question here, so thank you for taking the time to read it.

Yesterday, I wanted to delete a snapshot via the oVirt Engine Web Administration. However, the deletion has been taking very long and is still in progress. Due to this snapshot error, I cannot run the virtual machine, which I had shut down beforehand. Does anyone have an idea how to fix it?

The following information may be useful for debugging.

On the web administration:

Removing Snapshot    Sep 1, 2017 6:06:58 PM    N/A
Validating           Sep 1, 2017 6:06:58 PM    until Sep 1, 2017 6:06:58 PM
Executing            Sep 1, 2017 6:06:58 PM    N/A

In the engine log:

2017-09-02 20:57:10,700+08 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler6) [70471886-a5cb-404c-80db-b8469092aa8e] Command 'RemoveSnapshot' (id: '78ee1e62-26ea-403d-b518-9e3d3be995c2') waiting on child command id: '5f2e8197-704f-453c-bea8-74186e5ca95c' type:'RemoveSnapshotSingleDiskLive' to complete
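One way to tell whether the removal is progressing or genuinely stuck is to follow the RemoveSnapshot command id from the engine log line quoted above: as long as fresh "waiting on child command" lines keep appearing, the live merge is still running rather than hung. A minimal sketch of that filter (the engine.log path shown in the comment is the usual location; the sample input is the quoted line):

```shell
#!/bin/sh
# Sketch: count how often a RemoveSnapshot command id is still reported as
# waiting on its child in engine.log. Sample input is the line quoted in
# the mail; in real use something like:
#   grep '78ee1e62' /var/log/ovirt-engine/engine.log
count_waiting() {   # usage: count_waiting <command-id> < logfile
    grep -c "id: '$1'.*waiting on child command"
}

count_waiting '78ee1e62-26ea-403d-b518-9e3d3be995c2' <<'EOF'
2017-09-02 20:57:10,700+08 INFO [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (DefaultQuartzScheduler6) [70471886-a5cb-404c-80db-b8469092aa8e] Command 'RemoveSnapshot' (id: '78ee1e62-26ea-403d-b518-9e3d3be995c2') waiting on child command id: '5f2e8197-704f-453c-bea8-74186e5ca95c' type:'RemoveSnapshotSingleDiskLive' to complete
EOF
```

If the count stops growing over, say, half an hour while the snapshot stays in "Removing", it is worth collecting the matching vdsm.log entries from the SPM host before attempting anything more drastic.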