[ovirt-users] Re: hyperconverged single node with SSD cache fails gluster creation
On Wed, Sep 4, 2019 at 9:27 PM wrote:

> I am seeing more successes than failures at creating single and triple node
> hyperconverged setups after some weeks of experimentation, so I am branching
> out to additional features: in this case, the ability to use SSDs as cache
> media for hard disks.
>
> I tried first with a single node that combined caching and compression, and
> that fails during the creation of LVMs.
>
> I tried again without the VDO compression, but the results were identical,
> whilst VDO compression without the LV cache worked ok.
>
> I tried various combinations, using less space etc., but the results are
> always the same and unfortunately rather cryptic (I substituted the physical
> disk label with {disklabel}):
>
> TASK [gluster.infra/roles/backend_setup : Extend volume group] *
> failed: [{hostname}] (item={u'vgname': u'gluster_vg_{disklabel}p1',
> u'cachethinpoolname': u'gluster_thinpool_gluster_vg_{disklabel}p1',
> u'cachelvname': u'cachelv_gluster_thinpool_gluster_vg_{disklabel}p1',
> u'cachedisk': u'/dev/sda4', u'cachemetalvname':
> u'cache_gluster_thinpool_gluster_vg_{disklabel}p1', u'cachemode':
> u'writeback', u'cachemetalvsize': u'70G', u'cachelvsize': u'630G'}) =>
> {"ansible_loop_var": "item", "changed": false, "err": " Physical volume
> \"/dev/mapper/vdo_{disklabel}p1\" still in use\n", "item": {"cachedisk":
> "/dev/sda4", "cachelvname":
> "cachelv_gluster_thinpool_gluster_vg_{disklabel}p1", "cachelvsize": "630G",
> "cachemetalvname": "cache_gluster_thinpool_gluster_vg_{disklabel}p1",
> "cachemetalvsize": "70G", "cachemode": "writeback", "cachethinpoolname":
> "gluster_thinpool_gluster_vg_{disklabel}p1", "vgname":
> "gluster_vg_{disklabel}p1"}, "msg": "Unable to reduce
> gluster_vg_{disklabel}p1 by /dev/dm-15.", "rc": 5}
>
> Somewhere within that I see something that points to a race condition
> ("still in use").
> Unfortunately I have not been able to pinpoint the raw logs which are used
> at that stage, and I wasn't able to obtain more info.
>
> At this point quite a bit of storage setup is already done, so rolling
> back for a clean new attempt can be a bit complicated, with reboots to
> reconcile the kernel with the data on disk.
>
> I don't actually believe it's related to single node, and I'd be quite
> happy to move the creation of the SSD cache to a later stage, but in a VDO
> setup this looks slightly complex to someone without intimate knowledge of
> LVs-with-cache-and-perhaps-thin/VDO/Gluster all thrown into one.
>
> Needless to say, the feature set (SSD caching & compressed dedup) sounds
> terribly attractive, but when things don't just work, it's more terrifying.

Hi Thomas,

This is about the way we have to write the variables for Ansible 2.8 while
setting up cache. Currently we are writing something like this:

gluster_infra_cache_vars:
  - vgname: vg_sdb2
    cachedisk: /dev/sdb3
    cachelvname: cachelv_thinpool_vg_sdb2
    cachethinpoolname: thinpool_vg_sdb2
    cachelvsize: '10G'
    cachemetalvsize: '2G'
    cachemetalvname: cache_thinpool_vg_sdb2
    cachemode: writethrough

Note that cachedisk is provided as /dev/sdb3, with which the volume group
vg_sdb2 would be extended; this works well, and the module takes care of
extending the VG with /dev/sdb3.

However, with Ansible 2.8 we cannot provide it like this, but have to be more
explicit and mention the PV underlying this volume group vg_sdb2. So, with
respect to 2.8, we have to write that variable like:

gluster_infra_cache_vars:
  - vgname: vg_sdb2
    cachedisk: '/dev/sdb2,/dev/sdb3'
    cachelvname: cachelv_thinpool_vg_sdb2
    cachethinpoolname: thinpool_vg_sdb2
    cachelvsize: '10G'
    cachemetalvsize: '2G'
    cachemetalvname: cache_thinpool_vg_sdb2
    cachemode: writethrough

Note that I have mentioned both /dev/sdb2 and /dev/sdb3. This change is
backward compatible, that is, it works with 2.7 as well. I have raised an
issue with Ansible as well.
Which can be found here: https://github.com/ansible/ansible/issues/56501

However, @olafbuitelaar has fixed this in gluster-ansible-infra, and the patch
is merged in master. If you can check out the master branch, you should be
fine.

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/WU23D3OS4TLTX3R4FYRJC4NA6HRGF4C7/
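When a run dies mid-way like this, it can help to see what state LVM was left in before retrying. A hypothetical diagnostic sketch (the VG name is the placeholder from the log above, and /dev/dm-15 is the node named in the error; adjust to your host):

```shell
# Which physical volumes exist, and which VG each already belongs to:
pvs -o pv_name,vg_name

# Logical volumes in the affected volume group:
lvs gluster_vg_{disklabel}p1

# Inspect the device-mapper node reported as "still in use":
dmsetup info /dev/dm-15
```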
[ovirt-users] Re: Disk latency is very high, time taken to copy 1M file is > 10s
Hi,

On Wed, Jul 3, 2019 at 6:22 PM Sachidananda URS wrote:
> Hi,
>
> On Wed, Jul 3, 2019 at 3:59 PM PS Kazi wrote:
>> Hi,
>> I am using an HDD: Toshiba 7200 RPM, data transfer rate 150MB/s,
>> interface 6Gb/s.
>> But the hyper-converged configuration stopped with the error msg: Disk
>> latency is very high, time taken to copy 1M file is > 10s.
>> Please help me to resolve this error.
>>
>> TASK [gluster.features/roles/gluster_hci : Check if time taken to copy 1M
>> file (512B chunks) is < 10s] ***
>> failed: [ov-node-1.hci.com -> ov-node-1.hci.com] (item=11.9508) =>
>> {"ansible_loop_var": "item", "changed": false, "item": {"ansible_loop_var":
>> "item", "changed": true, "cmd": "dd if=/dev/zero
>> of=/mnt/tmp/engine/small.file bs=512 count=2050 oflag=direct 2>&1 |\n awk
>> '/copied/{print $(NF-3)}'\n", "delta": "0:00:11.966178", "end": "2019-07-03
>> 15:28:00.162101", "failed": false, "invocation": {"module_args":
>> {"_raw_params": "dd if=/dev/zero of=/mnt/tmp/engine/small.file bs=512
>> count=2050 oflag=direct 2>&1 |\n awk '/copied/{print $(NF-3)}'\n",
>> "_uses_shell": true, "argv": null, "chdir": null, "creates": null,
>> "executable": null, "removes": null, "stdin": null, "stdin_add_newline":
>> true, "strip_empty_ends": true, "warn": true}}, "item": "/mnt/tmp/engine",
>> "rc": 0, "start": "2019-07-03 15:27:48.195923", "stderr": "",
>> "stderr_lines": [], "stdout": "11.9508", "stdout_lines": ["11.9508"]},
>> "msg": "Disk latency is very high, time taken to copy 1M file is > 10s"}
>
> First, the solution: please skip this test with --skip-tags latencycheck.
> For example:
> ansible-playbook -i --skip-tags latencycheck
>
> This check was added to ensure that disk latency would be tested.
>
> I have removed this check in this PR:
> https://github.com/gluster/gluster-ansible-features/pull/33/commits/3e52de0c2c548a7890302a9838e224cef6699586
> Will make it available once it is reviewed and merged.

I have done a new build.
Please update the gluster-ansible-features rpm from:
https://copr.fedorainfracloud.org/coprs/sac/gluster-ansible/build/956485/
and try again?

-sac
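For reference, the workaround from the thread as a complete command line; INVENTORY and PLAYBOOK are placeholders, since the original message elides the actual file names:

```shell
# Run the HCI deployment playbook while skipping the disk-latency check.
ansible-playbook -i INVENTORY PLAYBOOK --skip-tags latencycheck
```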
[ovirt-users] Re: Disk latency is very high, time taken to copy 1M file is > 10s
Hi,

On Wed, Jul 3, 2019 at 3:59 PM PS Kazi wrote:
> Hi,
> I am using an HDD: Toshiba 7200 RPM, data transfer rate 150MB/s,
> interface 6Gb/s.
> But the hyper-converged configuration stopped with the error msg: Disk
> latency is very high, time taken to copy 1M file is > 10s.
> Please help me to resolve this error.
>
> TASK [gluster.features/roles/gluster_hci : Check if time taken to copy 1M
> file (512B chunks) is < 10s] ***
> failed: [ov-node-1.hci.com -> ov-node-1.hci.com] (item=11.9508) =>
> {"ansible_loop_var": "item", "changed": false, "item": {"ansible_loop_var":
> "item", "changed": true, "cmd": "dd if=/dev/zero
> of=/mnt/tmp/engine/small.file bs=512 count=2050 oflag=direct 2>&1 |\n awk
> '/copied/{print $(NF-3)}'\n", "delta": "0:00:11.966178", "end": "2019-07-03
> 15:28:00.162101", "failed": false, "invocation": {"module_args":
> {"_raw_params": "dd if=/dev/zero of=/mnt/tmp/engine/small.file bs=512
> count=2050 oflag=direct 2>&1 |\n awk '/copied/{print $(NF-3)}'\n",
> "_uses_shell": true, "argv": null, "chdir": null, "creates": null,
> "executable": null, "removes": null, "stdin": null, "stdin_add_newline":
> true, "strip_empty_ends": true, "warn": true}}, "item": "/mnt/tmp/engine",
> "rc": 0, "start": "2019-07-03 15:27:48.195923", "stderr": "",
> "stderr_lines": [], "stdout": "11.9508", "stdout_lines": ["11.9508"]},
> "msg": "Disk latency is very high, time taken to copy 1M file is > 10s"}

First, the solution: please skip this test with --skip-tags latencycheck.
For example: ansible-playbook -i --skip-tags latencycheck

This check was added to ensure that disk latency would be tested.

I have removed this check in this PR:
https://github.com/gluster/gluster-ansible-features/pull/33/commits/3e52de0c2c548a7890302a9838e224cef6699586
Will make it available once it is reviewed and merged.
-sac
[ovirt-users] Re: 4.3.4 caching disk error during hyperconverged deployment
On Thu, Jun 13, 2019 at 7:11 AM wrote:

> While trying to do a hyperconverged setup and trying to use "configure LV
> Cache" with /dev/sdf, the deployment fails. If I don't use the LV cache SSD
> disk, the setup succeeds; thought you might want to know. For now I retested
> with 4.3.3 and all worked fine, so reverting to 4.3.3 unless you know of a
> workaround?
>
> Error:
> TASK [gluster.infra/roles/backend_setup : Extend volume group] *
> failed: [vmm11.mydomain.com] (item={u'vgname': u'gluster_vg_sdb',
> u'cachethinpoolname': u'gluster_thinpool_gluster_vg_sdb', u'cachelvname':
> u'cachelv_gluster_thinpool_gluster_vg_sdb', u'cachedisk': u'/dev/sdf',
> u'cachemetalvname': u'cache_gluster_thinpool_gluster_vg_sdb', u'cachemode':
> u'writethrough', u'cachemetalvsize': u'0.1G', u'cachelvsize': u'0.9G'}) =>
> {"ansible_loop_var": "item", "changed": false, "err": " Physical volume
> \"/dev/sdb\" still in use\n", "item": {"cachedisk": "/dev/sdf",
> "cachelvname": "cachelv_gluster_thinpool_gluster_vg_sdb", "cachelvsize":
> "0.9G", "cachemetalvname": "cache_gluster_thinpool_gluster_vg_sdb",
> "cachemetalvsize": "0.1G", "cachemode": "writethrough",
> "cachethinpoolname": "gluster_thinpool_gluster_vg_sdb", "vgname":

The variable file does not seem to be right. You have mentioned
cachethinpoolname: gluster_thinpool_gluster_vg_sdb, but you are not creating
it anywhere. So the Ansible module is trying to shrink the volume group.

Also, why is cachelvsize 0.9G and cachemetalvsize 0.1G? Isn't that too small?

Please refer to
https://github.com/gluster/gluster-ansible/blob/master/playbooks/hc-ansible-deployment/gluster_inventory.yml
for an example.
-sac > > "gluster_vg_sdb"}, "msg": "Unable to reduce gluster_vg_sdb by /dev/sdb.", > "rc": 5} > > failed: [vmm12.mydomain.com] (item={u'vgname': u'gluster_vg_sdb', > u'cachethinpoolname': u'gluster_thinpool_gluster_vg_sdb', u'cachelvname': > u'cachelv_gluster_thinpool_gluster_vg_sdb', u'cachedisk': u'/dev/sdf', > u'cachemetalvname': u'cache_gluster_thinpool_gluster_vg_sdb', u'cachemode': > u'writethrough', u'cachemetalvsize': u'0.1G', u'cachelvsize': u'0.9G'}) => > {"ansible_loop_var": "item", "changed": false, "err": " Physical volume > \"/dev/sdb\" still in use\n", "item": {"cachedisk": "/dev/sdf", > "cachelvname": "cachelv_gluster_thinpool_gluster_vg_sdb", "cachelvsize": > "0.9G", "cachemetalvname": "cache_gluster_thinpool_gluster_vg_sdb", > "cachemetalvsize": "0.1G", "cachemode": "writethrough", > "cachethinpoolname": "gluster_thinpool_gluster_vg_sdb", "vgname": > "gluster_vg_sdb"}, "msg": "Unable to reduce gluster_vg_sdb by /dev/sdb.", > "rc": 5} > > failed: [vmm10.mydomain.com] (item={u'vgname': u'gluster_vg_sdb', > u'cachethinpoolname': u'gluster_thinpool_gluster_vg_sdb', u'cachelvname': > u'cachelv_gluster_thinpool_gluster_vg_sdb', u'cachedisk': u'/dev/sdf', > u'cachemetalvname': u'cache_gluster_thinpool_gluster_vg_sdb', u'cachemode': > u'writethrough', u'cachemetalvsize': u'30G', u'cachelvsize': u'270G'}) => > {"ansible_loop_var": "item", "changed": false, "err": " Physical volume > \"/dev/sdb\" still in use\n", "item": {"cachedisk": "/dev/sdf", > "cachelvname": "cachelv_gluster_thinpool_gluster_vg_sdb", "cachelvsize": > "270G", "cachemetalvname": "cache_gluster_thinpool_gluster_vg_sdb", > "cachemetalvsize": "30G", "cachemode": "writethrough", "cachethinpoolname": > "gluster_thinpool_gluster_vg_sdb", "vgname": "gluster_vg_sdb"}, "msg": > "Unable to reduce gluster_vg_sdb by /dev/sdb.", "rc": 5} > > PLAY RECAP > * > vmm10.mydomain.com : ok=13 changed=4unreachable=0 > failed=1skipped=10 rescued=0ignored=0 > vmm11.mydomain.com : ok=13 changed=4unreachable=0 > 
failed=1skipped=10 rescued=0ignored=0 > vmm12.mydomain.com : ok=13 changed=4unreachable=0 > failed=1skipped=10 rescued=0ignored=0 > > > > > - > #cat /etc/ansible/hc_wizard_inventory.yml > > - > hc_nodes: > hosts: > vmm10.mydomain.com: > gluster_infra_volume_groups: > - vgname: gluster_vg_sdb > pvname: /dev/sdb > - vgname: gluster_vg_sdc > pvname: /dev/sdc > - vgname: gluster_vg_sdd > pvname: /dev/sdd > - vgname: gluster_vg_sde > pvname: /dev/sde > gluster_infra_mount_devices: > - path: /gluster_bricks/engine > lvname: gluster_lv_engine > vgname: gluster_vg_sdb > - path: /gluster_bricks/vmstore1 > lvname: gluster_lv_vmstore1 > vgname: gluster_vg_sdc > - path: /gluster_bricks/data1 > lvname: gluster_lv_data1 > vgname: gluster_vg_sdd > - path: /gluster_bricks/dat
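Per sac's pointer to the example inventory, the thinpool named in the cache variables has to be created as well, via the thinpool variables. A hedged sketch of how the two sections might fit together; the names and sizes are illustrative, modeled on the linked gluster_inventory.yml, so verify against the current gluster-ansible-infra variable names before use:

```yaml
gluster_infra_thinpools:
  - vgname: gluster_vg_sdb
    thinpoolname: gluster_thinpool_gluster_vg_sdb
    poolmetadatasize: 3G

gluster_infra_cache_vars:
  - vgname: gluster_vg_sdb
    cachedisk: /dev/sdf
    # Must match a thinpool defined above, otherwise the module
    # ends up trying to shrink the volume group:
    cachethinpoolname: gluster_thinpool_gluster_vg_sdb
    cachelvname: cachelv_gluster_thinpool_gluster_vg_sdb
    cachelvsize: 270G
    cachemetalvname: cache_gluster_thinpool_gluster_vg_sdb
    cachemetalvsize: 30G
    cachemode: writethrough
```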
[ovirt-users] Re: Gluster Deployment Failed - No Medium Found
Hi Stephen,

On Mon, Jun 3, 2019 at 3:57 PM wrote:
> Good Morning,
>
> I'm completely new to this and I'm testing setting up a Gluster
> environment with oVirt. However, my deployment keeps failing and I don't
> understand what the error means. Any assistance would be much appreciated.
> Please see the error below...
>
> Error Message
>
> TASK [gluster.infra/roles/backend_setup : Create volume groups]
> failed: [ov1.test1.lan] (item={u'vgname': u'gluster_vg_sdb', u'pvname':
> u'/dev/sdb'}) => {"changed": false, "err": " /dev/sdb: open failed: No
> medium found\n Device /dev/sdb excluded by a filter.\n", "item":
> {"pvname": "/dev/sdb", "vgname": "gluster_vg_sdb"}, "msg": "Creating
> physical volume '/dev/sdb' failed", "rc": 5}

One of the reasons for this is that the device could have existing partition
table information. Can you please run the command:

wipefs -a /dev/sdb

on all the nodes and try again?

-sachi
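Since wipefs -a is destructive, it may be worth listing what is actually on the disk first. A sketch (double-check the device name before running the second command):

```shell
# Read-only: list filesystem / partition-table signatures found on the disk.
wipefs /dev/sdb

# Destructive: erase all signatures so LVM can create a physical volume.
wipefs -a /dev/sdb
```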
[ovirt-users] Re: oVirt node loses gluster volume UUID after reboot, goes to emergency mode every time I reboot.
On Mon, May 27, 2019 at 9:41 AM wrote:

> I made them manually. First created the LVM drives, then the VDO devices,
> then the gluster volumes.

In that case you must add these mount options
(inode64,noatime,nodiratime,_netdev,x-systemd.device-timeout=0,x-systemd.requires=vdo.service)
manually into fstab. gluster-ansible would have added them if you had done an
end-to-end deployment or declared the necessary variables.

-sac
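A sketch of what one such fstab entry might look like for a VDO-backed brick; the device path and mount point are illustrative, while the options are exactly the ones listed above:

```
/dev/mapper/vdo_sdb  /gluster_bricks/engine  xfs  inode64,noatime,nodiratime,_netdev,x-systemd.device-timeout=0,x-systemd.requires=vdo.service  0 0
```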
[ovirt-users] Re: oVirt node loses gluster volume UUID after reboot, goes to emergency mode every time I reboot.
On Wed, May 22, 2019 at 11:26 AM Sahina Bose wrote: > +Sachidananda URS > > On Wed, May 22, 2019 at 1:14 AM wrote: > >> I'm sorry, i'm still working on my linux knowledge, here is the output of >> my blkid on one of the servers: >> >> /dev/nvme0n1: PTTYPE="dos" >> /dev/nvme1n1: PTTYPE="dos" >> /dev/mapper/eui.6479a71892882020: PTTYPE="dos" >> /dev/mapper/eui.0025385881b40f60: PTTYPE="dos" >> /dev/mapper/eui.6479a71892882020p1: >> UUID="pfJiP3-HCgP-gCyQ-UIzT-akGk-vRpV-aySGZ2" TYPE="LVM2_member" >> /dev/mapper/eui.0025385881b40f60p1: >> UUID="Q0fyzN-9q0s-WDLe-r0IA-MFY0-tose-yzZeu2" TYPE="LVM2_member" >> >> /dev/mapper/Samsung_SSD_850_EVO_1TB_S21CNXAG615134H: PTTYPE="dos" >> /dev/mapper/Samsung_SSD_850_EVO_1TB_S21CNXAG615134H1: >> UUID="lQrtPt-nx0u-P6Or-f2YW-sN2o-jK9I-gp7P2m" TYPE="LVM2_member" >> /dev/mapper/vg_gluster_ssd-lv_gluster_ssd: >> UUID="890feffe-c11b-4c01-b839-a5906ab39ecb" TYPE="vdo" >> /dev/mapper/vg_gluster_nvme1-lv_gluster_nvme1: >> UUID="7049fd2a-788d-44cb-9dc5-7b4c0ee309fb" TYPE="vdo" >> /dev/mapper/vg_gluster_nvme2-lv_gluster_nvme2: >> UUID="2c541b70-32c5-496e-863f-ea68b50e7671" TYPE="vdo" >> /dev/mapper/vdo_gluster_ssd: UUID="e59a68d5-2b73-487a-ac5e-409e11402ab5" >> TYPE="xfs" >> /dev/mapper/vdo_gluster_nvme1: >> UUID="d5f53f17-bca1-4cb9-86d5-34a468c062e7" TYPE="xfs" >> /dev/mapper/vdo_gluster_nvme2: >> UUID="40a41b5f-be87-4994-b6ea-793cdfc076a4" TYPE="xfs" >> >> #2 >> /dev/nvme0n1: PTTYPE="dos" >> /dev/nvme1n1: PTTYPE="dos" >> /dev/mapper/eui.6479a71892882020: PTTYPE="dos" >> /dev/mapper/eui.6479a71892882020p1: >> UUID="GiBSqT-JJ3r-Tn3X-lzCr-zW3D-F3IE-OpE4Ga" TYPE="LVM2_member" >> /dev/mapper/nvme.126f-324831323230303337383138-4144415441205358383030304e50-0001: >> PTTYPE="dos" >> /dev/sda: PTTYPE="gpt" >> /dev/mapper/nvme.126f-324831323230303337383138-4144415441205358383030304e50-0001p1: >> UUID="JBhj79-Uk0E-DdLE-Ibof-VwBq-T5nZ-F8d57O" TYPE="LVM2_member" >> /dev/sdb: PTTYPE="dos" >> /dev/mapper/Samsung_SSD_860_EVO_1TB_S3Z8NB0K843638B: PTTYPE="dos" 
>> /dev/mapper/Samsung_SSD_860_EVO_1TB_S3Z8NB0K843638B1: >> UUID="6yp5YM-D1be-M27p-AEF5-w1pv-uXNF-2vkiJZ" TYPE="LVM2_member" >> /dev/mapper/vg_gluster_ssd-lv_gluster_ssd: >> UUID="9643695c-0ace-4cba-a42c-3f337a7d5133" TYPE="vdo" >> /dev/mapper/vg_gluster_nvme2-lv_gluster_nvme2: >> UUID="79f5bacc-cbe7-4b67-be05-414f68818f41" TYPE="vdo" >> /dev/mapper/vg_gluster_nvme1-lv_gluster_nvme1: >> UUID="2438a550-5fb4-48f4-a5ef-5cff5e7d5ba8" TYPE="vdo" >> /dev/mapper/vdo_gluster_ssd: UUID="5bb67f61-9d14-4d0b-8aa4-ae3905276797" >> TYPE="xfs" >> /dev/mapper/vdo_gluster_nvme1: >> UUID="732f939c-f133-4e48-8dc8-c9d21dbc0853" TYPE="xfs" >> /dev/mapper/vdo_gluster_nvme2: >> UUID="f55082ca-1269-4477-9bf8-7190f1add9ef" TYPE="xfs" >> >> #3 >> /dev/nvme1n1: UUID="8f1dc44e-f35f-438a-9abc-54757fd7ef32" TYPE="vdo" >> /dev/nvme0n1: PTTYPE="dos" >> /dev/mapper/nvme.c0a9-313931304531454644323630-4354353030503153534438-0001: >> UUID="8f1dc44e-f35f-438a-9abc-54757fd7ef32" TYPE="vdo" >> /dev/mapper/eui.6479a71892882020: PTTYPE="dos" >> /dev/mapper/eui.6479a71892882020p1: >> UUID="FwBRJJ-ofHI-1kHq-uEf1-H3Fn-SQcw-qWYvmL" TYPE="LVM2_member" >> /dev/sda: PTTYPE="gpt" >> /dev/mapper/Samsung_SSD_850_EVO_1TB_S2RENX0J302798A: PTTYPE="gpt" >> /dev/mapper/Samsung_SSD_850_EVO_1TB_S2RENX0J302798A1: >> UUID="weCmOq-VZ1a-Itf5-SOIS-AYLp-Ud5N-S1H2bR" TYPE="LVM2_member" >> PARTUUID="920ef5fd-e525-4cf0-99d5-3951d3013c19" >> /dev/mapper/vg_gluster_ssd-lv_gluster_ssd: >> UUID="fbaffbde-74f0-4e4a-9564-64ca84398cde" TYPE="vdo" >> /dev/mapper/vg_gluster_nvme2-lv_gluster_nvme2: >> UUID="ae0bd2ad-7da9-485b-824a-72038571c5ba" TYPE="vdo" >> /dev/mapper/vdo_gluster_ssd: UUID="f0f56784-bc71-46c7-8bfe-6b71327c87c9" >> TYPE="xfs" >> /dev/mapper/vdo_gluster_nvme1: >> UUID="0ddc1180-f228-4209-82f1-1607a46aed1f" TYPE="xfs" >> /dev
[ovirt-users] Re: Scale out ovirt 4.3 (from 3 to 6 or 9 nodes) with hyperconverged setup and Gluster
On Tue, May 21, 2019 at 9:00 PM Adrian Quintero wrote:

> Sac,
>
> 6. Started the hyperconverged setup wizard and added
> "gluster_features_force_varlogsizecheck: false" to the "vars:" section
> of the generated Ansible inventory (/etc/ansible/hc_wizard_inventory.yml),
> as it was complaining about the /var/log messages LV.

In the upcoming release I plan to remove this check, since we will go ahead
with logrotate.

> EUREKA: After doing the above I was able to get past the filter issues;
> however, I am still concerned whether after a reboot the disks might come up
> differently. For example, /dev/sdb might come up as /dev/sdx...

Even this shouldn't be a problem going forward, since we will use UUIDs to
mount the devices, and a device name change shouldn't matter.

Thanks for your feedback, I will see how we can improve the install
experience.

-sac
[ovirt-users] Re: Scale out ovirt 4.3 (from 3 to 6 or 9 nodes) with hyperconverged setup and Gluster
On Tue, May 21, 2019 at 12:16 PM Sahina Bose wrote:
>
> On Mon, May 20, 2019 at 9:55 PM Adrian Quintero wrote:
>> Sahina,
>> Yesterday I started with a fresh install. I completely wiped all the
>> disks and recreated the arrays from within the controller of our DL380
>> Gen 9's.
>>
>> OS: RAID 1 (2x600GB HDDs): /dev/sda // using the oVirt Node 4.3.3.1 ISO
>> engine and VMSTORE1: JBOD (1x3TB HDD): /dev/sdb
>> DATA1: JBOD (1x3TB HDD): /dev/sdc
>> DATA2: JBOD (1x3TB HDD): /dev/sdd
>> Caching disk: JBOD (1x440GB SSD): /dev/sde
>>
>> After the OS install on the first 3 servers and setting up ssh keys, I
>> started the hyperconverged deploy process:
>> 1. Logged in to the first server http://host1.example.com:9090
>> 2. Selected Hyperconverged, clicked on "Run Gluster Wizard"
>> 3. Followed the wizard steps (Hosts, FQDNs, Packages, Volumes, Bricks,
>> Review)
>> Hosts/FQDNs:
>> host1.example.com
>> host2.example.com
>> host3.example.com
>> Packages:
>> Volumes:
>> engine:replicate:/gluster_bricks/engine/engine
>> vmstore1:replicate:/gluster_bricks/vmstore1/vmstore1
>> data1:replicate:/gluster_bricks/data1/data1
>> data2:replicate:/gluster_bricks/data2/data2
>> Bricks:
>> engine:/dev/sdb:100GB:/gluster_bricks/engine
>> vmstore1:/dev/sdb:2600GB:/gluster_bricks/vmstrore1
>> data1:/dev/sdc:2700GB:/gluster_bricks/data1
>> data2:/dev/sdd:2700GB:/gluster_bricks/data2
>> LV Cache:
>> /dev/sde:400GB:writethrough
>> 4. After I hit deploy on the last step of the wizard, that is when I get
>> the disk filter error.
>> TASK [gluster.infra/roles/backend_setup : Create volume groups] >> >> failed: [vmm10.virt.iad3p] (item={u'vgname': u'gluster_vg_sdb', >> u'pvname': u'/dev/sdb'}) => {"changed": false, "err": " Device /dev/sdb >> excluded by a filter.\n", "item": {"pvname": "/dev/sdb", "vgname": >> "gluster_vg_sdb"}, "msg": "Creating physical volume '/dev/sdb' failed", >> "rc": 5} >> failed: [vmm12.virt.iad3p] (item={u'vgname': u'gluster_vg_sdb', >> u'pvname': u'/dev/sdb'}) => {"changed": false, "err": " Device /dev/sdb >> excluded by a filter.\n", "item": {"pvname": "/dev/sdb", "vgname": >> "gluster_vg_sdb"}, "msg": "Creating physical volume '/dev/sdb' failed", >> "rc": 5} >> failed: [vmm11.virt.iad3p] (item={u'vgname': u'gluster_vg_sdb', >> u'pvname': u'/dev/sdb'}) => {"changed": false, "err": " Device /dev/sdb >> excluded by a filter.\n", "item": {"pvname": "/dev/sdb", "vgname": >> "gluster_vg_sdb"}, "msg": "Creating physical volume '/dev/sdb' failed", >> "rc": 5} >> failed: [vmm12.virt.iad3p] (item={u'vgname': u'gluster_vg_sdc', >> u'pvname': u'/dev/sdc'}) => {"changed": false, "err": " Device /dev/sdc >> excluded by a filter.\n", "item": {"pvname": "/dev/sdc", "vgname": >> "gluster_vg_sdc"}, "msg": "Creating physical volume '/dev/sdc' failed", >> "rc": 5} >> failed: [vmm10.virt.iad3p] (item={u'vgname': u'gluster_vg_sdc', >> u'pvname': u'/dev/sdc'}) => {"changed": false, "err": " Device /dev/sdc >> excluded by a filter.\n", "item": {"pvname": "/dev/sdc", "vgname": >> "gluster_vg_sdc"}, "msg": "Creating physical volume '/dev/sdc' failed", >> "rc": 5} >> failed: [vmm11.virt.iad3p] (item={u'vgname': u'gluster_vg_sdc', >> u'pvname': u'/dev/sdc'}) => {"changed": false, "err": " Device /dev/sdc >> excluded by a filter.\n", "item": {"pvname": "/dev/sdc", "vgname": >> "gluster_vg_sdc"}, "msg": "Creating physical volume '/dev/sdc' failed", >> "rc": 5} >> failed: [vmm10.virt.iad3p] (item={u'vgname': u'gluster_vg_sdd', >> u'pvname': u'/dev/sdd
[ovirt-users] Re: oVirt node loses gluster volume UUID after reboot, goes to emergency mode every time I reboot.
On Mon, May 20, 2019 at 11:58 AM Sahina Bose wrote:
> Adding Sachi
>
> On Thu, May 9, 2019 at 2:01 AM wrote:
>> This only started to happen with oVirt Node 4.3; 4.2 didn't have the
>> issue. Since I updated to 4.3, on every reboot the host goes into
>> emergency mode. The first few times this happened I re-installed the OS
>> from scratch, but after some digging I found out that the drives it mounts
>> in /etc/fstab cause the problem, specifically these mounts. All three are
>> single drives; one is an SSD and the other 2 are individual NVMe drives.
>>
>> UUID=732f939c-f133-4e48-8dc8-c9d21dbc0853 /gluster_bricks/storage_nvme1 auto defaults 0 0
>> UUID=5bb67f61-9d14-4d0b-8aa4-ae3905276797 /gluster_bricks/storage_ssd auto defaults 0 0
>> UUID=f55082ca-1269-4477-9bf8-7190f1add9ef /gluster_bricks/storage_nvme2 auto defaults 0 0
>>
>> In order to get the host to actually boot, I have to go to the console,
>> delete those mounts, reboot, and then re-add them, and they end up with
>> new UUIDs. All of these hosts rebooted reliably in 4.2 and earlier, but
>> all versions of 4.3 have this same problem (I keep updating in the hope
>> the issue is fixed).

Hello Michael,

I need your help in resolving this. I would like to understand if the
environment is affecting something. What is the output of:

# blkid /dev/vgname/lvname

for the three bricks you have? And also, what is the error you see when you
run the commands:

# mount /gluster_bricks/storage_nvme1
# mount /gluster_bricks/storage_ssd

Also, can you please attach your variable file and playbook? In my setup
things work fine, which is making it difficult for me to fix.

-sac
[ovirt-users] Re: ovirt-node 4.2 iso - hyperconverged wizard doesn't write gdeployConfig settings
Hi,

On Thu, Feb 7, 2019 at 9:27 AM Sahina Bose wrote:
> +Sachidananda URS to review user request about systemd mount files
>
> On Tue, Feb 5, 2019 at 10:22 PM feral wrote:
> >
> > Using SystemD makes way more sense to me. I was just trying to use
> > ovirt-node as it was ... intended? Mainly because I have no idea how it
> > all works yet, so I've been trying to do the most stockish deployment
> > possible, following deployment instructions and not thinking I'm smarter
> > than the software :p.
> > I've given up on 4.2 for now, as 4.3 was just released, so giving that a
> > try now. Will report back. Hopefully 4.3 enlists systemd for stuff?

Unless we have a really complicated mount setup, it is better to use fstab.
We had certain difficulties while using VDO; maybe unit files help for such
cases? However, the systemd.mount(5) manpage suggests that the preferred way
of configuring mounts is /etc/fstab.

src: https://manpages.debian.org/jessie/systemd/systemd.mount.5.en.html#/ETC/FSTAB

/ETC/FSTAB
Mount units may either be configured via unit files, or via /etc/fstab (see
fstab(5) for details). Mounts listed in /etc/fstab will be converted into
native units dynamically at boot and when the configuration of the system
manager is reloaded. In general, configuring mount points through /etc/fstab
is the preferred approach. See systemd-fstab-generator(8) for details about
the conversion.

> On Tue, Feb 5, 2019 at 4:33 AM Strahil Nikolov wrote:
>>
>> Dear Feral,
>>
>> > On that note, have you also had issues with gluster not restarting on
>> > reboot, as well as all of the HA stuff failing on reboot after power
>> > loss? Thus far, the only way I've got the cluster to come back to life
>> > is to manually restart glusterd on all nodes, then put the cluster back
>> > into "not maintenance" mode, and then manually start the hosted-engine
>> > VM. This also fails after 2 or 3 power losses, even though the entire
>> > cluster is happy through the first 2.
>>
>> About gluster not starting - use systemd.mount unit files.
>> Here is my setup, and for now it works:
>>
>> [root@ovirt2 yum.repos.d]# systemctl cat gluster_bricks-engine.mount
>> # /etc/systemd/system/gluster_bricks-engine.mount
>> [Unit]
>> Description=Mount glusterfs brick - ENGINE
>> Requires = vdo.service
>> After = vdo.service
>> Before = glusterd.service
>> Conflicts = umount.target
>>
>> [Mount]
>> What=/dev/mapper/gluster_vg_md0-gluster_lv_engine
>> Where=/gluster_bricks/engine
>> Type=xfs
>> Options=inode64,noatime,nodiratime
>>
>> [Install]
>> WantedBy=glusterd.service
>>
>> [root@ovirt2 yum.repos.d]# systemctl cat gluster_bricks-engine.automount
>> # /etc/systemd/system/gluster_bricks-engine.automount
>> [Unit]
>> Description=automount for gluster brick ENGINE
>>
>> [Automount]
>> Where=/gluster_bricks/engine
>>
>> [Install]
>> WantedBy=multi-user.target
>>
>> [root@ovirt2 yum.repos.d]# systemctl cat glusterd
>> # /etc/systemd/system/glusterd.service
>> [Unit]
>> Description=GlusterFS, a clustered file-system server
>> Requires=rpcbind.service gluster_bricks-engine.mount gluster_bricks-data.mount gluster_bricks-isos.mount
>> After=network.target rpcbind.service gluster_bricks-engine.mount gluster_bricks-data.mount gluster_bricks-isos.mount
>> Before=network-online.target
>>
>> [Service]
>> Type=forking
>> PIDFile=/var/run/glusterd.pid
>> LimitNOFILE=65536
>> Environment="LOG_LEVEL=INFO"
>> EnvironmentFile=-/etc/sysconfig/glusterd
>> ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL $GLUSTERD_OPTIONS
>> KillMode=process
>> SuccessExitStatus=15
>>
>> [Install]
>> WantedBy=multi-user.target
>>
>> # /etc/systemd/system/glusterd.service.d/99-cpu.conf
>> [Service]
>> CPUAccounting=yes
>> Slice=glusterfs.slice
>>
>> Best Regards,
>> Strahil Nikolov
>
> --
> _
> Fact:
> 1.
Ninjas are mammals.
> 2. Ninjas fight ALL the time.
> 3. The purpose of the ninja is to flip out and kill people.
[ovirt-users] Re: Deploying single instance - error
On Thu, Jan 31, 2019 at 12:48 PM Strahil Nikolov wrote:

> Hi All,
>
> I have managed to fix this by reinstalling the gdeploy package. Yet, it
> still asks for the "Disckount" section - but as the fix has not been rolled
> out for CentOS yet, this is expected.

Until the CentOS team includes the package, you can provide the diskcount as
a workaround. It will not be used anyway (no side effects).

-sac
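In a gdeploy configuration file, the diskcount is a section of its own. A sketch of the workaround; the value is illustrative, and with JBOD it is ignored anyway, as noted above:

```
[diskcount]
12
```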
[ovirt-users] Re: Deploying single instance - error
On Thu, Jan 31, 2019 at 8:01 AM Strahil Nikolov wrote:

> Hey Guys/Gals,
>
> did you update gdeploy for CentOS?

gdeploy is updated for Fedora; for CentOS the packages will be updated shortly, we are testing them.

However, this issue you are facing where RAID is selected over JBOD is strange. Gobinda will look into this, and might need more details.

> It seems to not be working - now it doesn't honour the whole cockpit
> wizard. Instead of JBOD it selects raid6, instead of md0 it uses sdb, etc.
>
> [root@ovirt1 ~]# gdeploy --version
> gdeploy 2.0.2
> [root@ovirt1 ~]# rpm -qa gdeploy
> gdeploy-2.0.8-1.el7.noarch
>
> Note: This is a fresh install.
>
> Best Regards,
> Strahil Nikolov

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/544UURCG4NBBIGYN6L7FCTOJO3IBIVK7/
[ovirt-users] Re: Deploying single instance - error
Hi David,

On Mon, Jan 28, 2019 at 5:01 PM Gobinda Das wrote:

> Hi David,
> Thanks! Adding sac to check if we are missing anything for gdeploy.
>
> On Mon, Jan 28, 2019 at 4:33 PM Leo David wrote:
>
>> Hi Gobinda,
>> gdeploy --version
>> gdeploy 2.0.2
>>
>> yum list installed | grep gdeploy
>> gdeploy.noarch    2.0.8-1.el7    installed

Ramakrishna will build a Fedora package to include that fix. Should be available to you in some time. Will keep you posted.

-sac

List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/5ZCWCEQKHJ5BSMLUBQD7O5STO6LGUPUG/
Re: [ovirt-users] Hosted Engine Setup with the gluster bricks on the same disk as the OS
Hi,

On Thu, May 18, 2017 at 7:08 PM, Sahina Bose wrote:

> On Thu, May 18, 2017 at 3:20 PM, Mike DePaulo wrote:
>
>> Well, I tried both of the following:
>> 1. Having only a boot partition and a PV for the OS that does not take
>> up the entire disk, and then specifying "sda" in Hosted Engine Setup.
>> 2. Having not only a boot partition and a PV for the OS, but also an
>> empty (and not formatted) /dev/sda3 PV that I created with fdisk.
>> Then, specifying "sda3" in Hosted Engine Setup.
>>
>> Both attempts resulted in errors like this:
>> failed: [centerpoint.ad.depaulo.org] (item=/dev/sda3) => {"failed":
>> true, "failed_when_result": true, "item": "/dev/sda3", "msg": "
>> Device /dev/sda3 not found (or ignored by filtering).\n", "rc": 5}
>
> Can you provide gdeploy logs? I think it's at ~/.gdeploy/gdeploy.log
>
>> It seems like having gluster bricks on the same disk as the OS doesn't
>> work at all.

Hi, /dev/sda3 should work; the error here is possibly due to a leftover filesystem signature. Can you please set wipefs=yes? For example:

[pv]
action=create
wipefs=yes
devices=/dev/sda3

-sac

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
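What gdeploy's wipefs=yes does can be demonstrated safely on a scratch file instead of a real partition (same mechanics, nothing destructive; assumes the util-linux wipefs and e2fsprogs mke2fs tools are installed, and the /tmp path is just an example):

```shell
# Create a scratch "device" and put an ext2 signature on it - the same kind
# of leftover that makes LVM report "Device ... not found (or ignored by
# filtering)" when creating a PV.
truncate -s 16M /tmp/fakedev.img
mke2fs -q -F /tmp/fakedev.img
wipefs /tmp/fakedev.img        # lists the ext2 signature that would block pvcreate
wipefs --all /tmp/fakedev.img  # clears it, as gdeploy's wipefs=yes does
wipefs /tmp/fakedev.img        # now prints nothing: signatures are gone
```

Running wipefs without --all first is a useful dry run on a real disk, since it only reports signatures without touching them.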
Re: [ovirt-users] gdeploy error
Hi,

On Wed, Feb 15, 2017 at 2:30 PM, Ramesh Nachimuthu wrote:

> + Sac,
>
> - Original Message -
> > From: "Sandro Bonazzola"
> > To: "Ishmael Tsoaela" , "Ramesh Nachimuthu" <rnach...@redhat.com>
> > Cc: "users"
> > Sent: Wednesday, February 15, 2017 1:52:26 PM
> > Subject: Re: [ovirt-users] gdeploy error
> >
> > On Tue, Feb 14, 2017 at 3:52 PM, Ishmael Tsoaela wrote:
> >
> > > Hi,
> > >
> > > I am a new sysadmin trying to install glusterfs using gdeploy. I
> > > started with a simple script to enable a service (ntpd).
> > >
> > > gdeploy --version
> > > gdeploy 2.0.1
> > >
> > > [root@ovirt1 gdeploy]# cat ntp.conf
> > > [hosts]
> > > ovirt1
> > >
> > > [service1]
> > > action=enable
> > > service=ntpd
> > >
> > > [service2]
> > > action=start
> > > service=ntpd
> > >
> > > The issue is that gdeploy returns an error:
> > > fatal: [ovirt1]: FAILED! => {"failed": true, "msg": "module (setup) is
> > > missing interpreter line"}
> > >
> > > Is there a simple way to debug or figure out how to fix this error?

This is because of a conflicting module. I think there is another service.py in the Python library path. Can you please find that and move it momentarily? Just to ensure that is the case.

-sac
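One quick way to check sac's theory - whether something named service.py is importable ahead of Ansible's own module - is to ask the interpreter where "import service" would resolve (a sketch; the 2017-era hosts ran Python 2, where the equivalent would be imp.find_module, but the idea is the same):

```python
import importlib.util

# If this prints a path, that file shadows any same-named module a tool like
# gdeploy/Ansible expects, and is the likely culprit to move aside.
spec = importlib.util.find_spec("service")
if spec is None:
    print("no conflicting 'service' module on sys.path")
else:
    print("'service' would load from:", spec.origin)
```

Moving the reported file out of the library path (as suggested above) and re-running gdeploy confirms or rules out the conflict.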