[ovirt-users] Re: oVirt node loses gluster volume UUID after reboot, goes to emergency mode every time I reboot.

2019-05-21 Thread Sahina Bose
+Sachidananda URS 

On Wed, May 22, 2019 at 1:14 AM  wrote:

> I'm sorry, I'm still working on my Linux knowledge; here is the output of
> my blkid on one of the servers:
>
> /dev/nvme0n1: PTTYPE="dos"
> /dev/nvme1n1: PTTYPE="dos"
> /dev/mapper/eui.6479a71892882020: PTTYPE="dos"
> /dev/mapper/eui.0025385881b40f60: PTTYPE="dos"
> /dev/mapper/eui.6479a71892882020p1:
> UUID="pfJiP3-HCgP-gCyQ-UIzT-akGk-vRpV-aySGZ2" TYPE="LVM2_member"
> /dev/mapper/eui.0025385881b40f60p1:
> UUID="Q0fyzN-9q0s-WDLe-r0IA-MFY0-tose-yzZeu2" TYPE="LVM2_member"
>
> /dev/mapper/Samsung_SSD_850_EVO_1TB_S21CNXAG615134H: PTTYPE="dos"
> /dev/mapper/Samsung_SSD_850_EVO_1TB_S21CNXAG615134H1:
> UUID="lQrtPt-nx0u-P6Or-f2YW-sN2o-jK9I-gp7P2m" TYPE="LVM2_member"
> /dev/mapper/vg_gluster_ssd-lv_gluster_ssd:
> UUID="890feffe-c11b-4c01-b839-a5906ab39ecb" TYPE="vdo"
> /dev/mapper/vg_gluster_nvme1-lv_gluster_nvme1:
> UUID="7049fd2a-788d-44cb-9dc5-7b4c0ee309fb" TYPE="vdo"
> /dev/mapper/vg_gluster_nvme2-lv_gluster_nvme2:
> UUID="2c541b70-32c5-496e-863f-ea68b50e7671" TYPE="vdo"
> /dev/mapper/vdo_gluster_ssd: UUID="e59a68d5-2b73-487a-ac5e-409e11402ab5"
> TYPE="xfs"
> /dev/mapper/vdo_gluster_nvme1: UUID="d5f53f17-bca1-4cb9-86d5-34a468c062e7"
> TYPE="xfs"
> /dev/mapper/vdo_gluster_nvme2: UUID="40a41b5f-be87-4994-b6ea-793cdfc076a4"
> TYPE="xfs"
>
> #2
> /dev/nvme0n1: PTTYPE="dos"
> /dev/nvme1n1: PTTYPE="dos"
> /dev/mapper/eui.6479a71892882020: PTTYPE="dos"
> /dev/mapper/eui.6479a71892882020p1:
> UUID="GiBSqT-JJ3r-Tn3X-lzCr-zW3D-F3IE-OpE4Ga" TYPE="LVM2_member"
> /dev/mapper/nvme.126f-324831323230303337383138-4144415441205358383030304e50-0001:
> PTTYPE="dos"
> /dev/sda: PTTYPE="gpt"
> /dev/mapper/nvme.126f-324831323230303337383138-4144415441205358383030304e50-0001p1:
> UUID="JBhj79-Uk0E-DdLE-Ibof-VwBq-T5nZ-F8d57O" TYPE="LVM2_member"
> /dev/sdb: PTTYPE="dos"
> /dev/mapper/Samsung_SSD_860_EVO_1TB_S3Z8NB0K843638B: PTTYPE="dos"
> /dev/mapper/Samsung_SSD_860_EVO_1TB_S3Z8NB0K843638B1:
> UUID="6yp5YM-D1be-M27p-AEF5-w1pv-uXNF-2vkiJZ" TYPE="LVM2_member"
> /dev/mapper/vg_gluster_ssd-lv_gluster_ssd:
> UUID="9643695c-0ace-4cba-a42c-3f337a7d5133" TYPE="vdo"
> /dev/mapper/vg_gluster_nvme2-lv_gluster_nvme2:
> UUID="79f5bacc-cbe7-4b67-be05-414f68818f41" TYPE="vdo"
> /dev/mapper/vg_gluster_nvme1-lv_gluster_nvme1:
> UUID="2438a550-5fb4-48f4-a5ef-5cff5e7d5ba8" TYPE="vdo"
> /dev/mapper/vdo_gluster_ssd: UUID="5bb67f61-9d14-4d0b-8aa4-ae3905276797"
> TYPE="xfs"
> /dev/mapper/vdo_gluster_nvme1: UUID="732f939c-f133-4e48-8dc8-c9d21dbc0853"
> TYPE="xfs"
> /dev/mapper/vdo_gluster_nvme2: UUID="f55082ca-1269-4477-9bf8-7190f1add9ef"
> TYPE="xfs"
>
> #3
> /dev/nvme1n1: UUID="8f1dc44e-f35f-438a-9abc-54757fd7ef32" TYPE="vdo"
> /dev/nvme0n1: PTTYPE="dos"
> /dev/mapper/nvme.c0a9-313931304531454644323630-4354353030503153534438-0001:
> UUID="8f1dc44e-f35f-438a-9abc-54757fd7ef32" TYPE="vdo"
> /dev/mapper/eui.6479a71892882020: PTTYPE="dos"
> /dev/mapper/eui.6479a71892882020p1:
> UUID="FwBRJJ-ofHI-1kHq-uEf1-H3Fn-SQcw-qWYvmL" TYPE="LVM2_member"
> /dev/sda: PTTYPE="gpt"
> /dev/mapper/Samsung_SSD_850_EVO_1TB_S2RENX0J302798A: PTTYPE="gpt"
> /dev/mapper/Samsung_SSD_850_EVO_1TB_S2RENX0J302798A1:
> UUID="weCmOq-VZ1a-Itf5-SOIS-AYLp-Ud5N-S1H2bR" TYPE="LVM2_member"
> PARTUUID="920ef5fd-e525-4cf0-99d5-3951d3013c19"
> /dev/mapper/vg_gluster_ssd-lv_gluster_ssd:
> UUID="fbaffbde-74f0-4e4a-9564-64ca84398cde" TYPE="vdo"
> /dev/mapper/vg_gluster_nvme2-lv_gluster_nvme2:
> UUID="ae0bd2ad-7da9-485b-824a-72038571c5ba" TYPE="vdo"
> /dev/mapper/vdo_gluster_ssd: UUID="f0f56784-bc71-46c7-8bfe-6b71327c87c9"
> TYPE="xfs"
> /dev/mapper/vdo_gluster_nvme1: UUID="0ddc1180-f228-4209-82f1-1607a46aed1f"
> TYPE="xfs"
> /dev/mapper/vdo_gluster_nvme2: UUID="bcb7144a-6ce0-4b3f-9537-f465c46d4843"
> TYPE="xfs"
>
> I don't have any errors on mount until I reboot, and once I reboot it
> takes ~6hrs for everything to work 100% since I have to delete the mount
> entries out of fstab for the 3 gluster volumes and reboot.  I'd rather
> wait until the next update to do that.
>
> I don't have a variable file or playbook since I made the storage
> manually; I stopped using the playbook since at that point I couldn't
> enable RDMA or over-provision the disks correctly unless I made them
> manually.  But as I said, this is something in 4.3.3: if I go back to
> 4.3.2 I can reboot with no problem.
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/EDGJVIYPMHN5HYARBNCN36NRSTKMSLLW/
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/

[ovirt-users] Re: [ovirt-announce] Re: [ANN] oVirt 4.3.4 First Release Candidate is now available

2019-05-21 Thread Krutika Dhananjay
On Tue, May 21, 2019 at 8:13 PM Strahil  wrote:

> Dear Krutika,
>
> Yes I did but I use 6 ports (1 gbit/s each) and this is the reason that
> reads get slower.
> Do you know a way to force gluster to open more connections (client to
> server & server to server)?
>

The idea was explored some time back here:
https://review.gluster.org/c/glusterfs/+/19133
but some issues were identified with the approach, so it
had to be dropped.

-Krutika

> Thanks for the detailed explanation.
>
> Best Regards,
> Strahil Nikolov
> On May 21, 2019 08:36, Krutika Dhananjay  wrote:
>
> So in our internal tests (with nvme ssd drives, 10g n/w), we found read
> performance to be better with choose-local
> disabled in hyperconverged setup.  See
> https://bugzilla.redhat.com/show_bug.cgi?id=1566386 for more information.
>
> With choose-local off, the read replica is chosen randomly (based on hash
> value of the gfid of that shard).
> And when it is enabled, the reads always go to the local replica.
> We attributed better performance with the option disabled to bottlenecks
> in gluster's rpc/socket layer. Imagine all read
> requests lined up to be sent over the same mount-to-brick connection as
> opposed to (nearly) randomly getting distributed
> over three (because replica count = 3) such connections.
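
For reference, the option described above can be toggled per volume with the
standard gluster CLI; the volume name below is only illustrative (it is taken
from later in this thread):

  gluster volume set data_fast cluster.choose-local off   # spread reads across all replicas
  gluster volume set data_fast cluster.choose-local on    # serve reads from the local brick
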
>
> Did you run any tests that indicate "choose-local=on" is giving better
> read perf as opposed to when it's disabled?
>
> -Krutika
>
> On Sun, May 19, 2019 at 5:11 PM Strahil Nikolov 
> wrote:
>
> Ok,
>
> so it seems that Darell's case and mine are different as I use vdo.
>
> Now I have destroyed Storage Domains, gluster volumes and vdo and
> recreated again (4 gluster volumes on a single vdo).
> This time vdo has '--emulate512=true' and no issues have been observed.
>
> Gluster volume options before 'Optimize for virt':
>
> Volume Name: data_fast
> Type: Replicate
> Volume ID: 378804bf-2975-44d8-84c2-b541aa87f9ef
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: gluster1:/gluster_bricks/data_fast/data_fast
> Brick2: gluster2:/gluster_bricks/data_fast/data_fast
> Brick3: ovirt3:/gluster_bricks/data_fast/data_fast (arbiter)
> Options Reconfigured:
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: off
> cluster.enable-shared-storage: enable
>
> Gluster volume after 'Optimize for virt':
>
> Volume Name: data_fast
> Type: Replicate
> Volume ID: 378804bf-2975-44d8-84c2-b541aa87f9ef
> Status: Stopped
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: gluster1:/gluster_bricks/data_fast/data_fast
> Brick2: gluster2:/gluster_bricks/data_fast/data_fast
> Brick3: ovirt3:/gluster_bricks/data_fast/data_fast (arbiter)
> Options Reconfigured:
> network.ping-timeout: 30
> performance.strict-o-direct: on
> storage.owner-gid: 36
> storage.owner-uid: 36
> server.event-threads: 4
> client.event-threads: 4
> cluster.choose-local: off
> user.cifs: off
> features.shard: on
> cluster.shd-wait-qlength: 1
> cluster.shd-max-threads: 8
> cluster.locking-scheme: granular
> cluster.data-self-heal-algorithm: full
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> cluster.eager-lock: enable
> network.remote-dio: off
> performance.low-prio-threads: 32
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> transport.address-family: inet
> nfs.disable: on
> performance.client-io-threads: on
> cluster.enable-shared-storage: enable
>
> After that adding the volumes as storage domains (via UI) worked without
> any issues.
>
> Can someone clarify why we have now 'cluster.choose-local: off' when in
> oVirt 4.2.7 (gluster v3.12.15) we didn't have that ?
> I'm using storage that is faster than network and reading from local brick
> gives very high read speed.
>
> Best Regards,
> Strahil Nikolov
>
>
>
> On Sunday, 19 May 2019, 9:47:27 AM G…
>
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/OHWZ7Y3T7QKP6CVCC34KDOFSXVILJ332/


[ovirt-users] Re: Supporting comments on ovirt-site Blog section

2019-05-21 Thread aadilkhan9409
The postings on your site are always excellent 
https://enablecookieswindows10.com/ Thanks for the great share and keep up this 
great work
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/PWNYFDGXCXIMTOWGVYOXRFFI7BQMEU2Z/


[ovirt-users] Re: Scale out ovirt 4.3 (from 3 to 6 or 9 nodes) with hyperconverged setup and Gluster

2019-05-21 Thread Sachidananda URS
On Tue, May 21, 2019 at 9:00 PM Adrian Quintero 
wrote:

> Sac,
>
> 6.-started the hyperconverged setup wizard and added*
> "gluster_features_force_varlogsizecheck: false"* to the "vars:" section
> on the  Generated Ansible inventory :
> */etc/ansible/hc_wizard_inventory.yml* file as it was complaining about
> /var/log messages LV.
>

In the upcoming release I plan to remove this check, since we will go ahead
with logrotate.


>
> *EUREKA: *After doing the above I was able to get past the filter issues,
> however I am still concerned if during a reboot the disks might come up
> differently. For example /dev/sdb might come up as /dev/sdx...
>
>
Even this shouldn't be a problem going forward, since we will use UUID to
mount the devices.
And the device name change shouldn't matter.
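
For illustration, a UUID-based mount removes any dependence on the /dev/sdX
name. A minimal sketch, assuming an XFS brick filesystem (the LV path and UUID
are placeholders; the mount point follows the brick layout mentioned in this
thread):

  blkid /dev/gluster_vg_sdb/gluster_lv_engine
  # /etc/fstab entry referencing the filesystem UUID instead of a device name
  UUID=0a1b2c3d-1111-2222-3333-444455556666  /gluster_bricks/engine  xfs  inode64,noatime  0 0
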

Thanks for your feedback, I will see how we can improve the install
experience.

-sac
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/EKAHNGN74NFAUNMZ7RITPNEXVDATW2Y3/


[ovirt-users] Gluster rebuild: request suggestions (poor IO performance)

2019-05-21 Thread Jim Kusznir
Hi:

I've been having one heck of a time with disk IO performance for nearly the
entire time I've been running oVirt.  I've tried a variety of things,
I've posted to this list for help several times, and it sounds like in most
cases the problems are due to design decisions and such.

My cluster has been devolving into nearly unusable performance, and I
believe it's mostly disk IO related.  I'm currently using FreeNAS as my
primary VM storage (via NFS), but now it too is performing slowly (it
started out reasonably, but slowly devolved for unknown reasons).

I'm ready to switch back to gluster if I can get specific recommendations
as to what I need to do to make it work.  I feel like I've been trying
random things, and sinking money into this to try and make it work, but
nothing has really fixed the problem.

I have 3 Dell R610 servers with 750GB SSDs as their primary drive.  I had
used some Seagate SSHDs behind the internal Dell DRAC RAID controller (which
had been configured to pass them through as a single-disk volume, but still
wasn't really JBOD), but it started silently failing them and causing
major issues for gluster.  I think the DRAC just doesn't like those HDDs.

I can put some real spinning disks in; perhaps a RAID-1 pair of 2TB?  These
servers only take 2.5" hdd's, so that greatly limits my options.

I'm sure others out there are using Dell R610 servers... what do you use
for storage?  How does it perform?  What do I need to do to get this
cluster actually usable again?  Are PERC-6i storage controllers usable?
I'm not even sure where to start troubleshooting now... everything is so
slow.

BTW: I had a small data volume on the SSDs, and the gluster performance on
those was pretty poor.  Performance of the hosted engine is still pretty
poor, and it is still on the SSDs.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/IGR3RDAKQYXSPGAQCHWS5SGKOYA4QKJY/


[ovirt-users] Re: oVirt node loses gluster volume UUID after reboot, goes to emergency mode every time I reboot.

2019-05-21 Thread Strahil Nikolov
Do you use VDO? If yes, consider setting up systemd ".mount" units, as this is
the only way to set up the dependencies.
Best Regards,
Strahil Nikolov
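
A minimal sketch of such a unit, assuming an XFS brick on top of a VDO volume
(the device path matches the blkid output quoted below; the unit name must
mirror the mount point, and both are only examples):

  # /etc/systemd/system/gluster_bricks-engine.mount
  [Unit]
  Description=Gluster brick on a VDO volume
  Requires=vdo.service
  After=vdo.service

  [Mount]
  What=/dev/mapper/vdo_gluster_ssd
  Where=/gluster_bricks/engine
  Type=xfs
  Options=inode64,noatime

  [Install]
  WantedBy=multi-user.target

Enable it with "systemctl enable gluster_bricks-engine.mount" and drop the
corresponding fstab line so the two definitions don't conflict.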

On Tuesday, 21 May 2019, 22:44:06 GMT+3, mich...@wanderingmad.com
wrote:
 
 I'm sorry, I'm still working on my Linux knowledge; here is the output of my 
blkid on one of the servers:

/dev/nvme0n1: PTTYPE="dos"
/dev/nvme1n1: PTTYPE="dos"
/dev/mapper/eui.6479a71892882020: PTTYPE="dos"
/dev/mapper/eui.0025385881b40f60: PTTYPE="dos"
/dev/mapper/eui.6479a71892882020p1: 
UUID="pfJiP3-HCgP-gCyQ-UIzT-akGk-vRpV-aySGZ2" TYPE="LVM2_member"
/dev/mapper/eui.0025385881b40f60p1: 
UUID="Q0fyzN-9q0s-WDLe-r0IA-MFY0-tose-yzZeu2" TYPE="LVM2_member"

/dev/mapper/Samsung_SSD_850_EVO_1TB_S21CNXAG615134H: PTTYPE="dos"
/dev/mapper/Samsung_SSD_850_EVO_1TB_S21CNXAG615134H1: 
UUID="lQrtPt-nx0u-P6Or-f2YW-sN2o-jK9I-gp7P2m" TYPE="LVM2_member"
/dev/mapper/vg_gluster_ssd-lv_gluster_ssd: 
UUID="890feffe-c11b-4c01-b839-a5906ab39ecb" TYPE="vdo"
/dev/mapper/vg_gluster_nvme1-lv_gluster_nvme1: 
UUID="7049fd2a-788d-44cb-9dc5-7b4c0ee309fb" TYPE="vdo"
/dev/mapper/vg_gluster_nvme2-lv_gluster_nvme2: 
UUID="2c541b70-32c5-496e-863f-ea68b50e7671" TYPE="vdo"
/dev/mapper/vdo_gluster_ssd: UUID="e59a68d5-2b73-487a-ac5e-409e11402ab5" 
TYPE="xfs"
/dev/mapper/vdo_gluster_nvme1: UUID="d5f53f17-bca1-4cb9-86d5-34a468c062e7" 
TYPE="xfs"
/dev/mapper/vdo_gluster_nvme2: UUID="40a41b5f-be87-4994-b6ea-793cdfc076a4" 
TYPE="xfs"

#2
/dev/nvme0n1: PTTYPE="dos"
/dev/nvme1n1: PTTYPE="dos"
/dev/mapper/eui.6479a71892882020: PTTYPE="dos"
/dev/mapper/eui.6479a71892882020p1: 
UUID="GiBSqT-JJ3r-Tn3X-lzCr-zW3D-F3IE-OpE4Ga" TYPE="LVM2_member"
/dev/mapper/nvme.126f-324831323230303337383138-4144415441205358383030304e50-0001:
 PTTYPE="dos"
/dev/sda: PTTYPE="gpt"
/dev/mapper/nvme.126f-324831323230303337383138-4144415441205358383030304e50-0001p1:
 UUID="JBhj79-Uk0E-DdLE-Ibof-VwBq-T5nZ-F8d57O" TYPE="LVM2_member"
/dev/sdb: PTTYPE="dos"
/dev/mapper/Samsung_SSD_860_EVO_1TB_S3Z8NB0K843638B: PTTYPE="dos"
/dev/mapper/Samsung_SSD_860_EVO_1TB_S3Z8NB0K843638B1: 
UUID="6yp5YM-D1be-M27p-AEF5-w1pv-uXNF-2vkiJZ" TYPE="LVM2_member"
/dev/mapper/vg_gluster_ssd-lv_gluster_ssd: 
UUID="9643695c-0ace-4cba-a42c-3f337a7d5133" TYPE="vdo"
/dev/mapper/vg_gluster_nvme2-lv_gluster_nvme2: 
UUID="79f5bacc-cbe7-4b67-be05-414f68818f41" TYPE="vdo"
/dev/mapper/vg_gluster_nvme1-lv_gluster_nvme1: 
UUID="2438a550-5fb4-48f4-a5ef-5cff5e7d5ba8" TYPE="vdo"
/dev/mapper/vdo_gluster_ssd: UUID="5bb67f61-9d14-4d0b-8aa4-ae3905276797" 
TYPE="xfs"
/dev/mapper/vdo_gluster_nvme1: UUID="732f939c-f133-4e48-8dc8-c9d21dbc0853" 
TYPE="xfs"
/dev/mapper/vdo_gluster_nvme2: UUID="f55082ca-1269-4477-9bf8-7190f1add9ef" 
TYPE="xfs"

#3
/dev/nvme1n1: UUID="8f1dc44e-f35f-438a-9abc-54757fd7ef32" TYPE="vdo"
/dev/nvme0n1: PTTYPE="dos"
/dev/mapper/nvme.c0a9-313931304531454644323630-4354353030503153534438-0001: 
UUID="8f1dc44e-f35f-438a-9abc-54757fd7ef32" TYPE="vdo"
/dev/mapper/eui.6479a71892882020: PTTYPE="dos"
/dev/mapper/eui.6479a71892882020p1: 
UUID="FwBRJJ-ofHI-1kHq-uEf1-H3Fn-SQcw-qWYvmL" TYPE="LVM2_member"
/dev/sda: PTTYPE="gpt"
/dev/mapper/Samsung_SSD_850_EVO_1TB_S2RENX0J302798A: PTTYPE="gpt"
/dev/mapper/Samsung_SSD_850_EVO_1TB_S2RENX0J302798A1: 
UUID="weCmOq-VZ1a-Itf5-SOIS-AYLp-Ud5N-S1H2bR" TYPE="LVM2_member" 
PARTUUID="920ef5fd-e525-4cf0-99d5-3951d3013c19"
/dev/mapper/vg_gluster_ssd-lv_gluster_ssd: 
UUID="fbaffbde-74f0-4e4a-9564-64ca84398cde" TYPE="vdo"
/dev/mapper/vg_gluster_nvme2-lv_gluster_nvme2: 
UUID="ae0bd2ad-7da9-485b-824a-72038571c5ba" TYPE="vdo"
/dev/mapper/vdo_gluster_ssd: UUID="f0f56784-bc71-46c7-8bfe-6b71327c87c9" 
TYPE="xfs"
/dev/mapper/vdo_gluster_nvme1: UUID="0ddc1180-f228-4209-82f1-1607a46aed1f" 
TYPE="xfs"
/dev/mapper/vdo_gluster_nvme2: UUID="bcb7144a-6ce0-4b3f-9537-f465c46d4843" 
TYPE="xfs"

I don't have any errors on mount until I reboot, and once I reboot it takes 
~6hrs for everything to work 100% since I have to delete the mount entries out 
of fstab for the 3 gluster volumes and reboot.  I'd rather wait until the next 
update to do that.

I don't have a variable file or playbook since I made the storage manually; I 
stopped using the playbook since at that point I couldn't enable RDMA or 
over-provision the disks correctly unless I made them manually.  But as I said, 
this is something in 4.3.3: if I go back to 4.3.2 I can reboot with no problem.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/EDGJVIYPMHN5HYARBNCN36NRSTKMSLLW/
  ___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/

[ovirt-users] Re: oVirt node loses gluster volume UUID after reboot, goes to emergency mode every time I reboot.

2019-05-21 Thread michael
I'm sorry, I'm still working on my Linux knowledge; here is the output of my 
blkid on one of the servers:

/dev/nvme0n1: PTTYPE="dos"
/dev/nvme1n1: PTTYPE="dos"
/dev/mapper/eui.6479a71892882020: PTTYPE="dos"
/dev/mapper/eui.0025385881b40f60: PTTYPE="dos"
/dev/mapper/eui.6479a71892882020p1: 
UUID="pfJiP3-HCgP-gCyQ-UIzT-akGk-vRpV-aySGZ2" TYPE="LVM2_member"
/dev/mapper/eui.0025385881b40f60p1: 
UUID="Q0fyzN-9q0s-WDLe-r0IA-MFY0-tose-yzZeu2" TYPE="LVM2_member"

/dev/mapper/Samsung_SSD_850_EVO_1TB_S21CNXAG615134H: PTTYPE="dos"
/dev/mapper/Samsung_SSD_850_EVO_1TB_S21CNXAG615134H1: 
UUID="lQrtPt-nx0u-P6Or-f2YW-sN2o-jK9I-gp7P2m" TYPE="LVM2_member"
/dev/mapper/vg_gluster_ssd-lv_gluster_ssd: 
UUID="890feffe-c11b-4c01-b839-a5906ab39ecb" TYPE="vdo"
/dev/mapper/vg_gluster_nvme1-lv_gluster_nvme1: 
UUID="7049fd2a-788d-44cb-9dc5-7b4c0ee309fb" TYPE="vdo"
/dev/mapper/vg_gluster_nvme2-lv_gluster_nvme2: 
UUID="2c541b70-32c5-496e-863f-ea68b50e7671" TYPE="vdo"
/dev/mapper/vdo_gluster_ssd: UUID="e59a68d5-2b73-487a-ac5e-409e11402ab5" 
TYPE="xfs"
/dev/mapper/vdo_gluster_nvme1: UUID="d5f53f17-bca1-4cb9-86d5-34a468c062e7" 
TYPE="xfs"
/dev/mapper/vdo_gluster_nvme2: UUID="40a41b5f-be87-4994-b6ea-793cdfc076a4" 
TYPE="xfs"

#2
/dev/nvme0n1: PTTYPE="dos"
/dev/nvme1n1: PTTYPE="dos"
/dev/mapper/eui.6479a71892882020: PTTYPE="dos"
/dev/mapper/eui.6479a71892882020p1: 
UUID="GiBSqT-JJ3r-Tn3X-lzCr-zW3D-F3IE-OpE4Ga" TYPE="LVM2_member"
/dev/mapper/nvme.126f-324831323230303337383138-4144415441205358383030304e50-0001:
 PTTYPE="dos"
/dev/sda: PTTYPE="gpt"
/dev/mapper/nvme.126f-324831323230303337383138-4144415441205358383030304e50-0001p1:
 UUID="JBhj79-Uk0E-DdLE-Ibof-VwBq-T5nZ-F8d57O" TYPE="LVM2_member"
/dev/sdb: PTTYPE="dos"
/dev/mapper/Samsung_SSD_860_EVO_1TB_S3Z8NB0K843638B: PTTYPE="dos"
/dev/mapper/Samsung_SSD_860_EVO_1TB_S3Z8NB0K843638B1: 
UUID="6yp5YM-D1be-M27p-AEF5-w1pv-uXNF-2vkiJZ" TYPE="LVM2_member"
/dev/mapper/vg_gluster_ssd-lv_gluster_ssd: 
UUID="9643695c-0ace-4cba-a42c-3f337a7d5133" TYPE="vdo"
/dev/mapper/vg_gluster_nvme2-lv_gluster_nvme2: 
UUID="79f5bacc-cbe7-4b67-be05-414f68818f41" TYPE="vdo"
/dev/mapper/vg_gluster_nvme1-lv_gluster_nvme1: 
UUID="2438a550-5fb4-48f4-a5ef-5cff5e7d5ba8" TYPE="vdo"
/dev/mapper/vdo_gluster_ssd: UUID="5bb67f61-9d14-4d0b-8aa4-ae3905276797" 
TYPE="xfs"
/dev/mapper/vdo_gluster_nvme1: UUID="732f939c-f133-4e48-8dc8-c9d21dbc0853" 
TYPE="xfs"
/dev/mapper/vdo_gluster_nvme2: UUID="f55082ca-1269-4477-9bf8-7190f1add9ef" 
TYPE="xfs"

#3
/dev/nvme1n1: UUID="8f1dc44e-f35f-438a-9abc-54757fd7ef32" TYPE="vdo"
/dev/nvme0n1: PTTYPE="dos"
/dev/mapper/nvme.c0a9-313931304531454644323630-4354353030503153534438-0001: 
UUID="8f1dc44e-f35f-438a-9abc-54757fd7ef32" TYPE="vdo"
/dev/mapper/eui.6479a71892882020: PTTYPE="dos"
/dev/mapper/eui.6479a71892882020p1: 
UUID="FwBRJJ-ofHI-1kHq-uEf1-H3Fn-SQcw-qWYvmL" TYPE="LVM2_member"
/dev/sda: PTTYPE="gpt"
/dev/mapper/Samsung_SSD_850_EVO_1TB_S2RENX0J302798A: PTTYPE="gpt"
/dev/mapper/Samsung_SSD_850_EVO_1TB_S2RENX0J302798A1: 
UUID="weCmOq-VZ1a-Itf5-SOIS-AYLp-Ud5N-S1H2bR" TYPE="LVM2_member" 
PARTUUID="920ef5fd-e525-4cf0-99d5-3951d3013c19"
/dev/mapper/vg_gluster_ssd-lv_gluster_ssd: 
UUID="fbaffbde-74f0-4e4a-9564-64ca84398cde" TYPE="vdo"
/dev/mapper/vg_gluster_nvme2-lv_gluster_nvme2: 
UUID="ae0bd2ad-7da9-485b-824a-72038571c5ba" TYPE="vdo"
/dev/mapper/vdo_gluster_ssd: UUID="f0f56784-bc71-46c7-8bfe-6b71327c87c9" 
TYPE="xfs"
/dev/mapper/vdo_gluster_nvme1: UUID="0ddc1180-f228-4209-82f1-1607a46aed1f" 
TYPE="xfs"
/dev/mapper/vdo_gluster_nvme2: UUID="bcb7144a-6ce0-4b3f-9537-f465c46d4843" 
TYPE="xfs"

I don't have any errors on mount until I reboot, and once I reboot it takes 
~6hrs for everything to work 100% since I have to delete the mount entries out 
of fstab for the 3 gluster volumes and reboot.  I'd rather wait until the next 
update to do that.
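
An interim alternative to deleting the entries, sketched here on the assumption
that the bricks sit on VDO as the blkid output above shows (the device and
mount point are examples; the key parts are nofail, which keeps a failed mount
from dropping the node into emergency mode, and the explicit dependency on
vdo.service):

  /dev/mapper/vdo_gluster_ssd  /gluster_bricks/ssd  xfs  defaults,nofail,x-systemd.requires=vdo.service,x-systemd.device-timeout=90  0 0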

I don't have a variable file or playbook since I made the storage manually; I 
stopped using the playbook since at that point I couldn't enable RDMA or 
over-provision the disks correctly unless I made them manually.  But as I said, 
this is something in 4.3.3: if I go back to 4.3.2 I can reboot with no problem.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/EDGJVIYPMHN5HYARBNCN36NRSTKMSLLW/


[ovirt-users] Re: Scale out ovirt 4.3 (from 3 to 6 or 9 nodes) with hyperconverged setup and Gluster

2019-05-21 Thread Adrian Quintero
Awesome, thanks! And yes, I agree, this is a great project!

I will now continue to scale the cluster from 3 to 6 nodes, including the
storage... I will let y'all know how it goes and post the steps, as I have
only seen examples of 3 hosts but not steps to go from 3 to 6.

regards,

AQ


On Tue, May 21, 2019 at 1:06 PM Strahil  wrote:

> > EUREKA: After doing the above I was able to get past the filter issues,
> however I am still concerned if during a reboot the disks might come up
> differently. For example /dev/sdb might come up as /dev/sdx...
>
> Even if they change, you don't have to worry, as each PV contains
> LVM metadata (including the VG configuration) which is read by LVM on boot
> (actually, everything that is not filtered out by the LVM filter is scanned
> like that).
> Once all PVs are available, the VG is activated and then the LVs are also
> activated.
>
> > I am trying to make sure this setup is always the same as we want to
> move this to production, however it seems I still don't have the full hang of
> it and the RHV 4.1 course is way too old :)
> >
> > Thanks again for helping out with this.
>
> It's a plain KVM with some management layer.
>
> Just a hint:
> Get your HostedEngine's configuration xml from the vdsm log (for
> emergencies) and another copy with reverse boot order  where DVD is booting
> first.
> Also get the xml for the ovirtmgmt network.
>
> It helped me a lot of times  when I wanted to recover my HostedEngine.
> I'm too lazy to rebuild it.
>
> Hint2:
> Vdsm logs contain each VM's configuration xml when the VMs are powered on.
>
> Hint3:
> Get regular backups of the HostedEngine and patch it from time to time.
> I would go in prod as follows:
> Let's say you are on 4.2.8
> Next step would be to go to 4.3.latest and then to 4.4.latest .
>
> A test cluster (even in VMs ) is also benefitial.
>
> Despite the hiccups I have stumbled upon, I think that the project is
> great.
>
> Best Regards,
> Strahil Nikolov
>


-- 
Adrian Quintero
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/NTN2XEMCD2MNU2J5HWZK4HL36LMVUQ6Q/


[ovirt-users] Re: Scale out ovirt 4.3 (from 3 to 6 or 9 nodes) with hyperconverged setup and Gluster

2019-05-21 Thread Strahil
> EUREKA: After doing the above I was able to get past the filter issues, 
> however I am still concerned if during a reboot the disks might come up 
> differently. For example /dev/sdb might come up as /dev/sdx...

Even if they change, you don't have to worry, as each PV contains LVM 
metadata (including the VG configuration) which is read by LVM on boot (actually, 
everything that is not filtered out by the LVM filter is scanned like that).
Once all PVs are available, the VG is activated and then the LVs are also 
activated.
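
To illustrate, the PV/VG/LV assembly can be verified at any time with the
read-only LVM reporting commands; nothing here depends on the /dev/sdX name:

  pvs -o pv_name,pv_uuid,vg_name
  vgs -o vg_name,pv_count,lv_count,vg_size
  lvs -o lv_name,vg_name,lv_path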


> I am trying to make sure this setup is always the same as we want to move 
> this to production, however it seems I still don't have the full hang of it and 
> the RHV 4.1 course is way too old :)
>
> Thanks again for helping out with this.


It's a plain KVM with some management layer.

Just a hint:
Get your HostedEngine's configuration xml from the vdsm log (for emergencies) 
and another copy with reversed boot order where the DVD boots first.
Also get the xml for the ovirtmgmt network.

It helped me a lot of times  when I wanted to recover my HostedEngine.
I'm too lazy to rebuild it.
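
One hedged way to grab those XMLs from a running host, instead of fishing them
out of vdsm.log (the read-only libvirt connection normally needs no
credentials; the domain and network names below are the usual ones but may
differ, so check "virsh -r list --all" and "virsh -r net-list" first):

  virsh -r dumpxml HostedEngine > /root/HostedEngine.xml
  virsh -r net-dumpxml vdsm-ovirtmgmt > /root/ovirtmgmt-net.xml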

Hint2:
Vdsm logs contain each VM's configuration xml when the VMs are powered on.

Hint3:
Get regular backups of the HostedEngine and patch it from time to time.
I would go in prod as follows:
Let's say you are on 4.2.8
Next step would be to go to 4.3.latest and then to 4.4.latest .
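
A minimal sketch of taking such a backup from inside the engine VM (file names
are just examples):

  engine-backup --mode=backup --scope=all \
    --file=/root/engine-backup-$(date +%F).tar.gz \
    --log=/root/engine-backup-$(date +%F).log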

A test cluster (even in VMs) is also beneficial.

Despite the hiccups I have stumbled upon, I think that the project is great.

Best Regards,
Strahil Nikolov
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/MR4A7I4OOARCEEFUX4LKKW6CT3UGXNUM/


[ovirt-users] Re: Migrating self-HostedEngine from NFS to iSCSI

2019-05-21 Thread Simone Tiraboschi
On Tue, May 21, 2019 at 3:21 PM Miha Verlic  wrote:

> Hello,
>
> I have a few questions regarding migration of HostedEngine. Currently I
> have a cluster of 3 oVirt 4.3.3 nodes, all three of them are capable of
> running HE and I can freely migrate HostedEngine and regular VMs between
> them. However I deployed HostedEngine storage on rather edgy NFS server
> and I would like to migrate it to iSCSI based storage with multipathing.
> Quite a few VMs are already running on cluster and are using iSCSI data
> storage.
>
> Documentation is rather chaotic and fragmented, but from what I gathered
> the path of migration is something like:
>
> - place one host (#1), the "failover" host, into maintenance mode prior
> to backup
> - export configuration with engine-backup
> - set global maintenance mode on all hosts
>

Hi,
fine up to here; then:
- copy the backup file to the host you are going to use for the restore
- run something like hosted-engine --deploy
--restore-from-file=/root/engine-backup.tar.gz
- when the tool asks about the HE storage domain, provide the details to
create a new empty one
- once done, connect to the engine, set host 2 into maintenance mode and
reinstall it from the engine, choosing to redeploy hosted-engine
- do the same for host 3
- at the end the previous hosted-engine storage domain will still be
visible (but renamed) in the engine; eventually migrate out the other VM disks
created there; once ready you can delete it
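
Condensed into commands, that flow looks roughly like this (file names are
illustrative; the deploy script will prompt for the new iSCSI storage details):

  # on the current engine VM
  engine-backup --mode=backup --scope=all --file=/root/engine-backup.tar.gz --log=/root/backup.log
  # copy the file to the host chosen for the restore, then on that host:
  hosted-engine --deploy --restore-from-file=/root/engine-backup.tar.gz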


> - install ovirt engine on that host (#1) (already installed, since this
> is HE capable host)
> - restore engine configuration using engine-backup
> - run engine-setup with new parameters regarding storage
> - after engine-setup, log into admin portal and remove old host (#1)
> - redeploy hosts #2 and #3
>
> Last two steps are a bit confusing as I'm not sure how removing old
> failover host on which new HE is running would work. Also not
> understanding the part where hosts 2 and 3 are described as
> unrecoverable (but with running VMs, which I'd have to live migrate to
> other hosts - how, if they're not operational?).
>
> Few other things:
>
> - Should I first remove & re-add host #1 without HE already deployed on
> host?
>
> - Should I set global maintenance mode on all hosts before migration?
> I'm guessing this is required if I want to prevent HE being started on
> random host during transition...
>
> - Which host should be selected as SPM during the transition phase?
>
> - How can I configure iSCSI multipathing? Self-hosted engine
> documentation mentions Multipath Helper tool, however I cannot find any
> info about it. Is this tool freely available or only part of a RHEL
> subscription?
>
> - Can I configure existing iSCSI Domain which already hosts some VMs as
> HE storage? Or do I have to assign extra LUN/target exclusively for HE?
>
> Cheers
> --
> Miha
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/R2HRFJE5UDRIA5RPEA3NO6UL6B2LAUZF/
>


-- 

Simone Tiraboschi

He / Him / His

Principal Software Engineer

Red Hat 

stira...@redhat.com



___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/BRIUJ2LM3LRNNL6QDDLZXTKEA5XIO5BG/


[ovirt-users] Re: Scale out ovirt 4.3 (from 3 to 6 or 9 nodes) with hyperconverged setup and Gluster

2019-05-21 Thread Adrian Quintero
Sac,
*To answer some of your questions:*
*fdisk -l:*
[root@host1 ~]# fdisk -l /dev/sdb
Disk /dev/sde: 480.1 GB, 480070426624 bytes, 937637552 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 262144 bytes / 262144 bytes

[root@host1 ~]# fdisk -l /dev/sdc

Disk /dev/sdc: 3000.6 GB, 3000559427584 bytes, 5860467632 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 262144 bytes / 262144 bytes

[root@host1 ~]# fdisk -l /dev/sdd

Disk /dev/sdd: 3000.6 GB, 3000559427584 bytes, 5860467632 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 262144 bytes / 262144 bytes



*1) I did wipefs on all of /dev/sdb,c,d,e*
*2) I did not zero out the disks as I had done it through the controller.*

*3) cat /proc/partitions:*
[root@host1 ~]# cat /proc/partitions
major minor  #blocks  name

   8        0  586029016 sda
   8        1    1048576 sda1
   8        2  584978432 sda2
   8       16 2930233816 sdb
   8       32 2930233816 sdc
   8       48 2930233816 sdd
   8       64  468818776 sde



*4) grep filter /etc/lvm/lvm.conf (I did not modify the  lvm.conf file)*
[root@host1 ~]# grep "filter =" /etc/lvm/lvm.conf
# filter = [ "a|.*/|" ]
# filter = [ "r|/dev/cdrom|" ]
# filter = [ "a|loop|", "r|.*|" ]
# filter = [ "a|loop|", "r|/dev/hdc|", "a|/dev/ide|", "r|.*|" ]
# filter = [ "a|^/dev/hda8$|", "r|.*/|" ]
# filter = [ "a|.*/|" ]
# global_filter = [ "a|.*/|" ]
# mlock_filter = [ "locale/locale-archive", "gconv/gconv-modules.cache" ]




*What I did to get it working:*

I re-installed my first 3 hosts using
"ovirt-node-ng-installer-4.3.3-2019041712.el7.iso" and made sure I zeroed
the disks from within the controller, then I performed the following steps:

1.- Modified the blacklist section of /etc/multipath.conf to this:
blacklist {
 #   protocol "(scsi:adt|scsi:sbp)"
devnode "*"
}
2.-Made sure the second line of /etc/multipath.conf has:
# VDSM PRIVATE
3.-Increased /var/log to 15GB
4.-Rebuilt initramfs, rebooted
5.-wipefs -a /dev/sdb /dev/sdc /dev/sdd /dev/sde
6.-Started the hyperconverged setup wizard and added
*"gluster_features_force_varlogsizecheck: false"* to the "vars:" section of
the generated Ansible inventory file (*/etc/ansible/hc_wizard_inventory.yml*),
as it was complaining about the /var/log messages LV.

*EUREKA: *After doing the above I was able to get past the filter issues;
however, I am still concerned that during a reboot the disks might come up
differently. For example, /dev/sdb might come up as /dev/sdx...
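
One way to keep track of which physical disk ends up behind which /dev/sdX
after a reboot is to record the persistent identifiers once; these read-only
commands are safe to run at any time:

  ls -l /dev/disk/by-id/ | grep -v -- -part   # stable WWN/serial names -> current sdX
  lsblk -o NAME,SIZE,WWN,SERIAL,MOUNTPOINT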


I am trying to make sure this setup is always the same as we want to move
this to production, however it seems I still don't have the full hang of it
and the RHV 4.1 course is way too old :)

Thanks again for helping out with this.



-AQ




On Tue, May 21, 2019 at 3:29 AM Sachidananda URS  wrote:

>
>
> On Tue, May 21, 2019 at 12:16 PM Sahina Bose  wrote:
>
>>
>>
>> On Mon, May 20, 2019 at 9:55 PM Adrian Quintero 
>> wrote:
>>
>>> Sahina,
>>> Yesterday I started with a fresh install, I completely wiped clean all
>>> the disks, recreated the arrays from within my controller of our DL380 Gen
>>> 9's.
>>>
>>> OS: RAID 1 (2x600GB HDDs): /dev/sda// Using ovirt node 4.3.3.1 iso.
>>> engine and VMSTORE1: JBOD (1x3TB HDD):/dev/sdb
>>> DATA1: JBOD (1x3TB HDD): /dev/sdc
>>> DATA2: JBOD (1x3TB HDD): /dev/sdd
>>> Caching disk: JOBD (1x440GB SDD): /dev/sde
>>>
>>> *After the OS install on the first 3 servers and setting up ssh keys,  I
>>> started the Hyperconverged deploy process:*
>>> 1.-Logged int to the first server http://host1.example.com:9090
>>> 2.-Selected Hyperconverged, clicked on "Run Gluster Wizard"
>>> 3.-Followed the wizard steps (Hosts, FQDNs, Packages, Volumes, Bricks,
>>> Review)
>>> *Hosts/FQDNs:*
>>> host1.example.com
>>> host2.example.com
>>> host3.example.com
>>> *Packages:*
>>> *Volumes:*
>>> engine:replicate:/gluster_bricks/engine/engine
>>> vmstore1:replicate:/gluster_bricks/vmstore1/vmstore1
>>> data1:replicate:/gluster_bricks/data1/data1
>>> data2:replicate:/gluster_bricks/data2/data2
>>> *Bricks:*
>>> engine:/dev/sdb:100GB:/gluster_bricks/engine
>>> vmstore1:/dev/sdb:2600GB:/gluster_bricks/vmstrore1
>>> data1:/dev/sdc:2700GB:/gluster_bricks/data1
>>> data2:/dev/sdd:2700GB:/gluster_bricks/data2
>>> LV Cache:
>>> /dev/sde:400GB:writethrough
>>> 4.-After I hit deploy on the last step of the "Wizard" that is when I
>>> get the disk filter error.
>>> TASK [gluster.infra/roles/backend_setup : Create volume groups]
>>> 
>>> failed: [vmm10.virt.iad3p] (item={u'vgname': u'gluster_vg_sdb',
>>> u'pvname': u'/dev/sdb'}) => {"changed": false, "err": "  Device /dev/sdb
>>> excluded by a filter.\n", "item": {"pvname": "/dev/sdb", "vgname":
>>> "gluster_vg_sdb"}, "msg": "Creating physical volume '/dev/sdb' failed",
>>> "rc": 5}
>>> failed: [vmm12.virt.iad3p] 

[ovirt-users] Re: Scale out ovirt 4.3 (from 3 to 6 or 9 nodes) with hyperconverged setup and Gluster

2019-05-21 Thread Strahil
Thanks for the clarification.
It seems that my nvme (used by vdo) is not locked.
I will check again before opening a bug.

Best Regards,
Strahil Nikolov

On May 21, 2019 09:52, Sahina Bose wrote:
>
>
>
> On Tue, May 21, 2019 at 2:36 AM Strahil Nikolov  wrote:
>>
>> Hey Sahina,
>>
>> it seems that almost all of my devices are locked - just like Fred's.
>> What exactly does it mean - I don't have any issues with my bricks/storage 
>> domains.
>
>
>
> If the devices show up as locked - it means the disk cannot be used to create 
> a brick. This is when the disk either already has a filesystem or is in use.
> But if the device is a clean device and it still shows up as locked - this 
> could be a bug with how python-blivet/vdsm reads this.
>
> The code to check is implemented as:
> def _canCreateBrick(device):
>     # usable as a brick only if the device exists, has no child devices,
>     # carries no format/filesystem signature, is not a mountable filesystem,
>     # and is not a CD-ROM or an LVM VG/thin pool/LV
>     if not device or device.kids > 0 or device.format.type or \
>        hasattr(device.format, 'mountpoint') or \
>        device.type in ['cdrom', 'lvmvg', 'lvmthinpool', 'lvmlv', 'lvmthinlv']:
>         return False
>     return True
>
>>
>> Best Regards,
>> Strahil Nikolov
>>
>> On Monday, 20 May 2019, 14:56:11 GMT+3, Sahina Bose
>> wrote:
>>
>>
>> To scale existing volumes - you need to add bricks and run rebalance on the 
>> gluster volume so that data is correctly redistributed as Alex mentioned.
>> We do support expanding existing volumes as the bug 
>> https://bugzilla.redhat.com/show_bug.cgi?id=1471031 has been fixed
>>
>> As to procedure to expand volumes:
>> 1. Create bricks from UI - select Host -> Storage Device -> Storage device. 
>> Click on "Create Brick"
>> If the device is shown as locked, make sure there's no signature on device.  
>> If multipath entries have been created for local devices, you can blacklist 
>> those devices in multipath.conf and restart multipath.
>> (If you see device as locked even after you do this -please report back).
>> 2. Expand volume using Volume -> Bricks -> Add Bricks, and select the 3 
>> bricks created in previous step
>> 3. Run Rebalance on the volume. Volume -> Rebalance.
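
For reference, the CLI equivalent of steps 2 and 3 looks like this (the volume
name follows the naming used earlier in the thread; the three new host names
and brick paths are placeholders):

  gluster volume add-brick data1 replica 3 \
    host4:/gluster_bricks/data1/data1 \
    host5:/gluster_bricks/data1/data1 \
    host6:/gluster_bricks/data1/data1
  gluster volume rebalance data1 start
  gluster volume rebalance data1 status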
>>
>>
>> On Thu, May 16, 2019 at 2:48 PM Fred Rolland  wrote:
>>>
>>> Sahina,
>>> Can someone from your team review the steps done by Adrian?
>>> Thanks,
>>> Freddy
>>>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/LHPXUDT3ZSUYNIH54QOTNMUEGYZGSCTM/


[ovirt-users] Re: [ovirt-announce] Re: [ANN] oVirt 4.3.4 First Release Candidate is now available

2019-05-21 Thread Strahil
Dear Krutika,

Yes I did but I use 6 ports (1 gbit/s each) and this is the reason that reads 
get slower.
Do you know a way to force gluster to open more connections (client to server & 
server to server)?

Thanks for the detailed explanation.

Best Regards,
Strahil Nikolov

On May 21, 2019 08:36, Krutika Dhananjay wrote:
>
> So in our internal tests (with nvme ssd drives, 10g n/w), we found read 
> performance to be better with choose-local 
> disabled in hyperconverged setup.  See 
> https://bugzilla.redhat.com/show_bug.cgi?id=1566386 for more information.
>
> With choose-local off, the read replica is chosen randomly (based on hash 
> value of the gfid of that shard).
> And when it is enabled, the reads always go to the local replica.
> We attributed better performance with the option disabled to bottlenecks in 
> gluster's rpc/socket layer. Imagine all read
> requests lined up to be sent over the same mount-to-brick connection as 
> opposed to (nearly) randomly getting distributed
> over three (because replica count = 3) such connections. 
>
> Did you run any tests that indicate "choose-local=on" is giving better read 
> perf as opposed to when it's disabled?
>
> -Krutika
>
> On Sun, May 19, 2019 at 5:11 PM Strahil Nikolov  wrote:
>>
>> Ok,
>>
>> so it seems that Darell's case and mine are different as I use vdo.
>>
>> Now I have destroyed Storage Domains, gluster volumes and vdo and recreated 
>> again (4 gluster volumes on a single vdo).
>> This time vdo has '--emulate512=true' and no issues have been observed.
>>
>> Gluster volume options before 'Optimize for virt':
>>
>> Volume Name: data_fast
>> Type: Replicate
>> Volume ID: 378804bf-2975-44d8-84c2-b541aa87f9ef
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x (2 + 1) = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: gluster1:/gluster_bricks/data_fast/data_fast
>> Brick2: gluster2:/gluster_bricks/data_fast/data_fast
>> Brick3: ovirt3:/gluster_bricks/data_fast/data_fast (arbiter)
>> Options Reconfigured:
>> transport.address-family: inet
>> nfs.disable: on
>> performance.client-io-threads: off
>> cluster.enable-shared-storage: enable
>>
>> Gluster volume after 'Optimize for virt':
>>
>> Volume Name: data_fast
>> Type: Replicate
>> Volume ID: 378804bf-2975-44d8-84c2-b541aa87f9ef
>> Status: Stopped
>> Snapshot Count: 0
>> Number of Bricks: 1 x (2 + 1) = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: gluster1:/gluster_bricks/data_fast/data_fast
>> Brick2: gluster2:/gluster_bricks/data_fast/data_fast
>> Brick3: ovirt3:/gluster_bricks/data_fast/data_fast (arbiter)
>> Options Reconfigured:
>> network.ping-timeout: 30
>> performance.strict-o-direct: on
>> storage.owner-gid: 36
>> storage.owner-uid: 36
>> server.event-threads: 4
>> client.event-threads: 4
>> cluster.choose-local: off
>> user.cifs: off
>> features.shard: on
>> cluster.shd-wait-qlength: 1
>> cluster.shd-max-threads: 8
>> cluster.locking-scheme: granular
>> cluster.data-self-heal-algorithm: full
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> cluster.eager-lock: enable
>> network.remote-dio: off
>> performance.low-prio-threads: 32
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> transport.address-family: inet
>> nfs.disable: on
>> performance.client-io-threads: on
>> cluster.enable-shared-storage: enable
>>
>> After that adding the volumes as storage domains (via UI) worked without any 
>> issues.
>>
>> Can someone clarify why we have now 'cluster.choose-local: off' when in 
>> oVirt 4.2.7 (gluster v3.12.15) we didn't have that ?
>> I'm using storage that is faster than network and reading from local brick 
>> gives very high read speed.
>>
>> Best Regards,
>> Strahil Nikolov
>>
>>
>>
>> On Sunday, 19 May 2019, 9:47:27 AM G…
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/FEDHZUJUB5ODQ34ME4BZP2L73KYUU5CH/


[ovirt-users] Re: VM Windows on 4.2

2019-05-21 Thread MIMMIK _
I always get "No new device drivers found" when I try to select any amd64 
folder in the virtio-win ISO (stable version) from the Windows10 x64 (no matter 
if home/pro/enterprise).
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/PYIUIPWP5OA3EQO3RGIOCCZXUS2REGYG/


[ovirt-users] Migrating self-HostedEngine from NFS to iSCSI

2019-05-21 Thread Miha Verlic
Hello,

I have a few questions regarding migration of HostedEngine. Currently I
have a cluster of 3 oVirt 4.3.3 nodes, all three of them are capable of
running HE and I can freely migrate HostedEngine and regular VMs between
them. However I deployed HostedEngine storage on rather edgy NFS server
and I would like to migrate it to iSCSI based storage with multipathing.
Quite a few VMs are already running on cluster and are using iSCSI data
storage.

Documentation is rather chaotic and fragmented, but from what I gathered
the path of migration is something like:

- place one host (#1), the "failover" host, into maintenance mode prior
to backup
- export configuration with engine-backup
- set global maintenance mode on all hosts
- install ovirt engine on that host (#1) (already installed, since this
is HE capable host)
- restore engine configuration using engine-backup
- run engine-setup with new parameters regarding storage
- after engine-setup, log into admin portal and remove old host (#1)
- redeploy hosts #2 and #3

The last two steps are a bit confusing, as I'm not sure how removing the old
failover host on which the new HE is running would work. I also don't
understand the part where hosts 2 and 3 are described as
unrecoverable (but with running VMs, which I'd have to live migrate to
other hosts - how, if they're not operational?).

Few other things:

- Should I first remove & re-add host #1 without HE already deployed on
host?

- Should I set global maintenance mode on all hosts before migration?
I'm guessing this is required if I want to prevent HE being started on
random host during transition...

- Which host should be selected as SPM during the transition phase?

- How can I configure iSCSI multipathing? Self-hosted engine
documentation mentions Multipath Helper tool, however I cannot find any
info about it. Is this tool freely available or only part of a RHEL
subscription?

- Can I configure existing iSCSI Domain which already hosts some VMs as
HE storage? Or do I have to assign extra LUN/target exclusively for HE?

Cheers
-- 
Miha
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/R2HRFJE5UDRIA5RPEA3NO6UL6B2LAUZF/


[ovirt-users] hosts becomes NonResponsive

2019-05-21 Thread Jiří Sléžka
Hi,

time to time one of our four ovirt hosts become NonResponsive.

From engine point of view it looks this way (engine.log)

2019-05-21 13:10:30,261+02 ERROR
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-engineScheduled-Thread-95) [] EVENT_ID:
VDS_BROKER_COMMAND_FAILURE(10,802), VDSM ovirt03.net.slu.cz command Get
Host Capabilities failed: Message timeout which can be caused by
communication issues
2019-05-21 13:10:30,261+02 ERROR
[org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring]
(EE-ManagedThreadFactory-engineScheduled-Thread-95) [] Unable to
RefreshCapabilities: VDSNetworkException: VDSGenericException:
VDSNetworkException: Message timeout which can be caused by
communication issues

from host (which is reachable) it looks like (vdsm.log)

2019-05-21 13:10:27,154+0200 INFO  (vmrecovery) [vdsm.api] START
getConnectedStoragePoolsList(options=None) from=internal,
task_id=a1bebf2f-7070-4344-90b7-1d709ba94b5c (api:48)
2019-05-21 13:10:27,154+0200 INFO  (vmrecovery) [vdsm.api] FINISH
getConnectedStoragePoolsList return={'poollist': []} from=internal,
task_id=a1bebf2f-7070-4344-90b7-1d709ba94b5c (api:54)
2019-05-21 13:10:27,155+0200 INFO  (vmrecovery) [vds] recovery: waiting
for storage pool to go up (clientIF:709)
2019-05-21 13:10:31,245+0200 INFO  (jsonrpc/4) [api.host] START
getAllVmStats() from=::1,39144 (api:48)
2019-05-21 13:10:31,247+0200 INFO  (jsonrpc/4) [api.host] FINISH
getAllVmStats return={'status': {'message': 'Done', 'code': 0},
'statsList': (suppressed)} from=::1,39144 (api:54)
2019-05-21 13:10:31,249+0200 INFO  (jsonrpc/4) [jsonrpc.JsonRpcServer]
RPC call Host.getAllVmStats succeeded in 0.00 seconds (__init__:312)


hosts are latest CentOS7 (but old AMD Opteron HW), oVirt is 4.3.3.7-1.el7

I cannot track it down to the network layer. We have 4 other RHV hosts on
the same infrastructure and they work well. Any clues as to what is happening?

Thanks in advance,

Jiri Slezka



___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/C7CNMZR75CRUXD4JPUT2YG5WFKDRPSDI/


[ovirt-users] Re: Wrong disk size in UI after expanding iscsi direct LUN

2019-05-21 Thread Bernhard Dick

Hi Nir,

Am 18.05.2019 um 20:48 schrieb Nir Soffer:
On Thu, May 16, 2019 at 6:10 PM Bernhard Dick > wrote:


Hi,

I've extended the size of one of my direct iSCSI LUNs. The VM is seeing
the new size but in the webinterface there is still the old size
reported. Is there a way to update this information? I already took a
look into the list but there are only reports regarding updating the
size the VM sees.


Sounds like you hit this bug:
https://bugzilla.redhat.com/1651939 



The description mentions a workaround using the REST API.

thanks, the workaround using the REST API helped.

  Bernhard


Nir


    Best regards
      Bernhard
___
Users mailing list -- users@ovirt.org 
To unsubscribe send an email to users-le...@ovirt.org

Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct:
https://www.ovirt.org/community/about/community-guidelines/
List Archives:

https://lists.ovirt.org/archives/list/users@ovirt.org/message/54YHISUA66227IAMI2UVPZRIXV54BAKA/


___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/S74IJSVBYB3RVS7RBXB75XQHLPMMUPIC/


[ovirt-users] Re: Wrong disk size in UI after expanding iscsi direct LUN

2019-05-21 Thread Bernhard Dick

Hi,

Am 17.05.2019 um 19:25 schrieb Scott Dickerson:



On Thu, May 16, 2019 at 11:11 AM Bernhard Dick > wrote:


Hi,

I've extended the size of one of my direct iSCSI LUNs. The VM is seeing
the new size but in the webinterface there is still the old size
reported. Is there a way to update this information? I already took a
look into the list but there are only reports regarding updating the
size the VM sees.


What ovirt version?  Which webinterface and view are you checking, Admin 
Portal or VM Portal?

I'm using the Admin Portal. And the version is 4.3.3.7-1.el7.

  Best Regards
Bernhard



    Best regards
      Bernhard
___
Users mailing list -- users@ovirt.org 
To unsubscribe send an email to users-le...@ovirt.org

Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct:
https://www.ovirt.org/community/about/community-guidelines/
List Archives:

https://lists.ovirt.org/archives/list/users@ovirt.org/message/54YHISUA66227IAMI2UVPZRIXV54BAKA/



--
Scott Dickerson
Senior Software Engineer
RHV-M Engineering - UX Team
Red Hat, Inc

___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/MCY2SDB2N4RNALDMJUMSAU7GHL7MISEX/


[ovirt-users] Re: VM Windows on 4.2

2019-05-21 Thread MIMMIK _
I can't boot a UEFI Windows VM even on oVirt 4.3.

"No boot device found"
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/7DOCEPCTATLURXQWGCU2FZVK4E2E3QPJ/


[ovirt-users] Re: RHEL 8 Template Seal failed

2019-05-21 Thread Liran Rotenberg
I'll copy Michal's answer from another thread:
"Indeed it won’t work for el8 guests. Unfortunately an el7
libguestfs(so any virt- tool) can’t work with el8 filesystems. And
ovirt still uses el7 hosts
That’s going to be a limitation until 4.4

Thanks,
michal"

If you wish to seal an el8 guest, you can seal it manually and create a
template from it.
Liran.
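
A rough sketch of manual sealing inside the el8 guest before shutting it down;
this only approximates part of what virt-sysprep would do, so adjust it to your
environment:

  # remove host-specific identity so each VM created from the template
  # regenerates its own
  rm -f /etc/ssh/ssh_host_*
  truncate -s 0 /etc/machine-id    # an empty machine-id is regenerated on next boot
  rm -f /root/.bash_history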

On Fri, May 17, 2019 at 7:46 AM Vinícius Ferrão
 wrote:
>
> Hello,
>
> I’m trying to seal a RHEL8 template but the operation is failing.
>
> Here’s the relevant information from engine.log:
>
> 2019-05-17 01:30:31,153-03 INFO  
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetHostJobsVDSCommand] 
> (EE-ManagedThreadFactory-engineScheduled-Thread-58) 
> [91e1acd6-efc5-411b-8c76-970def4ebbbe] FINISH, GetHostJobsVDSCommand, return: 
> {b80c0bbd-25b8-4007-9b91-376cb0a18e30=HostJobInfo:{id='b80c0bbd-25b8-4007-9b91-376cb0a18e30',
>  type='virt', description='seal_vm', status='failed', progress='null', 
> error='VDSError:{code='GeneralException', message='General Exception: 
> ('Command [\'/usr/bin/virt-sysprep\', \'-a\', 
> u\'/rhev/data-center/mnt/192.168.10.6:_mnt_pool0_ovirt_vm/d19456e4-0051-456e-b33c-57348a78c2e0/images/1ecdfbfc-1c22-452f-9a53-2159701549c8/f9de3eae-f475-451b-b587-f6a1405036e8\']
>  failed with rc=1 out=\'[   0.0] Examining the guest ...\\nvirt-sysprep: 
> warning: mount_options: mount exited with status 32: mount: \\nwrong fs type, 
> bad option, bad superblock on /dev/mapper/rhel_rhel8-root,\\n   missing 
> codepage or helper program, or other error\\n\\n   In some cases useful 
> info is found in syslog - try\\n   dmesg | tail or so. 
> (ignored)\\nvirt-sysprep: warning: mount_options: mount: /boot: mount point 
> is not a \\ndirectory (ignored)\\nvirt-sysprep: warning: mount_options: 
> mount: /boot/efi: mount point is not \\na directory (ignored)\\n[  17.9] 
> Performing "abrt-data" ...\\n\' err="virt-sysprep: error: libguestfs error: 
> glob_expand: glob_expand_stub: you \\nmust call \'mount\' first to mount the 
> root filesystem\\n\\nIf reporting bugs, run virt-sysprep with debugging 
> enabled and include the \\ncomplete output:\\n\\n  virt-sysprep -v -x 
> [...]\\n"',)'}'}}, log id: 1bbb34bf
>
> I’m not sure what’s wrong or missing. The VM image is using UEFI with Secure 
> Boot, so the standard UEFI partition is in place.
>
> I've found something on Bugzilla but it does not seem to be related:
> https://bugzilla.redhat.com/show_bug.cgi?id=1671895
>
> Thanks,
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/WILFBK6SOTKJP25PAS4JODNNOUFW7HUQ/
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/WPFWL3U3OORLXYH37EEYQES6RK6L463V/


[ovirt-users] Re: VM pools broken in 4.3

2019-05-21 Thread Lucie Leistnerova

Hi Rik,

I also tried a USB-enabled pool and other combinations, and unfortunately 
I did not reproduce the problem.


Maybe Michal can say where to look further.

On 5/21/19 9:29 AM, Rik Theys wrote:


Hi,

I've now created a new pool without USB support. After creating the 
pool, I've restarted ovirt-engine as I could not start the VM's from 
the pool (it indicated a similar request was already running).


When ovirt-engine was restarted, I logged into the VM Portal and 
was able to launch a VM from the new pool. Once it had booted, I powered it 
down (from within the VM). Once the VM Portal UI indicated the VM was 
down, I clicked Run again to launch a new instance from the pool. The 
same error as before came up, saying there is no VM available (which is 
incorrect, as the pool is larger than one VM and no VMs are running at 
that point).


In the log, the errors and warnings below are logged (I stripped the 
INFO lines). The engine seems to try to release a lock that does not exist, 
or references it with the wrong id. Is there a way to trace which locks 
are currently held? Are they stored persistently somewhere in a way that 
may be causing my issues?


Regards,

Rik

2019-05-21 09:16:39,342+02 ERROR 
[org.ovirt.engine.core.bll.GetPermissionsForObjectQuery] (default 
task-2) [5e16cd77-01c8-43ed-80d2-c85452732570] Query execution failed 
due to insufficient permissions.
2019-05-21 09:16:39,345+02 ERROR 
[org.ovirt.engine.api.restapi.resource.AbstractBackendResource] 
(default task-2) [] Operation Failed: query execution failed due to 
insufficient permissions.

[The log entry continued with the full libvirt domain XML for VM 'testpool-1'
(id 6489cc72-8ca5-4943-901c-bbd405bdac68): memory settings of 8388608 and
33554432, 16 vCPUs, a Haswell-noTSX CPU model, the oVirt and QEMU guest-agent
channels, graphics/video devices, a /dev/urandom RNG device, cluster version
4.3, and a disk backed by
/rhev/data-center/mnt/blockSD/4194c70d-5b7e-441f-af6b-7d8754e89572/images/9651890f-9b0c-4857-abae-77b8b543a897/8690340a-8d4d-4d04-a8ab-b18fe6cbb78b.
The XML markup was mangled in the archive, so the dump is summarized here.]

2019-05-21 09:16:44,399+02 WARN 
[org.ovirt.engine.core.bll.lock.InMemoryLockManager] (default task-9) 
[8a9f8c3f-e441-4121-aa7f-5b2cf26da6bb] Trying to release exclusive 
lock which does not exist, lock key: 
'a5bed59c-d2fe-4fe4-bff7-52efe089ebd6USER_VM_POOL'
[The domain XML for 'testpool-1' was then repeated together with its ovirt-vm
metadata (namespace http://ovirt.org/vm/1.0): cluster version 4.3,
resume_behavior auto_resume, a reference to 'ea-students', and the volume
chain on storage domain 4194c70d-5b7e-441f-af6b-7d8754e89572, image
9651890f-9b0c-4857-abae-77b8b543a897, with volumes
5f606a1e-3377-45b7-91d0-6398f7694c45 and 8690340a-8d4d-4d04-a8ab-b18fe6cbb78b
and leases under /dev/4194c70d-5b7e-441f-af6b-7d8754e89572/leases. This
markup was likewise mangled in the archive and is omitted here.]

[ovirt-users] Re: Scale out ovirt 4.3 (from 3 to 6 or 9 nodes) with hyperconverged setup and Gluster

2019-05-21 Thread Sachidananda URS
On Tue, May 21, 2019 at 12:16 PM Sahina Bose  wrote:

>
>
> On Mon, May 20, 2019 at 9:55 PM Adrian Quintero 
> wrote:
>
>> Sahina,
>> Yesterday I started with a fresh install: I completely wiped all
>> the disks and recreated the arrays from within the controller of our DL380 Gen
>> 9's.
>>
>> OS: RAID 1 (2x600GB HDDs): /dev/sda// Using ovirt node 4.3.3.1 iso.
>> engine and VMSTORE1: JBOD (1x3TB HDD):/dev/sdb
>> DATA1: JBOD (1x3TB HDD): /dev/sdc
>> DATA2: JBOD (1x3TB HDD): /dev/sdd
>> Caching disk: JBOD (1x440GB SSD): /dev/sde
>>
>> *After the OS install on the first 3 servers and setting up ssh keys,  I
>> started the Hyperconverged deploy process:*
>> 1.-Logged in to the first server http://host1.example.com:9090
>> 2.-Selected Hyperconverged, clicked on "Run Gluster Wizard"
>> 3.-Followed the wizard steps (Hosts, FQDNs, Packages, Volumes, Bricks,
>> Review)
>> *Hosts/FQDNs:*
>> host1.example.com
>> host2.example.com
>> host3.example.com
>> *Packages:*
>> *Volumes:*
>> engine:replicate:/gluster_bricks/engine/engine
>> vmstore1:replicate:/gluster_bricks/vmstore1/vmstore1
>> data1:replicate:/gluster_bricks/data1/data1
>> data2:replicate:/gluster_bricks/data2/data2
>> *Bricks:*
>> engine:/dev/sdb:100GB:/gluster_bricks/engine
>> vmstore1:/dev/sdb:2600GB:/gluster_bricks/vmstrore1
>> data1:/dev/sdc:2700GB:/gluster_bricks/data1
>> data2:/dev/sdd:2700GB:/gluster_bricks/data2
>> LV Cache:
>> /dev/sde:400GB:writethrough
>> 4.-After I hit Deploy on the last step of the wizard, I get
>> the disk filter error.
>> TASK [gluster.infra/roles/backend_setup : Create volume groups]
>> 
>> failed: [vmm10.virt.iad3p] (item={u'vgname': u'gluster_vg_sdb',
>> u'pvname': u'/dev/sdb'}) => {"changed": false, "err": "  Device /dev/sdb
>> excluded by a filter.\n", "item": {"pvname": "/dev/sdb", "vgname":
>> "gluster_vg_sdb"}, "msg": "Creating physical volume '/dev/sdb' failed",
>> "rc": 5}
>> failed: [vmm12.virt.iad3p] (item={u'vgname': u'gluster_vg_sdb',
>> u'pvname': u'/dev/sdb'}) => {"changed": false, "err": "  Device /dev/sdb
>> excluded by a filter.\n", "item": {"pvname": "/dev/sdb", "vgname":
>> "gluster_vg_sdb"}, "msg": "Creating physical volume '/dev/sdb' failed",
>> "rc": 5}
>> failed: [vmm11.virt.iad3p] (item={u'vgname': u'gluster_vg_sdb',
>> u'pvname': u'/dev/sdb'}) => {"changed": false, "err": "  Device /dev/sdb
>> excluded by a filter.\n", "item": {"pvname": "/dev/sdb", "vgname":
>> "gluster_vg_sdb"}, "msg": "Creating physical volume '/dev/sdb' failed",
>> "rc": 5}
>> failed: [vmm12.virt.iad3p] (item={u'vgname': u'gluster_vg_sdc',
>> u'pvname': u'/dev/sdc'}) => {"changed": false, "err": "  Device /dev/sdc
>> excluded by a filter.\n", "item": {"pvname": "/dev/sdc", "vgname":
>> "gluster_vg_sdc"}, "msg": "Creating physical volume '/dev/sdc' failed",
>> "rc": 5}
>> failed: [vmm10.virt.iad3p] (item={u'vgname': u'gluster_vg_sdc',
>> u'pvname': u'/dev/sdc'}) => {"changed": false, "err": "  Device /dev/sdc
>> excluded by a filter.\n", "item": {"pvname": "/dev/sdc", "vgname":
>> "gluster_vg_sdc"}, "msg": "Creating physical volume '/dev/sdc' failed",
>> "rc": 5}
>> failed: [vmm11.virt.iad3p] (item={u'vgname': u'gluster_vg_sdc',
>> u'pvname': u'/dev/sdc'}) => {"changed": false, "err": "  Device /dev/sdc
>> excluded by a filter.\n", "item": {"pvname": "/dev/sdc", "vgname":
>> "gluster_vg_sdc"}, "msg": "Creating physical volume '/dev/sdc' failed",
>> "rc": 5}
>> failed: [vmm10.virt.iad3p] (item={u'vgname': u'gluster_vg_sdd',
>> u'pvname': u'/dev/sdd'}) => {"changed": false, "err": "  Device /dev/sdd
>> excluded by a filter.\n", "item": {"pvname": "/dev/sdd", "vgname":
>> "gluster_vg_sdd"}, "msg": "Creating physical volume '/dev/sdd' failed",
>> "rc": 5}
>> failed: [vmm12.virt.iad3p] (item={u'vgname': u'gluster_vg_sdd',
>> u'pvname': u'/dev/sdd'}) => {"changed": false, "err": "  Device /dev/sdd
>> excluded by a filter.\n", "item": {"pvname": "/dev/sdd", "vgname":
>> "gluster_vg_sdd"}, "msg": "Creating physical volume '/dev/sdd' failed",
>> "rc": 5}
>> failed: [vmm11.virt.iad3p] (item={u'vgname': u'gluster_vg_sdd',
>> u'pvname': u'/dev/sdd'}) => {"changed": false, "err": "  Device /dev/sdd
>> excluded by a filter.\n", "item": {"pvname": "/dev/sdd", "vgname":
>> "gluster_vg_sdd"}, "msg": "Creating physical volume '/dev/sdd' failed",
>> "rc": 5}
>>
>> Attached is the generated yml file ( /etc/ansible/hc_wizard_inventory.yml)
>> and the "Deployment Failed" file
>>
>>
>>
>>
>> Also wondering if I hit this bug?
>> https://bugzilla.redhat.com/show_bug.cgi?id=1635614
>>
>>
> +Sachidananda URS  +Gobinda Das  to
> review the inventory file and failures
>

Hello Adrian,

Can you please provide the output of:
# fdisk -l /dev/sdd
# fdisk -l /dev/sdb

I think there could be a stale signature on the disk causing this error.
Some possible solutions to try:
1)
# wipefs -a /dev/sdb
# wipefs -a /dev/sdd

2)
You can zero out the first few sectors of the disk with:

# dd if=/dev/zero of=/dev/sdb bs=1M count=10

3)
Check 
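(The archived message is cut off above. Before wiping anything, a
non-destructive sketch for checking what is already on the disks and what
LVM's filter sees, assuming wipefs, blkid, lsblk and the LVM tools are
available on the hosts:)

  # wipefs /dev/sdb
    (with no options, only lists existing signatures; nothing is erased)
  # blkid -p /dev/sdb /dev/sdc /dev/sdd
  # lsblk -o NAME,TYPE,FSTYPE,MOUNTPOINT /dev/sdb
  # grep -E '^[[:space:]]*(filter|global_filter)' /etc/lvm/lvm.conf
    (shows any active LVM filter that could cause the "Device /dev/sdb
    excluded by a filter" error)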

[ovirt-users] Re: Scale out ovirt 4.3 (from 3 to 6 or 9 nodes) with hyperconverged setup and Gluster

2019-05-21 Thread Sahina Bose
On Tue, May 21, 2019 at 2:36 AM Strahil Nikolov 
wrote:

> Hey Sahina,
>
> it seems that almost all of my devices are locked - just like Fred's.
> What exactly does that mean? I don't have any issues with my bricks/storage
> domains.
>


If a device shows up as locked, it means the disk cannot be used to
create a brick. This happens when the disk either already has a filesystem
or is in use.
But if the device is clean and it still shows up as locked, this
could be a bug in how python-blivet/vdsm reads the device.

The code to check is implemented as:

def _canCreateBrick(device):
    if not device or device.kids > 0 or device.format.type or \
            hasattr(device.format, 'mountpoint') or \
            device.type in ['cdrom', 'lvmvg', 'lvmthinpool', 'lvmlv',
                            'lvmthinlv']:
        return False
    return True
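(A quick way to see which of those conditions applies to a given disk, using
standard tools rather than the blivet API itself, is something like:)

  # lsblk -o NAME,TYPE,FSTYPE,MOUNTPOINT /dev/sdb
    (child devices, a filesystem signature or a mountpoint here map to the
    device.kids / device.format.type / mountpoint checks above)
  # blkid -p /dev/sdb
    (low-level probe for any signature that blivet would report as a format)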


> Best Regards,
> Strahil Nikolov
>
> В понеделник, 20 май 2019 г., 14:56:11 ч. Гринуич+3, Sahina Bose <
> sab...@redhat.com> написа:
>
>
> To scale existing volumes, you need to add bricks and run rebalance on
> the gluster volume so that data is correctly redistributed, as Alex
> mentioned.
> We do support expanding existing volumes now that the bug
> https://bugzilla.redhat.com/show_bug.cgi?id=1471031 has been fixed.
>
> As to procedure to expand volumes:
> 1. Create bricks from the UI: select Host -> Storage Devices, select the
> storage device, and click on "Create Brick".
> If the device is shown as locked, make sure there's no signature on the
> device. If multipath entries have been created for local devices, you can
> blacklist those devices in multipath.conf and restart multipath.
> (If you still see the device as locked after doing this, please report back.)
> 2. Expand the volume using Volume -> Bricks -> Add Bricks, and select the 3
> bricks created in the previous step.
> 3. Run Rebalance on the volume. Volume -> Rebalance.
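(For reference, a sketch of what steps 2 and 3 boil down to on the Gluster
CLI; the volume name and brick paths below are only examples modelled on the
setup described earlier in this thread:)

  # gluster volume add-brick data1 replica 3 \
      host4:/gluster_bricks/data1/data1 \
      host5:/gluster_bricks/data1/data1 \
      host6:/gluster_bricks/data1/data1
  # gluster volume rebalance data1 start
  # gluster volume rebalance data1 status

(Doing it from the oVirt UI as described above keeps the engine's view of the
bricks in sync; the CLI is shown only to illustrate what happens underneath.)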
>
>
> On Thu, May 16, 2019 at 2:48 PM Fred Rolland  wrote:
>
> Sahina,
> Can someone from your team review the steps done by Adrian?
> Thanks,
> Freddy
>
> On Thu, Apr 25, 2019 at 5:14 PM Adrian Quintero 
> wrote:
>
> Ok, I will remove the extra 3 hosts, rebuild them from scratch and
> re-attach them to clear any possible issues and try out the suggestions
> provided.
>
> thank you!
>
> On Thu, Apr 25, 2019 at 9:22 AM Strahil Nikolov 
> wrote:
>
> I have the same locks, despite having blacklisted all local disks:
>
> # VDSM PRIVATE
> blacklist {
> devnode "*"
> wwid Crucial_CT256MX100SSD1_14390D52DCF5
> wwid WDC_WD5000AZRX-00A8LB0_WD-WCC1U0056126
> wwid WDC_WD5003ABYX-01WERA0_WD-WMAYP2335378
> wwid
> nvme.1cc1-324a31313230303131353936-414441544120535838323030504e50-0001
> }
>
> If you have reconfigured multipath, do not forget to rebuild the initramfs
> (dracut -f). It's a Linux issue, not an oVirt one.
>
> In your case you had something like this:
>   /dev/VG/LV
>     /dev/disk/by-id/pvuuid
>       /dev/mapper/multipath-uuid
>         /dev/sdb
>
> Linux will not allow you to work with /dev/sdb while multipath is holding
> the block device.
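(A sketch for confirming that this stacking is what blocks the disk, using
standard tools:)

  # multipath -ll
    (lists the WWIDs multipath has claimed; a single-path local disk showing
    up here is a candidate for blacklisting)
  # lsblk /dev/sdb
    (if an mpath device sits on top of sdb, either use that multipath device
    or blacklist the disk as described above)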
>
> Best Regards,
> Strahil Nikolov
>
> В четвъртък, 25 април 2019 г., 8:30:16 ч. Гринуич-4, Adrian Quintero <
> adrianquint...@gmail.com> написа:
>
>
> Under Compute -> Hosts, select the host that has the locks on /dev/sdb,
> /dev/sdc, etc., select Storage Devices, and there you will see a
> small column with lock icons shown for each row.
>
>
> However, as a workaround on the newly added hosts (3 in total), I had to
> manually modify /etc/multipath.conf and add the following at the end, as
> this is what I noticed on the original 3-node setup.
>
> -
> # VDSM REVISION 1.3
> # VDSM PRIVATE
> # BEGIN Added by gluster_hci role
>
> blacklist {
> devnode "*"
> }
> # END Added by gluster_hci role
> --
> After this I restarted multipath and the lock went away, and I was able to
> configure the new bricks through the UI. However, my concern is what will
> happen if I reboot the server: will the disks be read the same way by the OS?
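(On the reboot question: a sketch of what is usually needed so the blacklist
survives a reboot, in line with the dracut note quoted above; service and
file names are the standard EL7 ones:)

  # systemctl restart multipathd
  # dracut -f
    (rebuilds the initramfs so the updated /etc/multipath.conf, including the
    "# VDSM PRIVATE" marker that stops vdsm from overwriting it, is also used
    during early boot)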
>
> I am also now able to expand the Gluster setup with a new replica 3 volume if
> needed, using http://host4.mydomain.com:9090.
>
>
> thanks again
>
> On Thu, Apr 25, 2019 at 8:00 AM Strahil Nikolov 
> wrote:
>
> In which menu do you see it this way ?
>
> Best Regards,
> Strahil Nikolov
>
> В сряда, 24 април 2019 г., 8:55:22 ч. Гринуич-4, Adrian Quintero <
> adrianquint...@gmail.com> написа:
>
>
> Strahil,
> this is the issue I am seeing now
>
> [image: image.png]
>
> This is through the UI when I try to create a new brick.
>
> So my concern is: if I modify the filters on the OS, what impact will that
> have after the server reboots?
>
> thanks,
>
>
>
> On Mon, Apr 22, 2019 at 11:39 PM Strahil  wrote:
>
> I have edited my multipath.conf to exclude local disks, but you need to
> set '# VDSM PRIVATE' as per the comments in the header of the file.
> Otherwise, use the 

[ovirt-users] Re: Scale out ovirt 4.3 (from 3 to 6 or 9 nodes) with hyperconverged setup and Gluster

2019-05-21 Thread Sahina Bose
On Mon, May 20, 2019 at 9:55 PM Adrian Quintero 
wrote:

> Sahina,
> Yesterday I started with a fresh install: I completely wiped all the
> disks and recreated the arrays from within the controller of our DL380 Gen 9's.
>
> OS: RAID 1 (2x600GB HDDs): /dev/sda// Using ovirt node 4.3.3.1 iso.
> engine and VMSTORE1: JBOD (1x3TB HDD):/dev/sdb
> DATA1: JBOD (1x3TB HDD): /dev/sdc
> DATA2: JBOD (1x3TB HDD): /dev/sdd
> Caching disk: JBOD (1x440GB SSD): /dev/sde
>
> *After the OS install on the first 3 servers and setting up ssh keys,  I
> started the Hyperconverged deploy process:*
> 1.-Logged in to the first server http://host1.example.com:9090
> 2.-Selected Hyperconverged, clicked on "Run Gluster Wizard"
> 3.-Followed the wizard steps (Hosts, FQDNs, Packages, Volumes, Bricks,
> Review)
> *Hosts/FQDNs:*
> host1.example.com
> host2.example.com
> host3.example.com
> *Packages:*
> *Volumes:*
> engine:replicate:/gluster_bricks/engine/engine
> vmstore1:replicate:/gluster_bricks/vmstore1/vmstore1
> data1:replicate:/gluster_bricks/data1/data1
> data2:replicate:/gluster_bricks/data2/data2
> *Bricks:*
> engine:/dev/sdb:100GB:/gluster_bricks/engine
> vmstore1:/dev/sdb:2600GB:/gluster_bricks/vmstrore1
> data1:/dev/sdc:2700GB:/gluster_bricks/data1
> data2:/dev/sdd:2700GB:/gluster_bricks/data2
> LV Cache:
> /dev/sde:400GB:writethrough
> 4.-After I hit Deploy on the last step of the wizard, I get
> the disk filter error.
> TASK [gluster.infra/roles/backend_setup : Create volume groups]
> 
> failed: [vmm10.virt.iad3p] (item={u'vgname': u'gluster_vg_sdb', u'pvname':
> u'/dev/sdb'}) => {"changed": false, "err": "  Device /dev/sdb excluded by a
> filter.\n", "item": {"pvname": "/dev/sdb", "vgname": "gluster_vg_sdb"},
> "msg": "Creating physical volume '/dev/sdb' failed", "rc": 5}
> failed: [vmm12.virt.iad3p] (item={u'vgname': u'gluster_vg_sdb', u'pvname':
> u'/dev/sdb'}) => {"changed": false, "err": "  Device /dev/sdb excluded by a
> filter.\n", "item": {"pvname": "/dev/sdb", "vgname": "gluster_vg_sdb"},
> "msg": "Creating physical volume '/dev/sdb' failed", "rc": 5}
> failed: [vmm11.virt.iad3p] (item={u'vgname': u'gluster_vg_sdb', u'pvname':
> u'/dev/sdb'}) => {"changed": false, "err": "  Device /dev/sdb excluded by a
> filter.\n", "item": {"pvname": "/dev/sdb", "vgname": "gluster_vg_sdb"},
> "msg": "Creating physical volume '/dev/sdb' failed", "rc": 5}
> failed: [vmm12.virt.iad3p] (item={u'vgname': u'gluster_vg_sdc', u'pvname':
> u'/dev/sdc'}) => {"changed": false, "err": "  Device /dev/sdc excluded by a
> filter.\n", "item": {"pvname": "/dev/sdc", "vgname": "gluster_vg_sdc"},
> "msg": "Creating physical volume '/dev/sdc' failed", "rc": 5}
> failed: [vmm10.virt.iad3p] (item={u'vgname': u'gluster_vg_sdc', u'pvname':
> u'/dev/sdc'}) => {"changed": false, "err": "  Device /dev/sdc excluded by a
> filter.\n", "item": {"pvname": "/dev/sdc", "vgname": "gluster_vg_sdc"},
> "msg": "Creating physical volume '/dev/sdc' failed", "rc": 5}
> failed: [vmm11.virt.iad3p] (item={u'vgname': u'gluster_vg_sdc', u'pvname':
> u'/dev/sdc'}) => {"changed": false, "err": "  Device /dev/sdc excluded by a
> filter.\n", "item": {"pvname": "/dev/sdc", "vgname": "gluster_vg_sdc"},
> "msg": "Creating physical volume '/dev/sdc' failed", "rc": 5}
> failed: [vmm10.virt.iad3p] (item={u'vgname': u'gluster_vg_sdd', u'pvname':
> u'/dev/sdd'}) => {"changed": false, "err": "  Device /dev/sdd excluded by a
> filter.\n", "item": {"pvname": "/dev/sdd", "vgname": "gluster_vg_sdd"},
> "msg": "Creating physical volume '/dev/sdd' failed", "rc": 5}
> failed: [vmm12.virt.iad3p] (item={u'vgname': u'gluster_vg_sdd', u'pvname':
> u'/dev/sdd'}) => {"changed": false, "err": "  Device /dev/sdd excluded by a
> filter.\n", "item": {"pvname": "/dev/sdd", "vgname": "gluster_vg_sdd"},
> "msg": "Creating physical volume '/dev/sdd' failed", "rc": 5}
> failed: [vmm11.virt.iad3p] (item={u'vgname': u'gluster_vg_sdd', u'pvname':
> u'/dev/sdd'}) => {"changed": false, "err": "  Device /dev/sdd excluded by a
> filter.\n", "item": {"pvname": "/dev/sdd", "vgname": "gluster_vg_sdd"},
> "msg": "Creating physical volume '/dev/sdd' failed", "rc": 5}
>
> Attached is the generated yml file ( /etc/ansible/hc_wizard_inventory.yml)
> and the "Deployment Failed" file
>
>
>
>
> Also wondering if I hit this bug?
> https://bugzilla.redhat.com/show_bug.cgi?id=1635614
>
>
+Sachidananda URS  +Gobinda Das  to
review the inventory file and failures


>
> Thanks for looking into this.
>
> *Adrian Quintero*
> *adrianquint...@gmail.com  |
> adrian.quint...@rackspace.com *
>
>
> On Mon, May 20, 2019 at 7:56 AM Sahina Bose  wrote:
>
>> To scale existing volumes - you need to add bricks and run rebalance on
>> the gluster volume so that data is correctly redistributed as Alex
>> mentioned.
>> We do support expanding existing volumes as the bug
>> https://bugzilla.redhat.com/show_bug.cgi?id=1471031 has been fixed
>>
>> As to procedure to expand volumes:
>> 1. Create bricks from UI -