[ovirt-users] Re: Upgrading from 4.2.8 to 4.3.3 broke Node NG GlusterFS
After rebooting the node that was not able to mount the gluster volume, things eventually improved. The SPM went away and restarted for the Datacenter, and suddenly node03 was able to mount the gluster volume again. In between I was down to 1/3 active bricks, which makes the gluster volume read-only. I was lucky to still have the Engine on NFS. But anyway... thanks for your thoughts.
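For reference, the read-only phase with only one of three bricks up matches Gluster's default quorum handling on a replica-3 volume. A quick way to check how quorum is configured, sketched here under the assumption that the volume is named vmstore as elsewhere in this thread:

gluster volume get vmstore cluster.quorum-type
gluster volume get vmstore cluster.server-quorum-type
gluster volume status vmstore    # shows how many bricks are currently online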
[ovirt-users] Re: Upgrading from 4.2.8 to 4.3.3 broke Node NG GlusterFS
Fix those disconnected nodes and run a find against a node that has successfully mounted the volume.

Best Regards,
Strahil Nikolov

On Apr 24, 2019 15:31, Andreas Elvers wrote:
> The file handle is stale, so find will display:
> "find: '/rhev/data-center/mnt/glusterSD/node01.infra.solutions.work:_vmstore': Transport endpoint is not connected"
> [...]
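One possible way to re-establish the mount on the disconnected node is sketched below. This is not from the thread: the mount path and volume name are taken from the messages above, and since vdsm manages these mounts, putting the host into maintenance in the engine first is assumed to be the safer order.

# on node03, after putting the host into maintenance in the engine
umount -l '/rhev/data-center/mnt/glusterSD/node01.infra.solutions.work:_vmstore'
# either activate the host again and let vdsm recreate the mount,
# or remount by hand to test that the volume is reachable:
mount -t glusterfs node01.infra.solutions.work:/vmstore \
    '/rhev/data-center/mnt/glusterSD/node01.infra.solutions.work:_vmstore'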
[ovirt-users] Re: Upgrading from 4.2.8 to 4.3.3 broke Node NG GlusterFS
The file handle is stale so find will display:

"find: '/rhev/data-center/mnt/glusterSD/node01.infra.solutions.work:_vmstore': Transport endpoint is not connected"

"stat /rhev/data-center/mnt/glusterSD/node01.infra.solutions.work:_vmstore" will output:

stat: cannot stat '/rhev/data-center/mnt/glusterSD/node01.infra.solutions.work:_vmstore': Transport endpoint is not connected

All nodes are peering with the other nodes:

Saiph:~ andreas$ ssh node01 gluster peer status
Number of Peers: 2

Hostname: node02.infra.solutions.work
Uuid: 87fab40a-2395-41ce-857d-0b846e078cdb
State: Peer in Cluster (Connected)

Hostname: node03.infra.solutions.work
Uuid: 49025f81-e7c1-4760-be03-f36e0f403d26
State: Peer in Cluster (Connected)

Saiph:~ andreas$ ssh node02 gluster peer status
Number of Peers: 2

Hostname: node03.infra.solutions.work
Uuid: 49025f81-e7c1-4760-be03-f36e0f403d26
State: Peer in Cluster (Disconnected)

Hostname: node01.infra.solutions.work
Uuid: f25e6bff-e5e2-465f-a33e-9148bef94633
State: Peer in Cluster (Connected)

ssh node03 gluster peer status
Number of Peers: 2

Hostname: node02.infra.solutions.work
Uuid: 87fab40a-2395-41ce-857d-0b846e078cdb
State: Peer in Cluster (Connected)

Hostname: node01.infra.solutions.work
Uuid: f25e6bff-e5e2-465f-a33e-9148bef94633
State: Peer in Cluster (Connected)
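When peers and bricks look largely healthy but a client mount keeps reporting "Transport endpoint is not connected", the FUSE client log on the affected node usually records why the mount dropped. A sketch, assuming the usual Gluster convention where the mount path (slashes replaced by dashes) becomes the log file name:

ssh node03 "tail -n 100 /var/log/glusterfs/rhev-data-center-mnt-glusterSD-node01.infra.solutions.work:_vmstore.log"
ssh node03 "grep -iE 'disconnect|quorum|not connected' /var/log/glusterfs/rhev-data-center-mnt-glusterSD-node01.infra.solutions.work:_vmstore.log"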
[ovirt-users] Re: Upgrading from 4.2.8 to 4.3.3 broke Node NG GlusterFS
Try to run a find from a working server (for example node02):

find /rhev/data-center/mnt/glusterSD/node01.infra.solutions.work:_vmstore -exec stat {} \;

Also, check if all peers see each other.

Best Regards,
Strahil Nikolov

On Wednesday, April 24, 2019, at 3:27:41 AM GMT-4, Andreas Elvers wrote:

Hi,

I am currently upgrading my oVirt setup from 4.2.8 to 4.3.3.1. The setup consists of:

Datacenter/Cluster Default: [fully upgraded to 4.3.3.1]
2 nodes (node04, node05) - NFS storage domain with self-hosted engine

Datacenter Luise:
Cluster1: 3 nodes (node01, node02, node03) - Node NG with GlusterFS - Ceph Cinder storage domain [node01 and node03 are upgraded to 4.3.3.1, node02 is on 4.2.8]
Cluster2: 1 node (node06) - only Ceph Cinder storage domain [fully upgraded to 4.3.3.1]

Problems started when upgrading Luise/Cluster1 with GlusterFS (I always waited for GlusterFS to be fully synced before proceeding to the next step):

- Upgrade node01 to 4.3.3 -> OK
- Upgrade node03 to 4.3.3.1 -> OK
- Upgrade node01 to 4.3.3.1 -> GlusterFS became unstable.

I now get the error message:

VDSM node03.infra.solutions.work command ConnectStoragePoolVDS failed: Cannot find master domain: u'spUUID=f3218bf7-6158-4b2b-b272-51cdc3280376, msdUUID=02a32017-cbe6-4407-b825-4e558b784157'

And on node03 there is a problem with Gluster:

node03#: ls -l /rhev/data-center/mnt/glusterSD/node01.infra.solutions.work:_vmstore
ls: cannot access /rhev/data-center/mnt/glusterSD/node01.infra.solutions.work:_vmstore: Transport endpoint is not connected

The directory is available on node01 and node02. The engine is reporting the brick on node03 as down. node03 and node06 are shown as NonOperational because they are not able to access the gluster storage domain. A "gluster peer status" on node01, node02, and node03 shows all peers connected.
"gluster volume heal vmstore info" shows for all nodes:

gluster volume heal vmstore info
Brick node01.infra.solutions.work:/gluster_bricks/vmstore/vmstore
Status: Transport endpoint is not connected
Number of entries: -

Brick node02.infra.solutions.work:/gluster_bricks/vmstore/vmstore
/02a32017-cbe6-4407-b825-4e558b784157/dom_md/ids
/.shard/40948f85-2212-47f9-bd5e-102a8dd632b8.66
/.shard/40948f85-2212-47f9-bd5e-102a8dd632b8.60
/02a32017-cbe6-4407-b825-4e558b784157/images/a3a10398-9698-4b73-84d9-9735448e3534/6161e310-4ad6-42d9-8117-5a89c5b2b4b6
/.shard/40948f85-2212-47f9-bd5e-102a8dd632b8.96
/.shard/d66880de-3fa1-4362-8c43-574a173c5f7d.133
/.shard/40948f85-2212-47f9-bd5e-102a8dd632b8.38
/.shard/40948f85-2212-47f9-bd5e-102a8dd632b8.67
/__DIRECT_IO_TEST__
/02a32017-cbe6-4407-b825-4e558b784157/images/493188b2-c137-4440-99ee-43a753842a7d/9aa2d139-e3bd-406b-8fe0-b189123eaa73
/.shard/40948f85-2212-47f9-bd5e-102a8dd632b8.64
/.shard/d66880de-3fa1-4362-8c43-574a173c5f7d.132
/.shard/40948f85-2212-47f9-bd5e-102a8dd632b8.44
/.shard/40948f85-2212-47f9-bd5e-102a8dd632b8.9
/.shard/40948f85-2212-47f9-bd5e-102a8dd632b8.69
/02a32017-cbe6-4407-b825-4e558b784157/images/12e647fb-20aa-4957-b659-05fa75a9215e/f7e4b2a3-ab84-4eb5-a4e7-7208ddad8156
/.shard/40948f85-2212-47f9-bd5e-102a8dd632b8.35
/.shard/40948f85-2212-47f9-bd5e-102a8dd632b8.32
/.shard/40948f85-2212-47f9-bd5e-102a8dd632b8.39
/.shard/40948f85-2212-47f9-bd5e-102a8dd632b8.34
/.shard/40948f85-2212-47f9-bd5e-102a8dd632b8.68
Status: Connected
Number of entries: 47

Brick node03.infra.solutions.work:/gluster_bricks/vmstore/vmstore
/02a32017-cbe6-4407-b825-4e558b784157/images/12e647fb-20aa-4957-b659-05fa75a9215e/f7e4b2a3-ab84-4eb5-a4e7-7208ddad8156
/.shard/d66880de-3fa1-4362-8c43-574a173c5f7d.133
/02a32017-cbe6-4407-b825-4e558b784157/images/493188b2-c137-4440-99ee-43a753842a7d/9aa2d139-e3bd-406b-8fe0-b189123eaa73
/.shard/40948f85-2212-47f9-bd5e-102a8dd632b8.44
/02a32017-cbe6-4407-b825-4e558b784157/dom_md/ids
/02a32017-cbe6-4407-b825-4e558b784157/images/a3a10398-9698-4b73-84d9-9735448e3534/6161e310-4ad6-42d9-8117-5a89c5b2b4b6
/.shard/d66880de-3fa1-4362-8c43-574a173c5f7d.132
/__DIRECT_IO_TEST__
Status: Connected
Number of entries: 47

On node03 there are several self-heal processes that seem to be doing nothing.

Oh well.. What now?

Best regards,
- Andreas
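Two follow-up checks may be worth running at this point, given the mixed 4.2.8/4.3.3.1 cluster described above. This is a sketch, not from the thread, and the exact sub-commands depend on the installed Gluster release:

# 1) confirm the three peers run compatible gluster versions and op-version
for h in node01 node02 node03; do ssh $h 'gluster --version | head -n 1'; done
gluster volume get all cluster.op-version
gluster volume get all cluster.max-op-version

# 2) once the node01 brick is reachable again, kick self-heal and watch the counters
gluster volume heal vmstore
gluster volume heal vmstore info summary
gluster volume heal vmstore statistics heal-count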
[ovirt-users] Re: Upgrading from 4.2.8 to 4.3.3 broke Node NG GlusterFS
Restarting improved things a little bit. The bricks on node03 are still shown as down, but "gluster volume status" is looking better.

Saiph:~ andreas$ ssh node01 gluster volume status vmstore
Status of volume: vmstore
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick node01.infra.solutions.work:/gluster_
bricks/vmstore/vmstore                      49157     0          Y       24543
Brick node02.infra.solutions.work:/gluster_
bricks/vmstore/vmstore                      49154     0          Y       23795
Brick node03.infra.solutions.work:/gluster_
bricks/vmstore/vmstore                      49157     0          Y       1617
Self-heal Daemon on localhost               N/A       N/A        Y       32121
Self-heal Daemon on node03.infra.solutions.
work                                        N/A       N/A        Y       25798
Self-heal Daemon on node02.infra.solutions.
work                                        N/A       N/A        Y       30879

Task Status of Volume vmstore
------------------------------------------------------------------------------
There are no active volume tasks

Saiph:~ andreas$ ssh node02 gluster volume status vmstore
Status of volume: vmstore
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick node01.infra.solutions.work:/gluster_
bricks/vmstore/vmstore                      49157     0          Y       24543
Brick node02.infra.solutions.work:/gluster_
bricks/vmstore/vmstore                      49154     0          Y       23795
Brick node03.infra.solutions.work:/gluster_
bricks/vmstore/vmstore                      49157     0          Y       1617
Self-heal Daemon on localhost               N/A       N/A        Y       30879
Self-heal Daemon on node03.infra.solutions.
work                                        N/A       N/A        Y       25798
Self-heal Daemon on node01.infra.solutions.
work                                        N/A       N/A        Y       32121

Task Status of Volume vmstore
------------------------------------------------------------------------------
There are no active volume tasks

Saiph:~ andreas$ ssh node03 gluster volume status vmstore
Status of volume: vmstore
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick node01.infra.solutions.work:/gluster_
bricks/vmstore/vmstore                      49157     0          Y       24543
Brick node02.infra.solutions.work:/gluster_
bricks/vmstore/vmstore                      49154     0          Y       23795
Brick node03.infra.solutions.work:/gluster_
bricks/vmstore/vmstore                      49157     0          Y       1617
Self-heal Daemon on localhost               N/A       N/A        Y       25798
Self-heal Daemon on node01.infra.solutions.
work                                        N/A       N/A        Y       32121
Self-heal Daemon on node02.infra.solutions.
work                                        N/A       N/A        Y       30879

Task Status of Volume vmstore
------------------------------------------------------------------------------
There are no active volume tasks
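Since all three brick processes report Online "Y" here while the engine still flags the node03 bricks as down, a couple of follow-ups may help. This is a sketch and not from the thread; "start ... force" only respawns brick processes that glusterd considers down, but check the documentation for your Gluster release before running it on a production volume:

gluster volume status vmstore detail    # per-brick health, free space, inode counts
gluster volume start vmstore force      # respawn any brick glusterd still thinks is offline

The engine learns brick status through vdsm on the hosts, so a stale "brick down" report can also persist until the affected host is re-activated and re-synced.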
[ovirt-users] Re: Upgrading from 4.2.8 to 4.3.3 broke Node NG GlusterFS
"systemctl restart glusterd" on node03 did not help. Still getting: node03#: ls -l /rhev/data-center/mnt/glusterSD/node01.infra.solutions.work:_vmstore ls: cannot access /rhev/data-center/mnt/glusterSD/node01.infra.solutions.work:_vmstore: Transport endpoint is not connected Engine still shows bricks on node03 as down. ___ Users mailing list -- users@ovirt.org To unsubscribe send an email to users-le...@ovirt.org Privacy Statement: https://www.ovirt.org/site/privacy-policy/ oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/2JMM4UBZ54TNHCFUFYDX2OOVCKEMXFBX/