[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-11 Thread Strahil Nikolov via Users
Just move it away (to be on the safe side) and trigger a full heal.

Best Regards,
Strahil Nikolov

On Wednesday, March 10, 2021, 13:01:21 GMT+2, Maria Souvalioti wrote:

Should I delete the file and restart glusterd on the ov-no1 server?

Thank you very much

On 3/10/21 10:21 AM, Strahil Nikolov via Users wrote:

It seems to me that ov-no1 didn't update the file properly.

What was the output of the gluster volume heal command ?

Best Regards,
Strahil Nikolov

>
>  The output of the getfattr command on the nodes was the following:
> 
> Node1:
> [root@ov-no1 ~]# getfattr -d -m . -e hex 
> /gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> getfattr: Removing leading '/' from absolute path names
> # file: 
> gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
> trusted.afr.dirty=0x0394
> trusted.afr.engine-client-2=0x
> trusted.gfid=0x3fafabf3d0cd4b9a8dd743145451f7cf
> trusted.gfid2path.06f4f1065c7ed193=0x36313936323032302d386431342d343261372d613565332d3233346365656635343035632f61343835353566342d626532332d343436372d386135342d343030616537626166396437
> trusted.glusterfs.mdata=0x015fec62872f5849585fec62872f5849585d791c1a00ba286e
> trusted.glusterfs.shard.block-size=0x0400
> trusted.glusterfs.shard.file-size=0x00190092040b
> 
> 
> Node2:
> [root@ov-no2 ~]#  getfattr -d -m . -e hex 
> /gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> getfattr: Removing leading '/' from absolute path names
> # file: 
> gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
> trusted.afr.dirty=0x
> trusted.afr.engine-client-0=0x043a
> trusted.afr.engine-client-2=0x
> trusted.gfid=0x3fafabf3d0cd4b9a8dd743145451f7cf
> trusted.gfid2path.06f4f1065c7ed193=0x36313936323032302d386431342d343261372d613565332d3233346365656635343035632f61343835353566342d626532332d343436372d386135342d343030616537626166396437
> trusted.glusterfs.mdata=0x015fec62872f5849585fec62872f5849585d791c1a00ba286e
> trusted.glusterfs.shard.block-size=0x0400
> trusted.glusterfs.shard.file-size=0x00190092040b
> 
> 
> Node3:
> [root@ov-no3 ~]#  getfattr -d -m . -e hex 
> /gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> getfattr: Removing leading '/' from absolute path names
> # file: 
> gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
> trusted.afr.dirty=0x
> trusted.afr.engine-client-0=0x0444
> trusted.gfid=0x3fafabf3d0cd4b9a8dd743145451f7cf
> trusted.gfid2path.06f4f1065c7ed193=0x36313936323032302d386431342d343261372d613565332d3233346365656635343035632f61343835353566342d626532332d343436372d386135342d343030616537626166396437
> trusted.glusterfs.mdata=0x015fec62872f5849585fec62872f5849585d791c1a00ba286e
> trusted.glusterfs.shard.block-size=0x0400
> trusted.glusterfs.shard.file-size=0x00190092040b
>  
> 
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives: 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/PUVBESAIZEJ7URDMDQ7LDUPNS6YDBVAS/
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/R3ODLVEODDFWP3IVLPFNQXNLBCPPSZTR/

[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-11 Thread Strahil Nikolov via Users
It seems that the affected file can be moved away on ov-no1.ariadne-t.local, as
the other 2 bricks "blame" the entry on ov-no1.ariadne-t.local.
After that, you will need to run "gluster volume heal <volname> full" to
trigger the heal.
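
For reference, a possible sequence on ov-no1 would be the sketch below (it uses
the brick path and gfid quoted earlier in this thread; the .glusterfs hardlink
path is an assumption derived from the trusted.gfid value, so verify it first
and keep the moved copy until the heal has completed):

    # on ov-no1.ariadne-t.local, work on the brick directly, not on the FUSE mount
    mkdir -p /root/brick-backup
    mv /gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7 /root/brick-backup/
    # remove the matching gfid hardlink so heal can recreate the file on this brick
    rm -f /gluster_bricks/engine/engine/.glusterfs/3f/af/3fafabf3-d0cd-4b9a-8dd7-43145451f7cf
    # trigger the full heal and follow its progress
    gluster volume heal engine full
    gluster volume heal engine info summary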

Best Regards,
Strahil Nikolov

On Wednesday, March 10, 2021, 12:58:10 GMT+2, Maria Souvalioti wrote:

The gluster volume heal engine command didn't output anything in the CLI.

The gluster volume heal engine info gives:

# gluster volume heal engine info
Brick ov-no1.ariadne-t.local:/gluster_bricks/engine/engine
Status: Connected
Number of entries: 0

Brick ov-no2.ariadne-t.local:/gluster_bricks/engine/engine
/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
 
Status: Connected
Number of entries: 1

Brick ov-no3.ariadne-t.local:/gluster_bricks/engine/engine
/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
 
Status: Connected
Number of entries: 1   





And gluster volume heal engine info summary gives:

# gluster volume heal engine info summary
Brick ov-no1.ariadne-t.local:/gluster_bricks/engine/engine
Status: Connected
Total Number of entries: 1
Number of entries in heal pending: 1
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick ov-no2.ariadne-t.local:/gluster_bricks/engine/engine
Status: Connected
Total Number of entries: 1
Number of entries in heal pending: 1
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick ov-no3.ariadne-t.local:/gluster_bricks/engine/engine
Status: Connected
Total Number of entries: 1
Number of entries in heal pending: 1
Number of entries in split-brain: 0
Number of entries possibly healing: 0





Also I found the following warning message in the logs that has been repeating 
itself since the problem started:

[2021-03-10 10:08:11.646824] W [MSGID: 114061] 
[client-common.c:2644:client_pre_fsync_v2] 0-engine-client-0:  
(3fafabf3-d0cd-4b9a-8dd7-43145451f7cf) remote_fd is -1. EBADFD [File descriptor 
in bad state]




And from what I see in the logs, the healing process seems to be still trying 
to fix the volume. 





[2021-03-10 10:47:34.820229] I [MSGID: 108026] 
[afr-self-heal-common.c:1741:afr_log_selfheal] 0-engine-replicate-0: Completed 
data selfheal on 3fafabf3-d0cd-4b9a-8dd7-43145451f7cf. sources=1 [2]  sinks=0 
The message "I [MSGID: 108026] [afr-self-heal-common.c:1741:afr_log_selfheal] 
0-engine-replicate-0: Completed data selfheal on 
3fafabf3-d0cd-4b9a-8dd7-43145451f7cf. sources=1 [2]  sinks=0 " repeated 8 times 
between [2021-03-10 10:47:34.820229] and [2021-03-10 10:48:00.088805]

On 3/10/21 10:21 AM, Strahil Nikolov via Users wrote:


It seems to me that ov-no1 didn't update the file properly.

What was the output of the gluster volume heal command ?

Best Regards,
Strahil Nikolov

>
>  The output of the getfattr command on the nodes was the following:
> 
> Node1:
> [root@ov-no1 ~]# getfattr -d -m . -e hex 
> /gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> getfattr: Removing leading '/' from absolute path names
> # file: 
> gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
> trusted.afr.dirty=0x0394
> trusted.afr.engine-client-2=0x
> trusted.gfid=0x3fafabf3d0cd4b9a8dd743145451f7cf
> trusted.gfid2path.06f4f1065c7ed193=0x36313936323032302d386431342d343261372d613565332d3233346365656635343035632f61343835353566342d626532332d343436372d386135342d343030616537626166396437
> trusted.glusterfs.mdata=0x015fec62872f5849585fec62872f5849585d791c1a00ba286e
> trusted.glusterfs.shard.block-size=0x0400
> trusted.glusterfs.shard.file-size=0x00190092040b
> 
> 
> Node2:
> [root@ov-no2 ~]#  getfattr -d -m . -e hex 
> /gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> getfattr: Removing leading '/' from absolute path names
> # file: 
> gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> 

[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-10 Thread Maria Souvalioti
Should I delete the file and restart glusterd on the ov-no1 server?


Thank you very much


On 3/10/21 10:21 AM, Strahil Nikolov via Users wrote:
> It seems to me that ov-no1 didn't update the file properly.
>
> What was the output of the gluster volume heal command ?
>
> Best Regards,
> Strahil Nikolov
>
> The output of the getfattr command on the nodes was the following:
>
> Node1:
> [root@ov-no1  ~]# getfattr -d -m . -e hex
> 
> /gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> getfattr: Removing leading '/' from absolute path names
> # file:
> 
> gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> 
> security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
> trusted.afr.dirty=0x0394
> trusted.afr.engine-client-2=0x
> trusted.gfid=0x3fafabf3d0cd4b9a8dd743145451f7cf
> 
> trusted.gfid2path.06f4f1065c7ed193=0x36313936323032302d386431342d343261372d613565332d3233346365656635343035632f61343835353566342d626532332d343436372d386135342d343030616537626166396437
> 
> trusted.glusterfs.mdata=0x015fec62872f5849585fec62872f5849585d791c1a00ba286e
> trusted.glusterfs.shard.block-size=0x0400
> 
> trusted.glusterfs.shard.file-size=0x00190092040b
>
>
> Node2:
> [root@ov-no2  ~]#  getfattr -d -m . -e hex
> 
> /gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> getfattr: Removing leading '/' from absolute path names
> # file:
> 
> gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> 
> security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
> trusted.afr.dirty=0x
> trusted.afr.engine-client-0=0x043a
> trusted.afr.engine-client-2=0x
> trusted.gfid=0x3fafabf3d0cd4b9a8dd743145451f7cf
> 
> trusted.gfid2path.06f4f1065c7ed193=0x36313936323032302d386431342d343261372d613565332d3233346365656635343035632f61343835353566342d626532332d343436372d386135342d343030616537626166396437
> 
> trusted.glusterfs.mdata=0x015fec62872f5849585fec62872f5849585d791c1a00ba286e
> trusted.glusterfs.shard.block-size=0x0400
> 
> trusted.glusterfs.shard.file-size=0x00190092040b
>
>
> Node3:
> [root@ov-no3  ~]#  getfattr -d -m . -e hex
> 
> /gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> getfattr: Removing leading '/' from absolute path names
> # file:
> 
> gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> 
> security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
> trusted.afr.dirty=0x
> trusted.afr.engine-client-0=0x0444
> trusted.gfid=0x3fafabf3d0cd4b9a8dd743145451f7cf
> 
> trusted.gfid2path.06f4f1065c7ed193=0x36313936323032302d386431342d343261372d613565332d3233346365656635343035632f61343835353566342d626532332d343436372d386135342d343030616537626166396437
> 
> trusted.glusterfs.mdata=0x015fec62872f5849585fec62872f5849585d791c1a00ba286e
> trusted.glusterfs.shard.block-size=0x0400
> 
> trusted.glusterfs.shard.file-size=0x00190092040b
>
>
> ___
> Users mailing list -- users@ovirt.org 
> To unsubscribe send an email to users-le...@ovirt.org
> 
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> 
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/PUVBESAIZEJ7URDMDQ7LDUPNS6YDBVAS/
>
>
>
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct: 
> 

[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-10 Thread Maria Souvalioti
The gluster volume heal engine command didn't output anything in the CLI.


The gluster volume heal engine info gives:


# gluster volume heal engine info
Brick ov-no1.ariadne-t.local:/gluster_bricks/engine/engine
Status: Connected
Number of entries: 0

Brick ov-no2.ariadne-t.local:/gluster_bricks/engine/engine
/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7

Status: Connected
Number of entries: 1

Brick ov-no3.ariadne-t.local:/gluster_bricks/engine/engine
/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7

Status: Connected
Number of entries: 1  


And gluster volume heal engine info summary gives:

# gluster volume heal engine info summary
Brick ov-no1.ariadne-t.local:/gluster_bricks/engine/engine
Status: Connected
Total Number of entries: 1
Number of entries in heal pending: 1
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick ov-no2.ariadne-t.local:/gluster_bricks/engine/engine
Status: Connected
Total Number of entries: 1
Number of entries in heal pending: 1
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick ov-no3.ariadne-t.local:/gluster_bricks/engine/engine
Status: Connected
Total Number of entries: 1
Number of entries in heal pending: 1
Number of entries in split-brain: 0
Number of entries possibly healing: 0


Also I found the following warning message in the logs that has been
repeating itself since the problem started:

[2021-03-10 10:08:11.646824] W [MSGID: 114061]
[client-common.c:2644:client_pre_fsync_v2] 0-engine-client-0: 
(3fafabf3-d0cd-4b9a-8dd7-43145451f7cf) remote_fd is -1. EBADFD [File
descriptor in bad state]


And from what I see in the logs, the healing process seems to be still
trying to fix the volume.


[2021-03-10 10:47:34.820229] I [MSGID: 108026]
[afr-self-heal-common.c:1741:afr_log_selfheal] 0-engine-replicate-0:
Completed data selfheal on 3fafabf3-d0cd-4b9a-8dd7-43145451f7cf.
sources=1 [2]  sinks=0
The message "I [MSGID: 108026]
[afr-self-heal-common.c:1741:afr_log_selfheal] 0-engine-replicate-0:
Completed data selfheal on 3fafabf3-d0cd-4b9a-8dd7-43145451f7cf.
sources=1 [2]  sinks=0 " repeated 8 times between [2021-03-10
10:47:34.820229] and [2021-03-10 10:48:00.088805]



On 3/10/21 10:21 AM, Strahil Nikolov via Users wrote:
> It seems to me that ov-no1 didn't update the file properly.
>
> What was the output of the gluster volume heal command ?
>
> Best Regards,
> Strahil Nikolov
>
> The output of the getfattr command on the nodes was the following:
>
> Node1:
> [root@ov-no1  ~]# getfattr -d -m . -e hex
> 
> /gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> getfattr: Removing leading '/' from absolute path names
> # file:
> 
> gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> 
> security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
> trusted.afr.dirty=0x0394
> trusted.afr.engine-client-2=0x
> trusted.gfid=0x3fafabf3d0cd4b9a8dd743145451f7cf
> 
> trusted.gfid2path.06f4f1065c7ed193=0x36313936323032302d386431342d343261372d613565332d3233346365656635343035632f61343835353566342d626532332d343436372d386135342d343030616537626166396437
> 
> trusted.glusterfs.mdata=0x015fec62872f5849585fec62872f5849585d791c1a00ba286e
> trusted.glusterfs.shard.block-size=0x0400
> 
> trusted.glusterfs.shard.file-size=0x00190092040b
>
>
> Node2:
> [root@ov-no2  ~]#  getfattr -d -m . -e hex
> 
> /gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> getfattr: Removing leading '/' from absolute path names
> # file:
> 
> gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> 
> security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
> trusted.afr.dirty=0x
> trusted.afr.engine-client-0=0x043a
> trusted.afr.engine-client-2=0x
> trusted.gfid=0x3fafabf3d0cd4b9a8dd743145451f7cf
> 
> 

[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-10 Thread Strahil Nikolov via Users
It seems to me that ov-no1 didn't update the file properly.
What was the output of the gluster volume heal command ?
Best Regards,
Strahil Nikolov

The output of the getfattr command on the nodes was the following:

Node1:
[root@ov-no1 ~]# getfattr -d -m . -e hex 
/gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
getfattr: Removing leading '/' from absolute path names
# file: 
gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x0394
trusted.afr.engine-client-2=0x
trusted.gfid=0x3fafabf3d0cd4b9a8dd743145451f7cf
trusted.gfid2path.06f4f1065c7ed193=0x36313936323032302d386431342d343261372d613565332d3233346365656635343035632f61343835353566342d626532332d343436372d386135342d343030616537626166396437
trusted.glusterfs.mdata=0x015fec62872f5849585fec62872f5849585d791c1a00ba286e
trusted.glusterfs.shard.block-size=0x0400
trusted.glusterfs.shard.file-size=0x00190092040b


Node2:
[root@ov-no2 ~]#  getfattr -d -m . -e hex 
/gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
getfattr: Removing leading '/' from absolute path names
# file: 
gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x
trusted.afr.engine-client-0=0x043a
trusted.afr.engine-client-2=0x
trusted.gfid=0x3fafabf3d0cd4b9a8dd743145451f7cf
trusted.gfid2path.06f4f1065c7ed193=0x36313936323032302d386431342d343261372d613565332d3233346365656635343035632f61343835353566342d626532332d343436372d386135342d343030616537626166396437
trusted.glusterfs.mdata=0x015fec62872f5849585fec62872f5849585d791c1a00ba286e
trusted.glusterfs.shard.block-size=0x0400
trusted.glusterfs.shard.file-size=0x00190092040b


Node3:
[root@ov-no3 ~]#  getfattr -d -m . -e hex 
/gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
getfattr: Removing leading '/' from absolute path names
# file: 
gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x
trusted.afr.engine-client-0=0x0444
trusted.gfid=0x3fafabf3d0cd4b9a8dd743145451f7cf
trusted.gfid2path.06f4f1065c7ed193=0x36313936323032302d386431342d343261372d613565332d3233346365656635343035632f61343835353566342d626532332d343436372d386135342d343030616537626166396437
trusted.glusterfs.mdata=0x015fec62872f5849585fec62872f5849585d791c1a00ba286e
trusted.glusterfs.shard.block-size=0x0400
trusted.glusterfs.shard.file-size=0x00190092040b
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/PUVBESAIZEJ7URDMDQ7LDUPNS6YDBVAS/
  
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/R3ODLVEODDFWP3IVLPFNQXNLBCPPSZTR/


[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-09 Thread souvaliotimaria
The output of the getfattr command on the nodes was the following:

Node1:
[root@ov-no1 ~]# getfattr -d -m . -e hex 
/gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
getfattr: Removing leading '/' from absolute path names
# file: 
gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x0394
trusted.afr.engine-client-2=0x
trusted.gfid=0x3fafabf3d0cd4b9a8dd743145451f7cf
trusted.gfid2path.06f4f1065c7ed193=0x36313936323032302d386431342d343261372d613565332d3233346365656635343035632f61343835353566342d626532332d343436372d386135342d343030616537626166396437
trusted.glusterfs.mdata=0x015fec62872f5849585fec62872f5849585d791c1a00ba286e
trusted.glusterfs.shard.block-size=0x0400
trusted.glusterfs.shard.file-size=0x00190092040b


Node2:
[root@ov-no2 ~]#  getfattr -d -m . -e hex 
/gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
getfattr: Removing leading '/' from absolute path names
# file: 
gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x
trusted.afr.engine-client-0=0x043a
trusted.afr.engine-client-2=0x
trusted.gfid=0x3fafabf3d0cd4b9a8dd743145451f7cf
trusted.gfid2path.06f4f1065c7ed193=0x36313936323032302d386431342d343261372d613565332d3233346365656635343035632f61343835353566342d626532332d343436372d386135342d343030616537626166396437
trusted.glusterfs.mdata=0x015fec62872f5849585fec62872f5849585d791c1a00ba286e
trusted.glusterfs.shard.block-size=0x0400
trusted.glusterfs.shard.file-size=0x00190092040b


Node3:
[root@ov-no3 ~]#  getfattr -d -m . -e hex 
/gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
getfattr: Removing leading '/' from absolute path names
# file: 
gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x
trusted.afr.engine-client-0=0x0444
trusted.gfid=0x3fafabf3d0cd4b9a8dd743145451f7cf
trusted.gfid2path.06f4f1065c7ed193=0x36313936323032302d386431342d343261372d613565332d3233346365656635343035632f61343835353566342d626532332d343436372d386135342d343030616537626166396437
trusted.glusterfs.mdata=0x015fec62872f5849585fec62872f5849585d791c1a00ba286e
trusted.glusterfs.shard.block-size=0x0400
trusted.glusterfs.shard.file-size=0x00190092040b
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/PUVBESAIZEJ7URDMDQ7LDUPNS6YDBVAS/


[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-09 Thread Maria Souvalioti

Sorry, I ran the getfattr command incorrectly.

I ran it again as

getfattr -d -m . -e hex
/gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7

on each node and I got different results on the following attributes:


- trusted.afr.dirty
  It is 0x0394 on node 1, and 0x on the other two.

- trusted.afr.engine-client-0
  It is 0x043a on node 2 and 3, but node 1 doesn't have it at all.

- trusted.afr.engine-client-2
  It is 0x on node 1 and 0x0444 on node 2.
  Node 3 doesn't have this entry at all.
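
For reference, one way to pull just the AFR changelog attributes from all three
bricks side by side (a sketch only; it assumes root ssh access to the hosts and
uses the brick path quoted above):

    f=/gluster_bricks/engine/engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
    for h in ov-no1.ariadne-t.local ov-no2.ariadne-t.local ov-no3.ariadne-t.local; do
        echo "== $h =="
        # keep only the trusted.afr.* lines, which is where the bricks disagree
        ssh root@"$h" "getfattr -d -m . -e hex $f 2>/dev/null" | grep trusted.afr
    done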


Hope this helps.

Thanks for your help



On 3/9/2021 9:11 PM, Strahil Nikolov via Users wrote:

The output of the command seems quite weird: 'getfattr -d -m . -e hex file'
Is it the same on all nodes ?

Best Regards,
Strahil Nikolov

On Tue, Mar 9, 2021 at 15:36, Maria Souvalioti
 wrote:
___
Users mailing list -- users@ovirt.org 
To unsubscribe send an email to users-le...@ovirt.org

Privacy Statement: https://www.ovirt.org/privacy-policy.html

oVirt Code of Conduct:
https://www.ovirt.org/community/about/community-guidelines/

List Archives:

https://lists.ovirt.org/archives/list/users@ovirt.org/message/OHK2ZRG5OESS3OGFSBQTZ66B5HF5X6G3/




___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/MBG4A2DTXL5HW3REBHITRHKONVK6XZLW/
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/2E4W2D5LYGXZH4YBLRUY6CSKYLVFELJG/


[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-09 Thread Strahil Nikolov via Users
The output of the command seems quite weird: 'getfattr -d -m . -e hex file'
Is it the same on all nodes ?
Best Regards,
Strahil Nikolov

On Tue, Mar 9, 2021 at 15:36, Maria Souvalioti wrote:
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/OHK2ZRG5OESS3OGFSBQTZ66B5HF5X6G3/
  
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/MBG4A2DTXL5HW3REBHITRHKONVK6XZLW/


[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-09 Thread Maria Souvalioti

The command getfattr -n replica.split-brain-status gives the following:

[root@ov-no1 ~]# getfattr -n replica.split-brain-status
/rhev/data-center/mnt/glusterSD/ov-no1.ariadne-t.local\:_engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
getfattr: Removing leading '/' from absolute path names
# file:
rhev/data-center/mnt/glusterSD/ov-no1.ariadne-t.local:_engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
replica.split-brain-status="The file is not under data or metadata
split-brain"

And the getfattr -d -m . -e hex command gives:

[root@ov-no1 ~]# getfattr -d -m . -e hex
/rhev/data-center/mnt/glusterSD/ov-no1.ariadne-t.local\:_engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
getfattr: Removing leading '/' from absolute path names
# file:
rhev/data-center/mnt/glusterSD/ov-no1.ariadne-t.local:_engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
security.selinux=0x73797374656d5f753a6f626a6563745f723a6675736566735f743a733000

Also, from what I can tell, in the GUI the brick seems to still be in
the healing process (since I ran the dd command yesterday), as the
counters in the self-heal info field change over time.

Thank you for your help


On 3/9/2021 7:33 AM, Strahil Nikolov via Users wrote:

Also check the status of the file on each brick with the getfattr
command (
see https://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/
) and provide the output.

Best Regards,
Strahil Nikolov

Thank you for your reply.
I'm trying that right now and I see it triggered the self-healing
process.
I will come back with an update.
Best regards.

___
Users mailing list -- users@ovirt.org 
To unsubscribe send an email to users-le...@ovirt.org

Privacy Statement: https://www.ovirt.org/privacy-policy.html

oVirt Code of Conduct:
https://www.ovirt.org/community/about/community-guidelines/

List Archives:

https://lists.ovirt.org/archives/list/users@ovirt.org/message/WKW4RAVHVOZN6CZVK2TOC7727DHLKWRZ/




___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/OHK2ZRG5OESS3OGFSBQTZ66B5HF5X6G3/


[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-08 Thread Strahil Nikolov via Users
Also check the status of the file on each brick with the getfattr command ( see 
https://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/ ) and 
provide the output.
Best Regards,
Strahil Nikolov

Thank you for your reply.
I'm trying that right now and I see it triggered the self-healing process. 
I will come back with an update.
Best regards.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/WKW4RAVHVOZN6CZVK2TOC7727DHLKWRZ/
  
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/BENORJHFCW3XOX5ZP6ZJFQDXE2NPZGAI/


[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-08 Thread souvaliotimaria
Thank you for your reply.
I'm trying that right now and I see it triggered the self-healing process. 
I will come back with an update.
Best regards.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/WKW4RAVHVOZN6CZVK2TOC7727DHLKWRZ/


[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-08 Thread souvaliotimaria
Thank you. 
I have tried that and it didn't work as the system sees that the file is not in 
split-brain.
I have also tried force heal and full heal and still nothing. I always end up 
with the entry being stuck in unsynched stage.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/W5AJ4PKEK36NZEIAPTX3UQD6P7EZM7EL/


[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-05 Thread Strahil Nikolov via Users
If it's a VM image, just use dd to read the whole file:
dd if=VM_image of=/dev/null bs=10M status=progress
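
For example, against the HostedEngine image through the FUSE mount shown
earlier in this thread (the mount path is taken from a previous message and may
differ on your host):

    dd if=/rhev/data-center/mnt/glusterSD/ov-no1.ariadne-t.local:_engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7 \
        of=/dev/null bs=10M status=progress

Reading the image through the mount makes the client touch every shard, which
can kick off heal-on-access for the pieces that are still pending.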
Best Regards,
Strahil Nikolov

On Fri, Mar 5, 2021 at 15:48, Alex K wrote:
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/RJO7EVEW2C3P7EYTAIXZVIC7JBSEXM3C/
  
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/7Y6SXCZDIYQH3MST72CX5FCGZW5QQKMR/


[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-05 Thread Alex K
On Thu, Mar 4, 2021 at 8:59 PM  wrote:

> Hello again,
> I've tried to heal the brick with latest-mtime, but I get the following:
>
> gluster volume heal engine split-brain latest-mtime
> /80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> Healing
> /80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
> failed: File not in split-brain.
> Volume heal failed.
>
You can try to run ls in the directory where the file whose healing is
pending resides. This might trigger the healing process for that file.
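
A minimal sketch of that, using the mount path quoted elsewhere in this thread
(adjust it to your own mount point):

    cd /rhev/data-center/mnt/glusterSD/ov-no1.ariadne-t.local:_engine/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2
    ls -l
    stat a48555f4-be23-4467-8a54-400ae7baf9d7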


> Should I try the solution described in this question, where we manually
> remove the conflicting entry, triggering the heal operations?
>
> https://lists.ovirt.org/archives/list/users@ovirt.org/thread/RPYIMSQCBYVQ654HYGBN5NCPRVCGRRYB/#H6EBSPL5XRLBUVZBE7DGSY25YFPIR2KY
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/CCRNM7N3FSUYXDHFP2XDMGAMKSHBMJQQ/
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/RJO7EVEW2C3P7EYTAIXZVIC7JBSEXM3C/


[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-04 Thread souvaliotimaria
Hello again, 
I've tried to heal the brick with latest-mtime, but I get the following:

gluster volume heal engine split-brain latest-mtime 
/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
Healing 
/80f6e393-9718-4738-a14a-64cf43c3d8c2/images/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7
 failed: File not in split-brain.
Volume heal failed.

Should I try the solution described in this question, where we manually remove 
the conflicting entry, triggering the heal operations? 
https://lists.ovirt.org/archives/list/users@ovirt.org/thread/RPYIMSQCBYVQ654HYGBN5NCPRVCGRRYB/#H6EBSPL5XRLBUVZBE7DGSY25YFPIR2KY
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/CCRNM7N3FSUYXDHFP2XDMGAMKSHBMJQQ/


[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-04 Thread souvaliotimaria
I tried only the simple heal because I wasn't sure if I'd mess up the gluster
more than it already is.
I will try latest-mtime in a couple of hours because the system is a production 
system and I have to do it after office hours. I will come back with an update.
Thank you very much for your help!
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/YHI63SPSJG6MNAI6737LZXS5ZG5UPXAG/


[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-03 Thread Alex K
On Wed, Mar 3, 2021, 19:13  wrote:

> Hello,
>
> Thank you very much for your reply.
>
> I get the following from the below gluster commands:
>
> [root@ov-no1 ~]# gluster volume heal engine info split-brain
> Brick ov-no1.ariadne-t.local:/gluster_bricks/engine/engine
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick ov-no2.ariadne-t.local:/gluster_bricks/engine/engine
> Status: Connected
> Number of entries in split-brain: 0
>
> Brick ov-no3.ariadne-t.local:/gluster_bricks/engine/engine
> Status: Connected
> Number of entries in split-brain: 0
>
>
> [root@ov-no1 ~]# gluster volume heal engine info summary
> Brick ov-no1.ariadne-t.local:/gluster_bricks/engine/engine
> Status: Connected
> Total Number of entries: 1
> Number of entries in heal pending: 1
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> Brick ov-no2.ariadne-t.local:/gluster_bricks/engine/engine
> Status: Connected
> Total Number of entries: 1
> Number of entries in heal pending: 1
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
> Brick ov-no3.ariadne-t.local:/gluster_bricks/engine/engine
> Status: Connected
> Total Number of entries: 1
> Number of entries in heal pending: 1
> Number of entries in split-brain: 0
> Number of entries possibly healing: 0
>
>
> [root@ov-no1 ~]# gluster volume info
> Volume Name: data
> Type: Replicate
> Volume ID: 6c7bb2e4-ed35-4826-81f6-34fcd2d0a984
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1:
> ov-no1.ariadne-t.local:/gluster_bricks/data/data
> Brick2:
> ov-no2.ariadne-t.local:/gluster_bricks/data/data
> Brick3:
> ov-no3.ariadne-t.local:/gluster_bricks/data/data (arbiter)
> Options Reconfigured:
> performance.client-io-threads: on
> nfs.disable: on
> transport.address-family: inet
> performance.strict-o-direct: on
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.low-prio-threads: 32
> network.remote-dio: off
> cluster.eager-lock: enable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-max-threads: 8
> cluster.shd-wait-qlength: 1
> features.shard: on
> user.cifs: off
> cluster.choose-local: off
> client.event-threads: 4
> server.event-threads: 4
> network.ping-timeout: 30
> storage.owner-uid: 36
> storage.owner-gid: 36
> cluster.granular-entry-heal: enable
>
> Volume Name: engine
> Type: Replicate
> Volume ID: 7173c827-309f-4e84-a0da-6b2b8eb50264
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1:
> ov-no1.ariadne-t.local:/gluster_bricks/engine/engine
> Brick2:
> ov-no2.ariadne-t.local:/gluster_bricks/engine/engine
> Brick3:
> ov-no3.ariadne-t.local:/gluster_bricks/engine/engine
> Options Reconfigured:
> performance.client-io-threads: on
> nfs.disable: on
> transport.address-family: inet
> performance.strict-o-direct: on
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.low-prio-threads: 32
> 

[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-03 Thread souvaliotimaria
Hello,

Thank you very much for your reply.

I get the following from the below gluster commands:

[root@ov-no1 ~]# gluster volume heal engine info split-brain
Brick ov-no1.ariadne-t.local:/gluster_bricks/engine/engine
Status: Connected
Number of entries in split-brain: 0

Brick ov-no2.ariadne-t.local:/gluster_bricks/engine/engine
Status: Connected
Number of entries in split-brain: 0

Brick ov-no3.ariadne-t.local:/gluster_bricks/engine/engine
Status: Connected
Number of entries in split-brain: 0


[root@ov-no1 ~]# gluster volume heal engine info summary
Brick ov-no1.ariadne-t.local:/gluster_bricks/engine/engine
Status: Connected
Total Number of entries: 1
Number of entries in heal pending: 1
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick ov-no2.ariadne-t.local:/gluster_bricks/engine/engine
Status: Connected
Total Number of entries: 1
Number of entries in heal pending: 1
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick ov-no3.ariadne-t.local:/gluster_bricks/engine/engine
Status: Connected
Total Number of entries: 1
Number of entries in heal pending: 1
Number of entries in split-brain: 0
Number of entries possibly healing: 0


[root@ov-no1 ~]# gluster volume info
Volume Name: data
Type: Replicate
Volume ID: 6c7bb2e4-ed35-4826-81f6-34fcd2d0a984
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: ov-no1.ariadne-t.local:/gluster_bricks/data/data
Brick2: ov-no2.ariadne-t.local:/gluster_bricks/data/data
Brick3: 
ov-no3.ariadne-t.local:/gluster_bricks/data/data (arbiter)
Options Reconfigured:
performance.client-io-threads: on
nfs.disable: on
transport.address-family: inet
performance.strict-o-direct: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.low-prio-threads: 32
network.remote-dio: off
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 1
features.shard: on
user.cifs: off
cluster.choose-local: off
client.event-threads: 4
server.event-threads: 4
network.ping-timeout: 30
storage.owner-uid: 36
storage.owner-gid: 36
cluster.granular-entry-heal: enable

Volume Name: engine
Type: Replicate
Volume ID: 7173c827-309f-4e84-a0da-6b2b8eb50264
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 
ov-no1.ariadne-t.local:/gluster_bricks/engine/engine
Brick2: 
ov-no2.ariadne-t.local:/gluster_bricks/engine/engine
Brick3: 
ov-no3.ariadne-t.local:/gluster_bricks/engine/engine
Options Reconfigured:
performance.client-io-threads: on
nfs.disable: on
transport.address-family: inet
performance.strict-o-direct: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.low-prio-threads: 32
network.remote-dio: off
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: 

[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-02 Thread Alex K
On Mon, Mar 1, 2021, 15:20  wrote:

> Hello again,
>
> I am back with a brief description of the situation I am in, and questions
> about the recovery.
>
> oVirt environment: 4.3.5.2 Hyperconverged
> GlusterFS: Replica 2 + Arbiter 1
> GlusterFS volumes: data, engine, vmstore
>
> The current situation is the following:
>
> - The Cluster is in Global Maintenance.
>
> - The volume engine is up with comment (in the Web GUI) : Up, unsynched
> entries, needs healing.
>
> - The VM HostedEngine is paused due to a storage I/O error (Web GUI) while
> the output of virsh list --all command shows that the HostedEngine is
> running.
>
> I tried to issue the gluster heal command (gluster volume heal engine) but
> nothing changed.
>
> I have the following questions:
>
> 1. Should I restart the glusterd service? Where from? Is it enough if the
> glusterd is restarted on one host or should it be restarted on the other
> two as well?
>
It sounds like a gluster split-brain. I would start from there. Can you check
the status by listing the split-brain entries?
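
For example, something along these lines, per volume (the volume names are the
ones mentioned earlier in the thread):

    gluster volume heal engine info split-brain
    gluster volume heal engine info summary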

>
> 2. Should the node that was NonResponsive and came back, be rebooted or
> not? It seems alright now and in good health.
>
> 3. Should the HostedEngine be restored with engine-backup or is it not
> necessary?
>
> 4. Could the loss of the DNS server for the oVirt hosts lead to an
> unresponsive host?
> The nsswitch file on the ovirt hosts and engine, has the DNS defined as:
> hosts:  files dns myhostname
>
If you have opted for DNS liveness checks, it could be.

>
> 5. How can we recover/rectify the situation above?
>
I would start by checking for gluster split-brains and ensuring that all hosts
have connectivity on the storage domain network (ping, jumbo frames if
enabled). 99% of my similar issues have been caused by gluster splits.
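
A quick check along those lines, assuming a 9000-byte MTU on the storage
network (substitute your peer's storage address):

    # 8972 = 9000 minus the 28 bytes of IP + ICMP headers; -M do forbids fragmentation
    ping -c 3 -M do -s 8972 ov-no2.ariadne-t.local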

The fact that the engine is shown as paused and that you can still access the
web UI makes me think you have a split-brain issue.

>
> Thanks for your help,
> Maria Souvalioti
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/GO6S6GXRJWYZN5NZ5IFTNQ6SGNEB75WQ/
>
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/ZNIFUDLRYHU3YTYC35OLXVVHYKAPNJZI/


[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-01 Thread Sandro Bonazzola
+Gobinda Das  , +Satheesaran Sundaramoorthi
 maybe you can help here

Il giorno lun 1 mar 2021 alle ore 14:20  ha
scritto:

> Hello again,
>
> I am back with a brief description of the situation I am in, and questions
> about the recovery.
>
> oVirt environment: 4.3.5.2 Hyperconverged
> GlusterFS: Replica 2 + Arbiter 1
> GlusterFS volumes: data, engine, vmstore
>
> The current situation is the following:
>
> - The Cluster is in Global Maintenance.
>
> - The volume engine is up with comment (in the Web GUI) : Up, unsynched
> entries, needs healing.
>
> - The VM HostedEngine is paused due to a storage I/O error (Web GUI) while
> the output of virsh list --all command shows that the HostedEngine is
> running.
>
> I tried to issue the gluster heal command (gluster volume heal engine) but
> nothing changed.
>
> I have the following questions:
>
> 1. Should I restart the glusterd service? Where from? Is it enough if the
> glusterd is restarted on one host or should it be restarted on the other
> two as well?
>
> 2. Should the node that was NonResponsive and came back, be rebooted or
> not? It seems alright now and in good health.
>
> 3. Should the HostedEngine be restored with engine-backup or is it not
> necessary?
>
> 4. Could the loss of the DNS server for the oVirt hosts lead to an
> unresponsive host?
> The nsswitch file on the ovirt hosts and engine, has the DNS defined as:
> hosts:  files dns myhostname
>
> 5. How can we recover/rectify the situation above?
>
> Thanks for your help,
> Maria Souvalioti
> ___
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/GO6S6GXRJWYZN5NZ5IFTNQ6SGNEB75WQ/
>


-- 

Sandro Bonazzola

MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV

Red Hat EMEA 

sbona...@redhat.com


Red Hat respects your work life balance. Therefore there is no need to
answer this email out of your office hours.
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/F4CTPLCBZCOU5I27QV3BAUX6GTKWZ4VY/


[ovirt-users] Re: Gluster volume engine stuck in healing with 1 unsynched entry & HostedEngine paused

2021-03-01 Thread souvaliotimaria
Hello again, 

I am back with a brief description of the situation I am in, and questions 
about the recovery. 

oVirt environment: 4.3.5.2 Hyperconverged
GlusterFS: Replica 2 + Arbiter 1
GlusterFS volumes: data, engine, vmstore

The current situation is the following:

- The Cluster is in Global Maintenance.

- The volume engine is up with the comment (in the Web GUI): Up, unsynched
entries, needs healing.

- The VM HostedEngine is paused due to a storage I/O error (Web GUI) while the 
output of virsh list --all command shows that the HostedEngine is running.

I tried to issue the gluster heal command (gluster volume heal engine) but 
nothing changed.

I have the following questions:

1. Should I restart the glusterd service? Where from? Is it enough if the 
glusterd is restarted on one host or should it be restarted on the other two as 
well?

2. Should the node that was NonResponsive and came back, be rebooted or not? It 
seems alright now and in good health.

3. Should the HostedEngine be restored with engine-backup or is it not 
necessary?

4. Could the loss of the DNS server for the oVirt hosts lead to an unresponsive 
host?
The nsswitch file on the ovirt hosts and engine, has the DNS defined as:
hosts:  files dns myhostname

5. How can we recover/rectify the situation above?

Thanks for your help,
Maria Souvalioti
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/GO6S6GXRJWYZN5NZ5IFTNQ6SGNEB75WQ/