Re: [Gluster-users] GeoRep Faulty after Gluster 7 to 8 upgrade - gfchangelog: wrong result

2021-03-10 Thread Matthew Benstead
Thanks Strahil,

Right - I had come across your message in early January that v8 from the
CentOS SIG was missing the SELinux rules, and had put SELinux into
permissive mode after the upgrade when I saw denied messages in the
audit logs.

[root@storage01 ~]# sestatus | egrep "^SELinux status|[mM]ode"
SELinux status:                 enabled
Current mode:                   permissive
Mode from config file:          enforcing
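
(For reference, the runtime switch and the denial check were just the
standard SELinux commands - a rough sketch:)

[root@storage01 ~]# setenforce 0                 # permissive until reboot; config stays enforcing
[root@storage01 ~]# ausearch -m avc -ts recent   # list recent AVC denials from the audit log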

Yes - I am using an unprivileged user for georep: 

[root@pcic-backup01 ~]# gluster-mountbroker status
+-------------+-------------+---------------------------+--------------+--------------------------+
|    NODE     | NODE STATUS |        MOUNT ROOT         |    GROUP     |          USERS           |
+-------------+-------------+---------------------------+--------------+--------------------------+
| 10.0.231.82 |     UP      | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup)  |
|  localhost  |     UP      | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup)  |
+-------------+-------------+---------------------------+--------------+--------------------------+

[root@pcic-backup02 ~]# gluster-mountbroker status
+-------------+-------------+---------------------------+--------------+--------------------------+
|    NODE     | NODE STATUS |        MOUNT ROOT         |    GROUP     |          USERS           |
+-------------+-------------+---------------------------+--------------+--------------------------+
| 10.0.231.81 |     UP      | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup)  |
|  localhost  |     UP      | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup)  |
+-------------+-------------+---------------------------+--------------+--------------------------+
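
(For reference, this mountbroker setup was created on the slave nodes with
roughly the following - a sketch, using the mount root, group, volume and
user names shown above:)

[root@pcic-backup01 ~]# gluster-mountbroker setup /var/mountbroker-root geogroup
[root@pcic-backup01 ~]# gluster-mountbroker add pcic-backup geoaccount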

Thanks,
 -Matthew

--
Matthew Benstead
System Administrator
Pacific Climate Impacts Consortium 
University of Victoria, UH1
PO Box 1800, STN CSC
Victoria, BC, V8W 2Y2
Phone: +1-250-721-8432
Email: matth...@uvic.ca

On 3/10/21 2:11 PM, Strahil Nikolov wrote:
>
> I have tested georep on v8.3 and it was running quite well until you
> involve SELinux.
>
> Are you using SELinux?
> Are you using an unprivileged user for the georep?
>
> Also, you can check
> https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html/administration_guide/sect-troubleshooting_geo-replication
>
> Best Regards,
> Strahil Nikolov
>
> On Thu, Mar 11, 2021 at 0:03, Matthew Benstead
>  wrote:
> Hello,
>
> I recently upgraded my Distributed-Replicate cluster from Gluster
> 7.9 to 8.3 on CentOS7 using the CentOS Storage SIG packages. I had
> geo-replication syncing properly before the upgrade, but now it is
> not working after.
>
> After I had upgraded both master and slave clusters I attempted to
> start geo-replication again, but it goes to faulty quickly. [...]

[Gluster-users] GeoRep Faulty after Gluster 7 to 8 upgrade - gfchangelog: wrong result

2021-03-10 Thread Matthew Benstead
Hello,

I recently upgraded my Distributed-Replicate cluster from Gluster 7.9 to
8.3 on CentOS7 using the CentOS Storage SIG packages. I had
geo-replication syncing properly before the upgrade, but now it is not
working after.

After I had upgraded both master and slave clusters I attempted to start
geo-replication again, but it goes to faulty quickly:

[root@storage01 ~]# gluster volume geo-replication storage
geoaccount@10.0.231.81::pcic-backup start
Starting geo-replication session between storage &
geoaccount@10.0.231.81::pcic-backup has been successful
 
[root@storage01 ~]# gluster volume geo-replication status
 
MASTER NODE    MASTER VOL    MASTER BRICK               SLAVE USER    SLAVE                                        SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
10.0.231.91    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.91    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.91    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.92    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.92    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.92    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.93    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.93    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.93    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A

[root@storage01 ~]# gluster volume geo-replication storage
geoaccount@10.0.231.81::pcic-backup stop
Stopping geo-replication session between storage &
geoaccount@10.0.231.81::pcic-backup has been successful
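
(For reference, the gsyncd worker logs live on the master nodes under the
default log directory - a sketch; the session directory name is assumed
from the volume and slave names:)

[root@storage01 ~]# tail -f /var/log/glusterfs/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.log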


I went through the gsyncd logs and see it attempts to go back through
the changelogs - which would make sense - but fails:

[2021-03-10 19:18:42.165807] I
[gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker
Status Change [{status=Initializing...}]
[2021-03-10 19:18:42.166136] I [monitor(monitor):160:monitor] Monitor:
starting gsyncd worker [{brick=/data/storage_a/storage},
{slave_node=10.0.231.81}]
[2021-03-10 19:18:42.167829] I [monitor(monitor):160:monitor] Monitor:
starting gsyncd worker [{brick=/data/storage_c/storage},
{slave_node=10.0.231.82}]
[2021-03-10 19:18:42.172343] I
[gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker
Status Change [{status=Initializing...}]
[2021-03-10 19:18:42.172580] I [monitor(monitor):160:monitor] Monitor:
starting gsyncd worker [{brick=/data/storage_b/storage},
{slave_node=10.0.231.82}]
[2021-03-10 19:18:42.235574] I [resource(worker
/data/storage_c/storage):1387:connect_remote] SSH: Initializing SSH
connection between master and slave...
[2021-03-10 19:18:42.236613] I [resource(worker
/data/storage_a/storage):1387:connect_remote] SSH: Initializing SSH
connection between master and slave...
[2021-03-10 19:18:42.238614] I [resource(worker
/data/storage_b/storage):1387:connect_remote] SSH: Initializing SSH
connection between master and slave...
[2021-03-10 19:18:44.144856] I [resource(worker
/data/storage_b/storage):1436:connect_remote] SSH: SSH connection
between master and slave established. [{duration=1.9059}]
[2021-03-10 19:18:44.145065] I [resource(worker
/data/storage_b/storage):1116:connect] GLUSTER: Mounting gluster volume
locally...
[2021-03-10 19:18:44.162873] I [resource(worker
/data/storage_a/storage):1436:connect_remote] SSH: SSH connection
between master and slave established. [{duration=1.9259}]
[2021-03-10 19:18:44.163412] I [resource(worker
/data/storage_a/storage):1116:connect] GLUSTER: Mounting gluster volume
locally...
[2021-03-10 19:18:44.167506] I [resource(worker
/data/storage_c/storage):1436:connect_remote] SSH: SSH connection
between master and slave established. [{duration=1.9316}]
[2021-03-10 19:18:44.167746] I [resource(worker
/data/storage_c/storage):1116:connect] GLUSTER: Mounting gluster volume

[Gluster-users] Transport endpoint is not connected

2021-03-10 Thread Pat Haley



Hi,

We had a hardware error in one of our switches which cut off the
communication between one of our gluster brick nodes and the client
nodes. By the time we had identified the problem and replaced the bad
part, one of our clients had started throwing "Transport endpoint is not
connected" errors. We are still getting these errors even though we have
re-established the connection. Is there a simple way to clear this error
besides rebooting the client system?
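For example, would a lazy unmount and remount of the fuse client be
enough? A sketch, with a hypothetical mount point and volume name:

umount -l /mnt/gluster                                # lazy unmount: detach now, clean up when idle
mount -t glusterfs brick-node:/volname /mnt/gluster   # hypothetical server/volume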


Thanks

--

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley  Email:  pha...@mit.edu
Center for Ocean Engineering   Phone:  (617) 253-6824
Dept. of Mechanical EngineeringFax:(617) 253-8125
MIT, Room 5-213http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301







[Gluster-users] Does not look like 8.4 is pushed to CentOS mirrors.

2021-03-10 Thread Claus Jeppesen
Hi Community,

Do we know if the GlusterFS builds will be pushed to CentOS mirrors again?
E.g. to http://mirror.centos.org/centos/7/storage/ (or the CentOS 8 repo).
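
(To check what the repo currently serves - a sketch, assuming the SIG
release package that provides the repo is installed:)

yum --showduplicates list glusterfs-server   # lists every version the enabled repos offer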

Thanx,

Claus.

-- 
*Claus Jeppesen*
Manager, Network Services
Datto, Inc.
p +45 6170 5901 | Copenhagen Office
www.datto.com






Re: [Gluster-users] Failed to populate loc for thin-arbiter

2021-03-10 Thread lejeczek




On 10/03/2021 04:10, Ravishankar N wrote:


On 09/03/21 11:43 pm, lejeczek wrote:

Hi guys,

I have a simple volume which seems to suffer from
some problems (maybe all volumes in the cluster do).


...
[2021-03-09 17:59:08.195634] E [MSGID: 114058] 
[client-handshake.c:1455:client_query_portmap_cbk] 
0-USER-HOME-ta-2: failed to get the port number for 
remote subvolume. Please run 'gluster volume status' on 
server to see if brick process is running.
[2021-03-09 17:59:18.192257] E [MSGID: 108044] 
[afr-common.c:3067:afr_ta_id_file_check] 
0-USER-HOME-replicate-0: Failed to lookup/create 
thin-arbiter id file. [Transport endpoint is not connected]


It appears as if your fuse mount is not able to connect to 
the thin arbiter node. Is the glusterfsd process running 
on the thin-arbiter node?
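
Something like the following on the thin-arbiter node should confirm it -
a sketch; the volume name is taken from your log excerpt:

gluster volume status USER-HOME   # the ta brick should show a port and "Online: Y"
pgrep -a glusterfsd               # is any brick process running at all?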


Ideally, the ID file is created by the first client (fuse 
mount) that mounts and accesses a newly created thin 
arbiter volume for the first time.


Regards,

Ravi

Yes, it seems even more puzzling to me. I can see that volumes
with 'arbiter' also misbehave, while I thought the problem was
only with 'thin-arbiter'.
I don't suppose that has anything to do with 'samba', which
runs 'shares' off glusterfs volumes?

On all the nodes involved in a volume I see in 'glustershd.log'
...

[2021-03-10 09:26:24.500574] I 
[rpc-clnt.c:1971:rpc_clnt_reconfig] 13-USER-HOME-client-1: 
changing port to 49157 (from 0)
[2021-03-10 09:26:27.506918] I 
[rpc-clnt.c:1971:rpc_clnt_reconfig] 13-USER-HOME-client-1: 
changing port to 49157 (from 0)
[2021-03-10 09:26:30.513109] I 
[rpc-clnt.c:1971:rpc_clnt_reconfig] 13-USER-HOME-client-1: 
changing port to 49157 (from 0)
[2021-03-10 09:26:33.521548] I 
[rpc-clnt.c:1971:rpc_clnt_reconfig] 13-USER-HOME-client-1: 
changing port to 49157 (from 0)
[2021-03-10 09:29:10.487079] I 
[socket.c:865:__socket_shutdown] 2-DATA-ta-2: intentional 
socket shutdown(5)
[2021-03-10 09:29:10.490647] I 
[socket.c:865:__socket_shutdown] 4-WORK-ta-2: intentional 
socket shutdown(5)

...

These are freshly re/created volumes. 'peer status' shows no
issues.
In terms of networking I also don't find any issues; the only
thing I wonder about is - must I, or rather glusterfs, have
hostnames in DNS, or is '/etc/hosts' alone perfectly fine?
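
(What I use to sanity-check name resolution on each node - a sketch; the
hostnames are hypothetical:)

getent hosts node1 node2 node3   # each name should resolve identically on every node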


Again, suggestions on how to troubleshoot this are greatly 
appreciated.

thanks, L.


[2021-03-09 17:59:29.125268] E [MSGID: 108044] 
[afr-transaction.c:1151:afr_ta_post_op_do] 
0-USER-HOME-replicate-0: Failed to populate loc for 
thin-arbiter. [Invalid argument]
[2021-03-09 17:59:29.125366] E [MSGID: 108044] 
[afr-transaction.c:763:afr_changelog_post_op_fail] 
0-USER-HOME-replicate-0: Failing CREATE for gfid 
----. Fop state is:1 
[Invalid argument]
[2021-03-09 17:59:29.235173] E [MSGID: 108006] 
[afr-common.c:5309:__afr_handle_child_down_event] 
0-USER-HOME-replicate-0: All subvolumes are down. Going 
offline until at least one of them comes back up.
[2021-03-09 18:02:39.737104] E [MSGID: 114058] 
[client-handshake.c:1455:client_query_portmap_cbk] 
0-USER-HOME-ta-2: failed to get the port number for 
remote subvolume. Please run 'gluster volume status' on 
server to see if brick process is running.

...

Any suggestions as to what can be the problem much 
appreciated.

many thanks, L.













Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users