Unfortunately, no success. I did the following:

- gluster nfs-ganesha disable
  The first time the request timed out; after a reboot of the server I
  tried the same command again and it succeeded.
- /usr/libexec/ganesha/ganesha-ha.sh --cleanup /etc/ganesha
  No output.
- gluster nfs-ganesha enable
  Timed out again, and corosync became unresponsive, using 100% CPU. I had
  to kill -9 the process.

The log shows the same messages as before (corosync in failed state).
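A side check that may be worth ruling out before digging further into corosync (a minimal sketch, not part of ganesha-ha.sh): in the quoted log below, the `pcs resource create ... ocf:heartbeat:IPaddr` lines show an empty `ip=`, and the quoted ganesha-ha.conf sets HA_CLUSTER_NODES="cobalt,iron" while the virtual IPs are named VIP_server1 and VIP_server2. The commented example in that file pairs server1.lab.redhat.com with VIP_server1_lab_redhat_com, so the sketch assumes each node needs a VIP_<node> entry with dots replaced by underscores; treating hyphens the same way is an additional assumption. The function name check_vips is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not part of ganesha-ha.sh): report cluster nodes that
# have no matching VIP_<node> entry in a ganesha-ha.conf style file.
check_vips() {
    local conf="$1" nodes node var missing=0
    # Read the uncommented HA_CLUSTER_NODES="a,b" line without sourcing the file.
    nodes=$(sed -n 's/^HA_CLUSTER_NODES="\(.*\)"$/\1/p' "$conf")
    for node in ${nodes//,/ }; do
        # Assumption: dots (as in the commented example) and hyphens in the
        # host name are replaced by underscores in the variable name.
        var="VIP_${node//[.-]/_}"
        if ! grep -q "^${var}=" "$conf"; then
            echo "missing ${var} for node ${node}"
            missing=1
        fi
    done
    return "$missing"
}
```

It would be run as `check_vips /etc/ganesha/ganesha-ha.conf`; it prints one line per node without a matching entry and returns non-zero if any are missing.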
Does the ganesha-ha.sh script handle multiple network interfaces? There are
two interfaces on both servers, and corosync/pacemaker should use only one of
them.

On 22 September 2015 at 21:44, Tiemen Ruiten <[email protected]> wrote:

> All right, thank you Soumya. I actually did do the cleanup every time
> (gluster nfs-ganesha disable), but it didn't always finish successfully.
> Sometimes it would just time out. I'll try with the second command tomorrow.
>
> Good to know that it should work with two nodes as well.
>
> On 22 September 2015 at 19:26, Soumya Koduri <[email protected]> wrote:
>
>> On 09/22/2015 05:06 PM, Tiemen Ruiten wrote:
>>
>>> That's correct and my original question was actually if a two node +
>>> arbiter setup is possible. The documentation provided by Soumya only
>>> mentions two servers in the example ganesha-ha.sh script. Perhaps that
>>> could be updated as well then, to not give the wrong impression.
>>>
>> It does work with 2-node as well. In the script, there is already a
>> check to verify if the number of servers < 3, it automatically disables
>> quorum.
>> Quorum cannot be enabled for a 2-node setup for obvious reasons. If one
>> node fails, other node just takes over the IP.
>>
>> Thanks,
>> Soumya
>>
>>> I could try to change the script to disable quorum, but wouldn't that
>>> defeat the purpose? What will happen in case one node goes down
>>> unexpectedly?
>>>
>>> On 22 September 2015 at 12:47, Kaleb Keithley <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>> IIRC, the setup is two nodes gluster+ganesha nodes plus the arbiter
>>> node for gluster quorum.
>>>
>>> Have I remembered that correctly?
>>>
>>> The Ganesha HA in 3.7 requires a minimum of three servers running
>>> ganesha and pacemaker. Two might work if you change the
>>> ganesha-ha.sh to not enable pacemaker quorum, but I haven't tried
>>> that myself.
>>> I'll try and find time in the next couple of days to
>>> update the documentation or write a blog post.
>>>
>>> ----- Original Message ----
>>> >
>>> > On 21/09/15 21:21, Tiemen Ruiten wrote:
>>> > > Whoops, replied off-list.
>>> > >
>>> > > Additionally I noticed that the generated corosync config is not
>>> > > valid, as there is no interface section:
>>> > >
>>> > > /etc/corosync/corosync.conf
>>> > >
>>> > > totem {
>>> > >     version: 2
>>> > >     secauth: off
>>> > >     cluster_name: rd-ganesha-ha
>>> > >     transport: udpu
>>> > > }
>>> > >
>>> > > nodelist {
>>> > >     node {
>>> > >         ring0_addr: cobalt
>>> > >         nodeid: 1
>>> > >     }
>>> > >     node {
>>> > >         ring0_addr: iron
>>> > >         nodeid: 2
>>> > >     }
>>> > > }
>>> > >
>>> > > quorum {
>>> > >     provider: corosync_votequorum
>>> > >     two_node: 1
>>> > > }
>>> > >
>>> > > logging {
>>> > >     to_syslog: yes
>>> > > }
>>> > >
>>> >
>>> > May be Kaleb can help you out.
>>> > >
>>> > > ---------- Forwarded message ----------
>>> > > From: *Tiemen Ruiten* <[email protected]>
>>> > > Date: 21 September 2015 at 17:16
>>> > > Subject: Re: [Gluster-users] nfs-ganesha HA with arbiter volume
>>> > > To: Jiffin Tony Thottan <[email protected]>
>>> > >
>>> > > Could you point me to the latest documentation? I've been struggling
>>> > > to find something up-to-date. I believe I have all the prerequisites:
>>> > >
>>> > > - shared storage volume exists and is mounted
>>> > > - all nodes in hosts files
>>> > > - Gluster-NFS disabled
>>> > > - corosync, pacemaker and nfs-ganesha rpm's installed
>>> > >
>>> > > Anything I missed?
>>> > > >>> > > Everything has been installed by RPM so is in the default >>> locations: >>> > > /usr/libexec/ganesha/ganesha-ha.sh >>> > > /etc/ganesha/ganesha.conf (empty) >>> > > /etc/ganesha/ganesha-ha.conf >>> > > >>> > >>> > Looks fine for me. >>> > >>> > > After I started the pcsd service manually, nfs-ganesha could be >>> > > enabled successfully, but there was no virtual IP present on the >>> > > interfaces and looking at the system log, I noticed corosync >>> failed to >>> > > start: >>> > > >>> > > - on the host where I issued the gluster nfs-ganesha enable >>> command: >>> > > >>> > > Sep 21 17:07:18 iron systemd: Starting NFS-Ganesha file >>> server... >>> > > Sep 21 17:07:19 iron systemd: Started NFS-Ganesha file server. >>> > > Sep 21 17:07:19 iron rpc.statd[2409]: Received SM_UNMON_ALL >>> request >>> > > from iron.int.rdmedia.com <http://iron.int.rdmedia.com> >>> <http://iron.int.rdmedia.com> while not >>> > > monitoring any hosts >>> > > Sep 21 17:07:20 iron systemd: Starting Corosync Cluster >>> Engine... >>> > > Sep 21 17:07:20 iron corosync[3426]: [MAIN ] Corosync Cluster >>> Engine >>> > > ('2.3.4'): started and ready to provide service. >>> > > Sep 21 17:07:20 iron corosync[3426]: [MAIN ] Corosync built-in >>> > > features: dbus systemd xmlconf snmp pie relro bindnow >>> > > Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] Initializing >>> transport >>> > > (UDP/IP Unicast). >>> > > Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] Initializing >>> > > transmit/receive security (NSS) crypto: none hash: none >>> > > Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] The network >>> interface >>> > > [10.100.30.38] is now up. 
>>> > > Sep 21 17:07:20 iron corosync[3427]: [SERV ] Service engine >>> loaded: >>> > > corosync configuration map access [0] >>> > > Sep 21 17:07:20 iron corosync[3427]: [QB ] server name: cmap >>> > > Sep 21 17:07:20 iron corosync[3427]: [SERV ] Service engine >>> loaded: >>> > > corosync configuration service [1] >>> > > Sep 21 17:07:20 iron corosync[3427]: [QB ] server name: cfg >>> > > Sep 21 17:07:20 iron corosync[3427]: [SERV ] Service engine >>> loaded: >>> > > corosync cluster closed process group service v1.01 [2] >>> > > Sep 21 17:07:20 iron corosync[3427]: [QB ] server name: cpg >>> > > Sep 21 17:07:20 iron corosync[3427]: [SERV ] Service engine >>> loaded: >>> > > corosync profile loading service [4] >>> > > Sep 21 17:07:20 iron corosync[3427]: [QUORUM] Using quorum >>> provider >>> > > corosync_votequorum >>> > > Sep 21 17:07:20 iron corosync[3427]: [VOTEQ ] Waiting for all >>> cluster >>> > > members. Current votes: 1 expected_votes: 2 >>> > > Sep 21 17:07:20 iron corosync[3427]: [SERV ] Service engine >>> loaded: >>> > > corosync vote quorum service v1.0 [5] >>> > > Sep 21 17:07:20 iron corosync[3427]: [QB ] server name: >>> votequorum >>> > > Sep 21 17:07:20 iron corosync[3427]: [SERV ] Service engine >>> loaded: >>> > > corosync cluster quorum service v0.1 [3] >>> > > Sep 21 17:07:20 iron corosync[3427]: [QB ] server name: >>> quorum >>> > > Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] adding new UDPU >>> member >>> > > {10.100.30.38} >>> > > Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] adding new UDPU >>> member >>> > > {10.100.30.37} >>> > > Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] A new membership >>> > > (10.100.30.38:104 <http://10.100.30.38:104> >>> <http://10.100.30.38:104>) was formed. Members joined: 1 >>> > > Sep 21 17:07:20 iron corosync[3427]: [VOTEQ ] Waiting for all >>> cluster >>> > > members. Current votes: 1 expected_votes: 2 >>> > > Sep 21 17:07:20 iron corosync[3427]: [VOTEQ ] Waiting for all >>> cluster >>> > > members. 
Current votes: 1 expected_votes: 2 >>> > > Sep 21 17:07:20 iron corosync[3427]: [VOTEQ ] Waiting for all >>> cluster >>> > > members. Current votes: 1 expected_votes: 2 >>> > > Sep 21 17:07:20 iron corosync[3427]: [QUORUM] Members[1]: 1 >>> > > Sep 21 17:07:20 iron corosync[3427]: [MAIN ] Completed service >>> > > synchronization, ready to provide service. >>> > > Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] A new membership >>> > > (10.100.30.37:108 <http://10.100.30.37:108> >>> <http://10.100.30.37:108>) was formed. Members joined: 1 >>> > > Sep 21 17:08:21 iron corosync: Starting Corosync Cluster Engine >>> > > (corosync): [FAILED] >>> > > Sep 21 17:08:21 iron systemd: corosync.service: control process >>> > > exited, code=exited status=1 >>> > > Sep 21 17:08:21 iron systemd: Failed to start Corosync Cluster >>> Engine. >>> > > Sep 21 17:08:21 iron systemd: Unit corosync.service entered >>> failed state. >>> > > >>> > > >>> > > - on the other host: >>> > > >>> > > Sep 21 17:07:19 cobalt systemd: Starting Preprocess NFS >>> configuration... >>> > > Sep 21 17:07:19 cobalt systemd: Starting RPC Port Mapper. >>> > > Sep 21 17:07:19 cobalt systemd: Reached target RPC Port Mapper. >>> > > Sep 21 17:07:19 cobalt systemd: Starting Host and Network Name >>> Lookups. >>> > > Sep 21 17:07:19 cobalt systemd: Reached target Host and Network >>> Name >>> > > Lookups. >>> > > Sep 21 17:07:19 cobalt systemd: Starting RPC bind service... >>> > > Sep 21 17:07:19 cobalt systemd: Started Preprocess NFS >>> configuration. >>> > > Sep 21 17:07:19 cobalt systemd: Started RPC bind service. >>> > > Sep 21 17:07:19 cobalt systemd: Starting NFS status monitor for >>> > > NFSv2/3 locking.... >>> > > Sep 21 17:07:19 cobalt rpc.statd[2662]: Version 1.3.0 starting >>> > > Sep 21 17:07:19 cobalt rpc.statd[2662]: Flags: TI-RPC >>> > > Sep 21 17:07:19 cobalt systemd: Started NFS status monitor for >>> NFSv2/3 >>> > > locking.. 
>>> > > Sep 21 17:07:19 cobalt systemd: Starting NFS-Ganesha file >>> server... >>> > > Sep 21 17:07:19 cobalt systemd: Started NFS-Ganesha file server. >>> > > Sep 21 17:07:19 cobalt kernel: warning: `ganesha.nfsd' uses >>> 32-bit >>> > > capabilities (legacy support in use) >>> > > Sep 21 17:07:19 cobalt logger: setting up rd-ganesha-ha >>> > > Sep 21 17:07:19 cobalt rpc.statd[2662]: Received SM_UNMON_ALL >>> request >>> > > from cobalt.int.rdmedia.com <http://cobalt.int.rdmedia.com> >>> <http://cobalt.int.rdmedia.com> while not >>> > > monitoring any hosts >>> > > Sep 21 17:07:19 cobalt logger: setting up cluster rd-ganesha-ha >>> with >>> > > the following cobalt iron >>> > > Sep 21 17:07:20 cobalt systemd: Stopped Pacemaker High >>> Availability >>> > > Cluster Manager. >>> > > Sep 21 17:07:20 cobalt systemd: Stopped Corosync Cluster Engine. >>> > > Sep 21 17:07:20 cobalt systemd: Reloading. >>> > > Sep 21 17:07:20 cobalt systemd: >>> > > [/usr/lib/systemd/system/dm-event.socket:10] Unknown lvalue >>> > > 'RemoveOnStop' in section 'Socket' >>> > > Sep 21 17:07:20 cobalt systemd: >>> > > [/usr/lib/systemd/system/lvm2-lvmetad.socket:9] Unknown lvalue >>> > > 'RemoveOnStop' in section 'Socket' >>> > > Sep 21 17:07:20 cobalt systemd: Reloading. >>> > > Sep 21 17:07:20 cobalt systemd: >>> > > [/usr/lib/systemd/system/dm-event.socket:10] Unknown lvalue >>> > > 'RemoveOnStop' in section 'Socket' >>> > > Sep 21 17:07:20 cobalt systemd: >>> > > [/usr/lib/systemd/system/lvm2-lvmetad.socket:9] Unknown lvalue >>> > > 'RemoveOnStop' in section 'Socket' >>> > > Sep 21 17:07:20 cobalt systemd: Starting Corosync Cluster >>> Engine... >>> > > Sep 21 17:07:20 cobalt corosync[2816]: [MAIN ] Corosync Cluster >>> > > Engine ('2.3.4'): started and ready to provide service. 
>>> > > Sep 21 17:07:20 cobalt corosync[2816]: [MAIN ] Corosync >>> built-in >>> > > features: dbus systemd xmlconf snmp pie relro bindnow >>> > > Sep 21 17:07:20 cobalt corosync[2817]: [TOTEM ] Initializing >>> transport >>> > > (UDP/IP Unicast). >>> > > Sep 21 17:07:20 cobalt corosync[2817]: [TOTEM ] Initializing >>> > > transmit/receive security (NSS) crypto: none hash: none >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] The network >>> interface >>> > > [10.100.30.37] is now up. >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [SERV ] Service engine >>> loaded: >>> > > corosync configuration map access [0] >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [QB ] server name: >>> cmap >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [SERV ] Service engine >>> loaded: >>> > > corosync configuration service [1] >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [QB ] server name: cfg >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [SERV ] Service engine >>> loaded: >>> > > corosync cluster closed process group service v1.01 [2] >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [QB ] server name: cpg >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [SERV ] Service engine >>> loaded: >>> > > corosync profile loading service [4] >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [QUORUM] Using quorum >>> provider >>> > > corosync_votequorum >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all >>> > > cluster members. 
Current votes: 1 expected_votes: 2 >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [SERV ] Service engine >>> loaded: >>> > > corosync vote quorum service v1.0 [5] >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [QB ] server name: >>> votequorum >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [SERV ] Service engine >>> loaded: >>> > > corosync cluster quorum service v0.1 [3] >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [QB ] server name: >>> quorum >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] adding new UDPU >>> member >>> > > {10.100.30.37} >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] adding new UDPU >>> member >>> > > {10.100.30.38} >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] A new membership >>> > > (10.100.30.37:100 <http://10.100.30.37:100> >>> <http://10.100.30.37:100>) was formed. Members joined: 1 >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all >>> > > cluster members. Current votes: 1 expected_votes: 2 >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all >>> > > cluster members. Current votes: 1 expected_votes: 2 >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all >>> > > cluster members. Current votes: 1 expected_votes: 2 >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [QUORUM] Members[1]: 1 >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [MAIN ] Completed service >>> > > synchronization, ready to provide service. >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] A new membership >>> > > (10.100.30.37:108 <http://10.100.30.37:108> >>> <http://10.100.30.37:108>) was formed. Members joined: 1 >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all >>> > > cluster members. Current votes: 1 expected_votes: 2 >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [QUORUM] Members[1]: 1 >>> > > Sep 21 17:07:21 cobalt corosync[2817]: [MAIN ] Completed >>> service >>> > > synchronization, ready to provide service. 
>>> > > Sep 21 17:08:50 cobalt systemd: corosync.service operation >>> timed out. >>> > > Terminating. >>> > > Sep 21 17:08:50 cobalt corosync: Starting Corosync Cluster >>> Engine >>> > > (corosync): >>> > > Sep 21 17:08:50 cobalt systemd: Failed to start Corosync >>> Cluster Engine. >>> > > Sep 21 17:08:50 cobalt systemd: Unit corosync.service entered >>> failed >>> > > state. >>> > > Sep 21 17:08:55 cobalt logger: warning: pcs property set >>> > > no-quorum-policy=ignore failed >>> > > Sep 21 17:08:55 cobalt logger: warning: pcs property set >>> > > stonith-enabled=false failed >>> > > Sep 21 17:08:55 cobalt logger: warning: pcs resource create >>> nfs_start >>> > > ganesha_nfsd ha_vol_mnt=/var/run/gluster/shared_storage --clone >>> failed >>> > > Sep 21 17:08:56 cobalt logger: warning: pcs resource delete >>> > > nfs_start-clone failed >>> > > Sep 21 17:08:56 cobalt logger: warning: pcs resource create >>> nfs-mon >>> > > ganesha_mon --clone failed >>> > > Sep 21 17:08:56 cobalt logger: warning: pcs resource create >>> nfs-grace >>> > > ganesha_grace --clone failed >>> > > Sep 21 17:08:57 cobalt logger: warning pcs resource create >>> > > cobalt-cluster_ip-1 ocf:heartbeat:IPaddr ip= cidr_netmask=32 op >>> > > monitor interval=15s failed >>> > > Sep 21 17:08:57 cobalt logger: warning: pcs resource create >>> > > cobalt-trigger_ip-1 ocf:heartbeat:Dummy failed >>> > > Sep 21 17:08:57 cobalt logger: warning: pcs constraint >>> colocation add >>> > > cobalt-cluster_ip-1 with cobalt-trigger_ip-1 failed >>> > > Sep 21 17:08:57 cobalt logger: warning: pcs constraint order >>> > > cobalt-trigger_ip-1 then nfs-grace-clone failed >>> > > Sep 21 17:08:57 cobalt logger: warning: pcs constraint order >>> > > nfs-grace-clone then cobalt-cluster_ip-1 failed >>> > > Sep 21 17:08:57 cobalt logger: warning pcs resource create >>> > > iron-cluster_ip-1 ocf:heartbeat:IPaddr ip= cidr_netmask=32 op >>> monitor >>> > > interval=15s failed >>> > > Sep 21 17:08:57 cobalt logger: 
warning: pcs resource create >>> > > iron-trigger_ip-1 ocf:heartbeat:Dummy failed >>> > > Sep 21 17:08:57 cobalt logger: warning: pcs constraint >>> colocation add >>> > > iron-cluster_ip-1 with iron-trigger_ip-1 failed >>> > > Sep 21 17:08:57 cobalt logger: warning: pcs constraint order >>> > > iron-trigger_ip-1 then nfs-grace-clone failed >>> > > Sep 21 17:08:58 cobalt logger: warning: pcs constraint order >>> > > nfs-grace-clone then iron-cluster_ip-1 failed >>> > > Sep 21 17:08:58 cobalt logger: warning: pcs constraint location >>> > > cobalt-cluster_ip-1 rule score=-INFINITY ganesha-active ne 1 >>> failed >>> > > Sep 21 17:08:58 cobalt logger: warning: pcs constraint location >>> > > cobalt-cluster_ip-1 prefers iron=1000 failed >>> > > Sep 21 17:08:58 cobalt logger: warning: pcs constraint location >>> > > cobalt-cluster_ip-1 prefers cobalt=2000 failed >>> > > Sep 21 17:08:58 cobalt logger: warning: pcs constraint location >>> > > iron-cluster_ip-1 rule score=-INFINITY ganesha-active ne 1 >>> failed >>> > > Sep 21 17:08:58 cobalt logger: warning: pcs constraint location >>> > > iron-cluster_ip-1 prefers cobalt=1000 failed >>> > > Sep 21 17:08:58 cobalt logger: warning: pcs constraint location >>> > > iron-cluster_ip-1 prefers iron=2000 failed >>> > > Sep 21 17:08:58 cobalt logger: warning pcs cluster cib-push >>> > > /tmp/tmp.nXTfyA1GMR failed >>> > > Sep 21 17:08:58 cobalt logger: warning: scp ganesha-ha.conf to >>> cobalt >>> > > failed >>> > > >>> > > BTW, I'm using CentOS 7. There are multiple network interfaces >>> on the >>> > > servers, could that be a problem? 
>>> > > >>> > > >>> > > >>> > > >>> > > On 21 September 2015 at 11:48, Jiffin Tony Thottan >>> > > <[email protected] <mailto:[email protected]> >>> <mailto:[email protected] <mailto:[email protected]>>> wrote: >>> > > >>> > > >>> > > >>> > > On 21/09/15 13:56, Tiemen Ruiten wrote: >>> > >> Hello Soumya, Kaleb, list, >>> > >> >>> > >> This Friday I created the gluster_shared_storage volume >>> manually, >>> > >> I just tried it with the command you supplied, but both >>> have the >>> > >> same result: >>> > >> >>> > >> from etc-glusterfs-glusterd.vol.log on the node where I >>> issued >>> > >> the command: >>> > >> >>> > >> [2015-09-21 07:59:47.756845] I [MSGID: 106474] >>> > >> [glusterd-ganesha.c:403:check_host_list] 0-management: >>> ganesha >>> > >> host found Hostname is cobalt >>> > >> [2015-09-21 07:59:48.071755] I [MSGID: 106474] >>> > >> [glusterd-ganesha.c:349:is_ganesha_host] 0-management: >>> ganesha >>> > >> host found Hostname is cobalt >>> > >> [2015-09-21 07:59:48.653879] E [MSGID: 106470] >>> > >> [glusterd-ganesha.c:264:glusterd_op_set_ganesha] >>> 0-management: >>> > >> Initial NFS-Ganesha set up failed >>> > > >>> > > As far as what I understand from the logs, it called >>> > > setup_cluser()[calls `ganesha-ha.sh` script ] but script >>> failed. >>> > > Can u please provide following details : >>> > > -Location of ganesha.sh file?? >>> > > -Location of ganesha-ha.conf, ganesha.conf files ? >>> > > >>> > > >>> > > And also can u cross check whether all the prerequisites >>> before HA >>> > > setup satisfied ? >>> > > >>> > > -- >>> > > With Regards, >>> > > Jiffin >>> > > >>> > > >>> > >> [2015-09-21 07:59:48.653912] E [MSGID: 106123] >>> > >> [glusterd-syncop.c:1404:gd_commit_op_phase] 0-management: >>> Commit >>> > >> of operation 'Volume (null)' failed on localhost : Failed >>> to set >>> > >> up HA config for NFS-Ganesha. 
Please check the log file >>> for details >>> > >> [2015-09-21 07:59:45.402458] I [MSGID: 106006] >>> > >> [glusterd-svc-mgmt.c:323:glusterd_svc_common_rpc_notify] >>> > >> 0-management: nfs has disconnected from glusterd. >>> > >> [2015-09-21 07:59:48.071578] I [MSGID: 106474] >>> > >> [glusterd-ganesha.c:403:check_host_list] 0-management: >>> ganesha >>> > >> host found Hostname is cobalt >>> > >> >>> > >> from etc-glusterfs-glusterd.vol.log on the other node: >>> > >> >>> > >> [2015-09-21 08:12:50.111877] E [MSGID: 106062] >>> > >> [glusterd-op-sm.c:3698:glusterd_op_ac_unlock] 0-management: >>> > >> Unable to acquire volname >>> > >> [2015-09-21 08:14:50.548087] E [MSGID: 106062] >>> > >> [glusterd-op-sm.c:3635:glusterd_op_ac_lock] 0-management: >>> Unable >>> > >> to acquire volname >>> > >> [2015-09-21 08:14:50.654746] I [MSGID: 106132] >>> > >> [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: >>> nfs >>> > >> already stopped >>> > >> [2015-09-21 08:14:50.655095] I [MSGID: 106474] >>> > >> [glusterd-ganesha.c:403:check_host_list] 0-management: >>> ganesha >>> > >> host found Hostname is cobalt >>> > >> [2015-09-21 08:14:51.287156] E [MSGID: 106062] >>> > >> [glusterd-op-sm.c:3698:glusterd_op_ac_unlock] 0-management: >>> > >> Unable to acquire volname >>> > >> >>> > >> >>> > >> from etc-glusterfs-glusterd.vol.log on the arbiter node: >>> > >> >>> > >> [2015-09-21 08:18:50.934713] E [MSGID: 101075] >>> > >> [common-utils.c:3127:gf_is_local_addr] 0-management: error >>> in >>> > >> getaddrinfo: Name or service not known >>> > >> [2015-09-21 08:18:51.504694] E [MSGID: 106062] >>> > >> [glusterd-op-sm.c:3698:glusterd_op_ac_unlock] 0-management: >>> > >> Unable to acquire volname >>> > >> >>> > >> I have put the hostnames of all servers in my /etc/hosts >>> file, >>> > >> including the arbiter node. 
>>> > >> >>> > >> >>> > >> On 18 September 2015 at 16:52, Soumya Koduri >>> <[email protected] <mailto:[email protected]> >>> > >> <mailto:[email protected] <mailto:[email protected]>>> >>> >>> wrote: >>> > >> >>> > >> Hi Tiemen, >>> > >> >>> > >> One of the pre-requisites before setting up >>> nfs-ganesha HA is >>> > >> to create and mount shared_storage volume. Use below >>> CLI for that >>> > >> >>> > >> "gluster volume set all cluster.enable-shared-storage >>> enable" >>> > >> >>> > >> It shall create the volume and mount in all the nodes >>> > >> (including the arbiter node). Note this volume shall be >>> > >> mounted on all the nodes of the gluster storage pool >>> (though >>> > >> in this case it may not be part of nfs-ganesha >>> cluster). >>> > >> >>> > >> So instead of manually creating those directory paths, >>> please >>> > >> use above CLI and try re-configuring the setup. >>> > >> >>> > >> Thanks, >>> > >> Soumya >>> > >> >>> > >> On 09/18/2015 07:29 PM, Tiemen Ruiten wrote: >>> > >> >>> > >> Hello Kaleb, >>> > >> >>> > >> I don't: >>> > >> >>> > >> # Name of the HA cluster created. >>> > >> # must be unique within the subnet >>> > >> HA_NAME="rd-ganesha-ha" >>> > >> # >>> > >> # The gluster server from which to mount the >>> shared data >>> > >> volume. >>> > >> HA_VOL_SERVER="iron" >>> > >> # >>> > >> # N.B. you may use short names or long names; you >>> may not >>> > >> use IP addrs. >>> > >> # Once you select one, stay with it as it will be >>> mildly >>> > >> unpleasant to >>> > >> # clean up if you switch later on. Ensure that all >>> names >>> > >> - short and/or >>> > >> # long - are in DNS or /etc/hosts on all machines >>> in the >>> > >> cluster. >>> > >> # >>> > >> # The subset of nodes of the Gluster Trusted Pool >>> that >>> > >> form the ganesha >>> > >> # HA cluster. Hostname is specified. 
>>> > >> HA_CLUSTER_NODES="cobalt,iron" >>> > >> #HA_CLUSTER_NODES="server1.lab.redhat.com >>> <http://server1.lab.redhat.com> >>> > >> <http://server1.lab.redhat.com> >>> > >> >>> <http://server1.lab.redhat.com>,server2.lab.redhat.com >>> <http://server2.lab.redhat.com> >>> > >> <http://server2.lab.redhat.com> >>> > >> <http://server2.lab.redhat.com>,..." >>> > >> # >>> > >> # Virtual IPs for each of the nodes specified >>> above. >>> > >> VIP_server1="10.100.30.101" >>> > >> VIP_server2="10.100.30.102" >>> > >> #VIP_server1_lab_redhat_com="10.0.2.1" >>> > >> #VIP_server2_lab_redhat_com="10.0.2.2" >>> > >> >>> > >> hosts cobalt & iron are the data nodes, the arbiter >>> > >> ip/hostname (neon) >>> > >> isn't mentioned anywhere in this config file. >>> > >> >>> > >> >>> > >> On 18 September 2015 at 15:56, Kaleb S. KEITHLEY >>> > >> <[email protected] <mailto:[email protected]> >>> <mailto:[email protected] <mailto:[email protected]>> >>> > >> <mailto:[email protected] >>> <mailto:[email protected]> >>> > >> <mailto:[email protected] >>> <mailto:[email protected]>>>> wrote: >>> > >> >>> > >> On 09/18/2015 09:46 AM, Tiemen Ruiten wrote: >>> > >> > Hello, >>> > >> > >>> > >> > I have a Gluster cluster with a single >>> replica 3, >>> > >> arbiter 1 volume (so >>> > >> > two nodes with actual data, one arbiter >>> node). I >>> > >> would like to setup >>> > >> > NFS-Ganesha HA for this volume but I'm >>> having some >>> > >> difficulties. 
>>> > >> > >>> > >> > - I needed to create a directory >>> > >> /var/run/gluster/shared_storage >>> > >> > manually on all nodes, or the command >>> 'gluster >>> > >> nfs-ganesha enable would >>> > >> > fail with the following error: >>> > >> > [2015-09-18 13:13:34.690416] E [MSGID: >>> 106032] >>> > >> > [glusterd-ganesha.c:708:pre_setup] >>> 0-THIS->name: >>> > >> mkdir() failed on path >>> > >> > /var/run/gluster/shared_storage/nfs-ganesha, >>> [No >>> > >> such file or directory] >>> > >> > >>> > >> > - Then I found out that the command connects >>> to the >>> > >> arbiter node as >>> > >> > well, but obviously I don't want to set up >>> > >> NFS-Ganesha there. Is it >>> > >> > actually possible to setup NFS-Ganesha HA >>> with an >>> > >> arbiter node? If it's >>> > >> > possible, is there any documentation on how >>> to do that? >>> > >> > >>> > >> >>> > >> Please send the /etc/ganesha/ganesha-ha.conf >>> file >>> > >> you're using. >>> > >> >>> > >> Probably you have included the arbiter in your >>> HA >>> > >> config; that would be >>> > >> a mistake. 
>>> > >> >>> > >> -- >>> > >> >>> > >> Kaleb >>> > >> >>> > >> >>> > >> >>> > >> >>> > >> -- >>> > >> Tiemen Ruiten >>> > >> Systems Engineer >>> > >> R&D Media >>> > >> >>> > >> >>> > >> _______________________________________________ >>> > >> Gluster-users mailing list >>> > >> [email protected] <mailto:[email protected]> >>> <mailto:[email protected] <mailto:[email protected] >>> >> >>> > >>http://www.gluster.org/mailman/listinfo/gluster-users >>> > >> >>> > >> >>> > >> >>> > >> >>> > >> -- >>> > >> Tiemen Ruiten >>> > >> Systems Engineer >>> > >> R&D Media >>> > >> >>> > >> >>> > >> _______________________________________________ >>> > >> Gluster-users mailing list >>> > >> [email protected] <mailto:[email protected]> >>> <mailto:[email protected] <mailto:[email protected] >>> >> >>> > >>http://www.gluster.org/mailman/listinfo/gluster-users >>> > > >>> > > >>> > > _______________________________________________ >>> > > Gluster-users mailing list >>> > > [email protected] <mailto:[email protected]> >>> <mailto:[email protected] <mailto:[email protected] >>> >> >>> >>> > >http://www.gluster.org/mailman/listinfo/gluster-users >>> > > >>> > > >>> > > >>> > > >>> > > -- >>> > > Tiemen Ruiten >>> > > Systems Engineer >>> > > R&D Media >>> > > >>> > > >>> > > >>> > > -- >>> > > Tiemen Ruiten >>> > > Systems Engineer >>> > > R&D Media >>> > > >>> > > >>> > > _______________________________________________ >>> > > Gluster-users mailing list >>> > >[email protected] <mailto:[email protected]> >>> > >http://www.gluster.org/mailman/listinfo/gluster-users >>> > >>> > >>> _______________________________________________ >>> Gluster-users mailing list >>> [email protected] <mailto:[email protected]> >>> http://www.gluster.org/mailman/listinfo/gluster-users >>> >>> >>> >>> >>> -- >>> Tiemen Ruiten >>> Systems Engineer >>> R&D Media >>> >>> >>> _______________________________________________ >>> Gluster-users mailing list >>> [email protected] >>> 
>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>
>
> --
> Tiemen Ruiten
> Systems Engineer
> R&D Media
>

--
Tiemen Ruiten
Systems Engineer
R&D Media
_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
