Re: [Gluster-users] Fwd: nfs-ganesha HA with arbiter volume

Tiemen Ruiten Wed, 23 Sep 2015 03:15:36 -0700

Unfortunately, no success. I did the following:

- gluster nfs-ganesha disable
The first time, the request timed out; after rebooting the server I tried
the same command again and it succeeded.
- /usr/libexec/ganesha/ganesha-ha.sh --cleanup /etc/ganesha
No output.
- gluster nfs-ganesha enable
Again a timeout, and corosync was unresponsive and using 100% CPU. I had to
kill -9 the process.
Same messages in the log as before (corosync in failed state).


Does the ganesha-ha.sh script handle multiple network interfaces? There are
two interfaces on both servers and corosync/pacemaker should use only one
of them.
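
One way to narrow this down is to check which address each node name resolves to, since (as far as I understand, and this is an assumption) corosync with transport udpu and a nodelist uses the address that ring0_addr resolves to. A minimal sketch; the function name resolve_nodes is made up for illustration and is not part of ganesha-ha.sh:

```shell
#!/bin/sh
# Print the address each cluster node name resolves to on this host.
# Pass the names listed in HA_CLUSTER_NODES (cobalt and iron in this thread).
# resolve_nodes is a hypothetical helper, not something shipped with gluster.
resolve_nodes() {
    for node in "$@"; do
        # getent follows nsswitch order, so /etc/hosts and DNS are both consulted
        addr=$(getent hosts "$node" 2>/dev/null | awk 'NR == 1 { print $1 }' || true)
        # Fall back to a visible marker when the name does not resolve
        printf '%s %s\n' "$node" "${addr:-UNRESOLVED}"
    done
}

# Example: resolve_nodes cobalt iron
```

If the printed addresses are not on the interface the cluster should use (the logs in this thread show 10.100.30.37 and 10.100.30.38), the /etc/hosts entries would be the first thing to look at.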

On 22 September 2015 at 21:44, Tiemen Ruiten <[email protected]> wrote:

> All right, thank you Soumya. I actually did do the cleanup every time
> (gluster nfs-ganesha disable), but it didn't always finish successfully.
> Sometimes it would just time out. I'll try with the second command tomorrow.
>
> Good to know that it should work with two nodes as well.
>
> On 22 September 2015 at 19:26, Soumya Koduri <[email protected]> wrote:
>
>>
>>
>> On 09/22/2015 05:06 PM, Tiemen Ruiten wrote:
>>
>>> That's correct and my original question was actually if a two node +
>>> arbiter setup is possible. The documentation provided by Soumya only
>>> mentions two servers in the example ganesha-ha.sh script. Perhaps that
>>> could be updated as well then, to not give the wrong impression.
>>>
>> It does work with 2 nodes as well. The script already checks the number
>> of servers: if it is < 3, it automatically disables quorum.
>> Quorum cannot be enabled for a 2-node setup for obvious reasons. If one
>> node fails, the other node just takes over the IP.
>>
>> Thanks,
>> Soumya
>>
>>> I could try to change the script to disable quorum, but wouldn't that
>>> defeat the purpose? What will happen in case one node goes down
>>> unexpectedly?
>>>
>>> On 22 September 2015 at 12:47, Kaleb Keithley <[email protected]> wrote:
>>>
>>>
>>>     Hi,
>>>
>>>     IIRC, the setup is two nodes gluster+ganesha nodes plus the arbiter
>>>     node for gluster quorum.
>>>
>>>     Have I remembered that correctly?
>>>
>>>     The Ganesha HA in 3.7 requires a minimum of three servers running
>>>     ganesha and pacemaker. Two might work if you change the
>>>     ganesha-ha.sh to not enable pacemaker quorum, but I haven't tried
>>>     that myself. I'll try and find time in the next couple of days to
>>>     update the documentation or write a blog post.
>>>
>>>
>>>
>>>     ----- Original Message ----
>>>      >
>>>      >
>>>      >
>>>      > On 21/09/15 21:21, Tiemen Ruiten wrote:
>>>      > > Whoops, replied off-list.
>>>      > >
>>>      > > Additionally I noticed that the generated corosync config is not
>>>      > > valid, as there is no interface section:
>>>      > >
>>>      > > /etc/corosync/corosync.conf
>>>      > >
>>>      > > totem {
>>>      > > version: 2
>>>      > > secauth: off
>>>      > > cluster_name: rd-ganesha-ha
>>>      > > transport: udpu
>>>      > > }
>>>      > >
>>>      > > nodelist {
>>>      > >   node {
>>>      > >         ring0_addr: cobalt
>>>      > >         nodeid: 1
>>>      > >        }
>>>      > >   node {
>>>      > >         ring0_addr: iron
>>>      > >         nodeid: 2
>>>      > >        }
>>>      > > }
>>>      > >
>>>      > > quorum {
>>>      > > provider: corosync_votequorum
>>>      > > two_node: 1
>>>      > > }
>>>      > >
>>>      > > logging {
>>>      > > to_syslog: yes
>>>      > > }
>>>      > >
>>>      > >
>>>      > >
>>>      >
>>>      > Maybe Kaleb can help you out.
>>>      > >
>>>      > > ---------- Forwarded message ----------
>>>     > > From: Tiemen Ruiten <[email protected]>
>>>     > > Date: 21 September 2015 at 17:16
>>>     > > Subject: Re: [Gluster-users] nfs-ganesha HA with arbiter volume
>>>      > > To: Jiffin Tony Thottan <[email protected]>
>>>      > >
>>>      > >
>>>      > > Could you point me to the latest documentation? I've been
>>>     struggling
>>>      > > to find something up-to-date. I believe I have all the
>>>     prerequisites:
>>>      > >
>>>      > > - shared storage volume exists and is mounted
>>>      > > - all nodes in hosts files
>>>      > > - Gluster-NFS disabled
>>>      > > - corosync, pacemaker and nfs-ganesha rpm's installed
>>>      > >
>>>      > > Anything I missed?
>>>      > >
>>>      > > Everything has been installed by RPM so is in the default
>>>     locations:
>>>      > > /usr/libexec/ganesha/ganesha-ha.sh
>>>      > > /etc/ganesha/ganesha.conf (empty)
>>>      > > /etc/ganesha/ganesha-ha.conf
>>>      > >
>>>      >
>>>      > Looks fine for me.
>>>      >
>>>      > > After I started the pcsd service manually, nfs-ganesha could be
>>>      > > enabled successfully, but there was no virtual IP present on the
>>>      > > interfaces and looking at the system log, I noticed corosync
>>>     failed to
>>>      > > start:
>>>      > >
>>>      > > - on the host where I issued the gluster nfs-ganesha enable
>>>     command:
>>>      > >
>>>      > > Sep 21 17:07:18 iron systemd: Starting NFS-Ganesha file
>>> server...
>>>      > > Sep 21 17:07:19 iron systemd: Started NFS-Ganesha file server.
>>>      > > Sep 21 17:07:19 iron rpc.statd[2409]: Received SM_UNMON_ALL request
>>>      > > from iron.int.rdmedia.com while not monitoring any hosts
>>>      > > Sep 21 17:07:20 iron systemd: Starting Corosync Cluster
>>> Engine...
>>>      > > Sep 21 17:07:20 iron corosync[3426]: [MAIN  ] Corosync Cluster
>>>     Engine
>>>      > > ('2.3.4'): started and ready to provide service.
>>>      > > Sep 21 17:07:20 iron corosync[3426]: [MAIN  ] Corosync built-in
>>>      > > features: dbus systemd xmlconf snmp pie relro bindnow
>>>      > > Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] Initializing
>>>     transport
>>>      > > (UDP/IP Unicast).
>>>      > > Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] Initializing
>>>      > > transmit/receive security (NSS) crypto: none hash: none
>>>      > > Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] The network
>>> interface
>>>      > > [10.100.30.38] is now up.
>>>      > > Sep 21 17:07:20 iron corosync[3427]: [SERV  ] Service engine
>>>     loaded:
>>>      > > corosync configuration map access [0]
>>>      > > Sep 21 17:07:20 iron corosync[3427]: [QB    ] server name: cmap
>>>      > > Sep 21 17:07:20 iron corosync[3427]: [SERV  ] Service engine
>>>     loaded:
>>>      > > corosync configuration service [1]
>>>      > > Sep 21 17:07:20 iron corosync[3427]: [QB    ] server name: cfg
>>>      > > Sep 21 17:07:20 iron corosync[3427]: [SERV  ] Service engine
>>>     loaded:
>>>      > > corosync cluster closed process group service v1.01 [2]
>>>      > > Sep 21 17:07:20 iron corosync[3427]: [QB    ] server name: cpg
>>>      > > Sep 21 17:07:20 iron corosync[3427]: [SERV  ] Service engine
>>>     loaded:
>>>      > > corosync profile loading service [4]
>>>      > > Sep 21 17:07:20 iron corosync[3427]: [QUORUM] Using quorum
>>> provider
>>>      > > corosync_votequorum
>>>      > > Sep 21 17:07:20 iron corosync[3427]: [VOTEQ ] Waiting for all
>>>     cluster
>>>      > > members. Current votes: 1 expected_votes: 2
>>>      > > Sep 21 17:07:20 iron corosync[3427]: [SERV  ] Service engine
>>>     loaded:
>>>      > > corosync vote quorum service v1.0 [5]
>>>      > > Sep 21 17:07:20 iron corosync[3427]: [QB    ] server name:
>>>     votequorum
>>>      > > Sep 21 17:07:20 iron corosync[3427]: [SERV  ] Service engine
>>>     loaded:
>>>      > > corosync cluster quorum service v0.1 [3]
>>>      > > Sep 21 17:07:20 iron corosync[3427]: [QB    ] server name:
>>> quorum
>>>      > > Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] adding new UDPU
>>>     member
>>>      > > {10.100.30.38}
>>>      > > Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] adding new UDPU
>>>     member
>>>      > > {10.100.30.37}
>>>      > > Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] A new membership
>>>      > > (10.100.30.38:104) was formed. Members joined: 1
>>>     > > Sep 21 17:07:20 iron corosync[3427]: [VOTEQ ] Waiting for all
>>> cluster
>>>     > > members. Current votes: 1 expected_votes: 2
>>>     > > Sep 21 17:07:20 iron corosync[3427]: [VOTEQ ] Waiting for all
>>> cluster
>>>     > > members. Current votes: 1 expected_votes: 2
>>>     > > Sep 21 17:07:20 iron corosync[3427]: [VOTEQ ] Waiting for all
>>> cluster
>>>     > > members. Current votes: 1 expected_votes: 2
>>>     > > Sep 21 17:07:20 iron corosync[3427]: [QUORUM] Members[1]: 1
>>>     > > Sep 21 17:07:20 iron corosync[3427]: [MAIN  ] Completed service
>>>     > > synchronization, ready to provide service.
>>>     > > Sep 21 17:07:20 iron corosync[3427]: [TOTEM ] A new membership
>>>     > > (10.100.30.37:108) was formed. Members joined: 1
>>>      > > Sep 21 17:08:21 iron corosync: Starting Corosync Cluster Engine
>>>      > > (corosync): [FAILED]
>>>      > > Sep 21 17:08:21 iron systemd: corosync.service: control process
>>>      > > exited, code=exited status=1
>>>      > > Sep 21 17:08:21 iron systemd: Failed to start Corosync Cluster
>>>     Engine.
>>>      > > Sep 21 17:08:21 iron systemd: Unit corosync.service entered
>>>     failed state.
>>>      > >
>>>      > >
>>>      > > - on the other host:
>>>      > >
>>>      > > Sep 21 17:07:19 cobalt systemd: Starting Preprocess NFS
>>>     configuration...
>>>      > > Sep 21 17:07:19 cobalt systemd: Starting RPC Port Mapper.
>>>      > > Sep 21 17:07:19 cobalt systemd: Reached target RPC Port Mapper.
>>>      > > Sep 21 17:07:19 cobalt systemd: Starting Host and Network Name
>>>     Lookups.
>>>      > > Sep 21 17:07:19 cobalt systemd: Reached target Host and Network
>>>     Name
>>>      > > Lookups.
>>>      > > Sep 21 17:07:19 cobalt systemd: Starting RPC bind service...
>>>      > > Sep 21 17:07:19 cobalt systemd: Started Preprocess NFS
>>>     configuration.
>>>      > > Sep 21 17:07:19 cobalt systemd: Started RPC bind service.
>>>      > > Sep 21 17:07:19 cobalt systemd: Starting NFS status monitor for
>>>      > > NFSv2/3 locking....
>>>      > > Sep 21 17:07:19 cobalt rpc.statd[2662]: Version 1.3.0 starting
>>>      > > Sep 21 17:07:19 cobalt rpc.statd[2662]: Flags: TI-RPC
>>>      > > Sep 21 17:07:19 cobalt systemd: Started NFS status monitor for
>>>     NFSv2/3
>>>      > > locking..
>>>      > > Sep 21 17:07:19 cobalt systemd: Starting NFS-Ganesha file
>>> server...
>>>      > > Sep 21 17:07:19 cobalt systemd: Started NFS-Ganesha file server.
>>>      > > Sep 21 17:07:19 cobalt kernel: warning: `ganesha.nfsd' uses
>>> 32-bit
>>>      > > capabilities (legacy support in use)
>>>      > > Sep 21 17:07:19 cobalt logger: setting up rd-ganesha-ha
>>>      > > Sep 21 17:07:19 cobalt rpc.statd[2662]: Received SM_UNMON_ALL request
>>>      > > from cobalt.int.rdmedia.com while not monitoring any hosts
>>>      > > Sep 21 17:07:19 cobalt logger: setting up cluster rd-ganesha-ha
>>>     with
>>>      > > the following cobalt iron
>>>      > > Sep 21 17:07:20 cobalt systemd: Stopped Pacemaker High
>>> Availability
>>>      > > Cluster Manager.
>>>      > > Sep 21 17:07:20 cobalt systemd: Stopped Corosync Cluster Engine.
>>>      > > Sep 21 17:07:20 cobalt systemd: Reloading.
>>>      > > Sep 21 17:07:20 cobalt systemd:
>>>      > > [/usr/lib/systemd/system/dm-event.socket:10] Unknown lvalue
>>>      > > 'RemoveOnStop' in section 'Socket'
>>>      > > Sep 21 17:07:20 cobalt systemd:
>>>      > > [/usr/lib/systemd/system/lvm2-lvmetad.socket:9] Unknown lvalue
>>>      > > 'RemoveOnStop' in section 'Socket'
>>>      > > Sep 21 17:07:20 cobalt systemd: Reloading.
>>>      > > Sep 21 17:07:20 cobalt systemd:
>>>      > > [/usr/lib/systemd/system/dm-event.socket:10] Unknown lvalue
>>>      > > 'RemoveOnStop' in section 'Socket'
>>>      > > Sep 21 17:07:20 cobalt systemd:
>>>      > > [/usr/lib/systemd/system/lvm2-lvmetad.socket:9] Unknown lvalue
>>>      > > 'RemoveOnStop' in section 'Socket'
>>>      > > Sep 21 17:07:20 cobalt systemd: Starting Corosync Cluster
>>> Engine...
>>>      > > Sep 21 17:07:20 cobalt corosync[2816]: [MAIN  ] Corosync Cluster
>>>      > > Engine ('2.3.4'): started and ready to provide service.
>>>      > > Sep 21 17:07:20 cobalt corosync[2816]: [MAIN  ] Corosync
>>> built-in
>>>      > > features: dbus systemd xmlconf snmp pie relro bindnow
>>>      > > Sep 21 17:07:20 cobalt corosync[2817]: [TOTEM ] Initializing
>>>     transport
>>>      > > (UDP/IP Unicast).
>>>      > > Sep 21 17:07:20 cobalt corosync[2817]: [TOTEM ] Initializing
>>>      > > transmit/receive security (NSS) crypto: none hash: none
>>>      > > Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] The network
>>>     interface
>>>      > > [10.100.30.37] is now up.
>>>      > > Sep 21 17:07:21 cobalt corosync[2817]: [SERV  ] Service engine
>>>     loaded:
>>>      > > corosync configuration map access [0]
>>>      > > Sep 21 17:07:21 cobalt corosync[2817]: [QB    ] server name:
>>> cmap
>>>      > > Sep 21 17:07:21 cobalt corosync[2817]: [SERV  ] Service engine
>>>     loaded:
>>>      > > corosync configuration service [1]
>>>      > > Sep 21 17:07:21 cobalt corosync[2817]: [QB    ] server name: cfg
>>>      > > Sep 21 17:07:21 cobalt corosync[2817]: [SERV  ] Service engine
>>>     loaded:
>>>      > > corosync cluster closed process group service v1.01 [2]
>>>      > > Sep 21 17:07:21 cobalt corosync[2817]: [QB    ] server name: cpg
>>>      > > Sep 21 17:07:21 cobalt corosync[2817]: [SERV  ] Service engine
>>>     loaded:
>>>      > > corosync profile loading service [4]
>>>      > > Sep 21 17:07:21 cobalt corosync[2817]: [QUORUM] Using quorum
>>>     provider
>>>      > > corosync_votequorum
>>>      > > Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all
>>>      > > cluster members. Current votes: 1 expected_votes: 2
>>>      > > Sep 21 17:07:21 cobalt corosync[2817]: [SERV  ] Service engine
>>>     loaded:
>>>      > > corosync vote quorum service v1.0 [5]
>>>      > > Sep 21 17:07:21 cobalt corosync[2817]: [QB    ] server name:
>>>     votequorum
>>>      > > Sep 21 17:07:21 cobalt corosync[2817]: [SERV  ] Service engine
>>>     loaded:
>>>      > > corosync cluster quorum service v0.1 [3]
>>>      > > Sep 21 17:07:21 cobalt corosync[2817]: [QB    ] server name:
>>> quorum
>>>      > > Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] adding new UDPU
>>>     member
>>>      > > {10.100.30.37}
>>>      > > Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] adding new UDPU
>>>     member
>>>      > > {10.100.30.38}
>>>      > > Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] A new membership
>>>      > > (10.100.30.37:100) was formed. Members joined: 1
>>>     > > Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all
>>>     > > cluster members. Current votes: 1 expected_votes: 2
>>>     > > Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all
>>>     > > cluster members. Current votes: 1 expected_votes: 2
>>>     > > Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all
>>>     > > cluster members. Current votes: 1 expected_votes: 2
>>>     > > Sep 21 17:07:21 cobalt corosync[2817]: [QUORUM] Members[1]: 1
>>>     > > Sep 21 17:07:21 cobalt corosync[2817]: [MAIN  ] Completed service
>>>     > > synchronization, ready to provide service.
>>>     > > Sep 21 17:07:21 cobalt corosync[2817]: [TOTEM ] A new membership
>>>     > > (10.100.30.37:108) was formed. Members joined: 1
>>>      > > Sep 21 17:07:21 cobalt corosync[2817]: [VOTEQ ] Waiting for all
>>>      > > cluster members. Current votes: 1 expected_votes: 2
>>>      > > Sep 21 17:07:21 cobalt corosync[2817]: [QUORUM] Members[1]: 1
>>>      > > Sep 21 17:07:21 cobalt corosync[2817]: [MAIN  ] Completed
>>> service
>>>      > > synchronization, ready to provide service.
>>>      > > Sep 21 17:08:50 cobalt systemd: corosync.service operation
>>>     timed out.
>>>      > > Terminating.
>>>      > > Sep 21 17:08:50 cobalt corosync: Starting Corosync Cluster
>>> Engine
>>>      > > (corosync):
>>>      > > Sep 21 17:08:50 cobalt systemd: Failed to start Corosync
>>>     Cluster Engine.
>>>      > > Sep 21 17:08:50 cobalt systemd: Unit corosync.service entered
>>>     failed
>>>      > > state.
>>>      > > Sep 21 17:08:55 cobalt logger: warning: pcs property set
>>>      > > no-quorum-policy=ignore failed
>>>      > > Sep 21 17:08:55 cobalt logger: warning: pcs property set
>>>      > > stonith-enabled=false failed
>>>      > > Sep 21 17:08:55 cobalt logger: warning: pcs resource create
>>>     nfs_start
>>>      > > ganesha_nfsd ha_vol_mnt=/var/run/gluster/shared_storage --clone
>>>     failed
>>>      > > Sep 21 17:08:56 cobalt logger: warning: pcs resource delete
>>>      > > nfs_start-clone failed
>>>      > > Sep 21 17:08:56 cobalt logger: warning: pcs resource create
>>> nfs-mon
>>>      > > ganesha_mon --clone failed
>>>      > > Sep 21 17:08:56 cobalt logger: warning: pcs resource create
>>>     nfs-grace
>>>      > > ganesha_grace --clone failed
>>>      > > Sep 21 17:08:57 cobalt logger: warning pcs resource create
>>>      > > cobalt-cluster_ip-1 ocf:heartbeat:IPaddr ip= cidr_netmask=32 op
>>>      > > monitor interval=15s failed
>>>      > > Sep 21 17:08:57 cobalt logger: warning: pcs resource create
>>>      > > cobalt-trigger_ip-1 ocf:heartbeat:Dummy failed
>>>      > > Sep 21 17:08:57 cobalt logger: warning: pcs constraint
>>>     colocation add
>>>      > > cobalt-cluster_ip-1 with cobalt-trigger_ip-1 failed
>>>      > > Sep 21 17:08:57 cobalt logger: warning: pcs constraint order
>>>      > > cobalt-trigger_ip-1 then nfs-grace-clone failed
>>>      > > Sep 21 17:08:57 cobalt logger: warning: pcs constraint order
>>>      > > nfs-grace-clone then cobalt-cluster_ip-1 failed
>>>      > > Sep 21 17:08:57 cobalt logger: warning pcs resource create
>>>      > > iron-cluster_ip-1 ocf:heartbeat:IPaddr ip= cidr_netmask=32 op
>>>     monitor
>>>      > > interval=15s failed
>>>      > > Sep 21 17:08:57 cobalt logger: warning: pcs resource create
>>>      > > iron-trigger_ip-1 ocf:heartbeat:Dummy failed
>>>      > > Sep 21 17:08:57 cobalt logger: warning: pcs constraint
>>>     colocation add
>>>      > > iron-cluster_ip-1 with iron-trigger_ip-1 failed
>>>      > > Sep 21 17:08:57 cobalt logger: warning: pcs constraint order
>>>      > > iron-trigger_ip-1 then nfs-grace-clone failed
>>>      > > Sep 21 17:08:58 cobalt logger: warning: pcs constraint order
>>>      > > nfs-grace-clone then iron-cluster_ip-1 failed
>>>      > > Sep 21 17:08:58 cobalt logger: warning: pcs constraint location
>>>      > > cobalt-cluster_ip-1 rule score=-INFINITY ganesha-active ne 1
>>> failed
>>>      > > Sep 21 17:08:58 cobalt logger: warning: pcs constraint location
>>>      > > cobalt-cluster_ip-1 prefers iron=1000 failed
>>>      > > Sep 21 17:08:58 cobalt logger: warning: pcs constraint location
>>>      > > cobalt-cluster_ip-1 prefers cobalt=2000 failed
>>>      > > Sep 21 17:08:58 cobalt logger: warning: pcs constraint location
>>>      > > iron-cluster_ip-1 rule score=-INFINITY ganesha-active ne 1
>>> failed
>>>      > > Sep 21 17:08:58 cobalt logger: warning: pcs constraint location
>>>      > > iron-cluster_ip-1 prefers cobalt=1000 failed
>>>      > > Sep 21 17:08:58 cobalt logger: warning: pcs constraint location
>>>      > > iron-cluster_ip-1 prefers iron=2000 failed
>>>      > > Sep 21 17:08:58 cobalt logger: warning pcs cluster cib-push
>>>      > > /tmp/tmp.nXTfyA1GMR failed
>>>      > > Sep 21 17:08:58 cobalt logger: warning: scp ganesha-ha.conf to
>>>     cobalt
>>>      > > failed
>>>      > >
>>>      > > BTW, I'm using CentOS 7. There are multiple network interfaces
>>>     on the
>>>      > > servers, could that be a problem?
>>>      > >
>>>      > >
>>>      > >
>>>      > >
>>>      > > On 21 September 2015 at 11:48, Jiffin Tony Thottan <[email protected]> wrote:
>>>      > >
>>>      > >
>>>      > >
>>>      > >     On 21/09/15 13:56, Tiemen Ruiten wrote:
>>>      > >>     Hello Soumya, Kaleb, list,
>>>      > >>
>>>      > >>     This Friday I created the gluster_shared_storage volume
>>>     manually,
>>>      > >>     I just tried it with the command you supplied, but both
>>>     have the
>>>      > >>     same result:
>>>      > >>
>>>      > >>     from etc-glusterfs-glusterd.vol.log on the node where I
>>> issued
>>>      > >>     the command:
>>>      > >>
>>>      > >>     [2015-09-21 07:59:47.756845] I [MSGID: 106474]
>>>      > >>     [glusterd-ganesha.c:403:check_host_list] 0-management:
>>> ganesha
>>>      > >>     host found Hostname is cobalt
>>>      > >>     [2015-09-21 07:59:48.071755] I [MSGID: 106474]
>>>      > >>     [glusterd-ganesha.c:349:is_ganesha_host] 0-management:
>>> ganesha
>>>      > >>     host found Hostname is cobalt
>>>      > >>     [2015-09-21 07:59:48.653879] E [MSGID: 106470]
>>>      > >>     [glusterd-ganesha.c:264:glusterd_op_set_ganesha]
>>> 0-management:
>>>      > >>     Initial NFS-Ganesha set up failed
>>>      > >
>>>      > >     As far as I understand from the logs, it called
>>>      > >     setup_cluster() [which calls the `ganesha-ha.sh` script], but the
>>>      > >     script failed.
>>>      > >     Can you please provide the following details:
>>>      > >     - Location of the ganesha-ha.sh file?
>>>      > >     - Location of the ganesha-ha.conf and ganesha.conf files?
>>>      > >
>>>      > >
>>>      > >     Also, can you cross-check whether all the prerequisites for
>>>      > >     the HA setup are satisfied?
>>>      > >
>>>      > >     --
>>>      > >     With Regards,
>>>      > >     Jiffin
>>>      > >
>>>      > >
>>>      > >>     [2015-09-21 07:59:48.653912] E [MSGID: 106123]
>>>      > >>     [glusterd-syncop.c:1404:gd_commit_op_phase] 0-management:
>>>     Commit
>>>      > >>     of operation 'Volume (null)' failed on localhost : Failed
>>>     to set
>>>      > >>     up HA config for NFS-Ganesha. Please check the log file
>>>     for details
>>>      > >>     [2015-09-21 07:59:45.402458] I [MSGID: 106006]
>>>      > >>     [glusterd-svc-mgmt.c:323:glusterd_svc_common_rpc_notify]
>>>      > >>     0-management: nfs has disconnected from glusterd.
>>>      > >>     [2015-09-21 07:59:48.071578] I [MSGID: 106474]
>>>      > >>     [glusterd-ganesha.c:403:check_host_list] 0-management:
>>> ganesha
>>>      > >>     host found Hostname is cobalt
>>>      > >>
>>>      > >>     from etc-glusterfs-glusterd.vol.log on the other node:
>>>      > >>
>>>      > >>     [2015-09-21 08:12:50.111877] E [MSGID: 106062]
>>>      > >>     [glusterd-op-sm.c:3698:glusterd_op_ac_unlock] 0-management:
>>>      > >>     Unable to acquire volname
>>>      > >>     [2015-09-21 08:14:50.548087] E [MSGID: 106062]
>>>      > >>     [glusterd-op-sm.c:3635:glusterd_op_ac_lock] 0-management:
>>>     Unable
>>>      > >>     to acquire volname
>>>      > >>     [2015-09-21 08:14:50.654746] I [MSGID: 106132]
>>>      > >>     [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management:
>>> nfs
>>>      > >>     already stopped
>>>      > >>     [2015-09-21 08:14:50.655095] I [MSGID: 106474]
>>>      > >>     [glusterd-ganesha.c:403:check_host_list] 0-management:
>>> ganesha
>>>      > >>     host found Hostname is cobalt
>>>      > >>     [2015-09-21 08:14:51.287156] E [MSGID: 106062]
>>>      > >>     [glusterd-op-sm.c:3698:glusterd_op_ac_unlock] 0-management:
>>>      > >>     Unable to acquire volname
>>>      > >>
>>>      > >>
>>>      > >>     from etc-glusterfs-glusterd.vol.log on the arbiter node:
>>>      > >>
>>>      > >>     [2015-09-21 08:18:50.934713] E [MSGID: 101075]
>>>      > >>     [common-utils.c:3127:gf_is_local_addr] 0-management: error
>>> in
>>>      > >>     getaddrinfo: Name or service not known
>>>      > >>     [2015-09-21 08:18:51.504694] E [MSGID: 106062]
>>>      > >>     [glusterd-op-sm.c:3698:glusterd_op_ac_unlock] 0-management:
>>>      > >>     Unable to acquire volname
>>>      > >>
>>>      > >>     I have put the hostnames of all servers in my /etc/hosts
>>> file,
>>>      > >>     including the arbiter node.
>>>      > >>
>>>      > >>
>>>      > >>     On 18 September 2015 at 16:52, Soumya Koduri <[email protected]> wrote:
>>>      > >>
>>>      > >>         Hi Tiemen,
>>>      > >>
>>>      > >>         One of the pre-requisites before setting up
>>>     nfs-ganesha HA is
>>>      > >>         to create and mount shared_storage volume. Use below
>>>     CLI for that
>>>      > >>
>>>      > >>         "gluster volume set all cluster.enable-shared-storage
>>>     enable"
>>>      > >>
>>>      > >>         It shall create the volume and mount in all the nodes
>>>      > >>         (including the arbiter node). Note this volume shall be
>>>      > >>         mounted on all the nodes of the gluster storage pool
>>>     (though
>>>      > >>         in this case it may not be part of nfs-ganesha
>>> cluster).
>>>      > >>
>>>      > >>         So instead of manually creating those directory paths,
>>>     please
>>>      > >>         use above CLI and try re-configuring the setup.
>>>      > >>
>>>      > >>         Thanks,
>>>      > >>         Soumya
>>>      > >>
>>>      > >>         On 09/18/2015 07:29 PM, Tiemen Ruiten wrote:
>>>      > >>
>>>      > >>             Hello Kaleb,
>>>      > >>
>>>      > >>             I don't:
>>>      > >>
>>>      > >>             # Name of the HA cluster created.
>>>      > >>             # must be unique within the subnet
>>>      > >>             HA_NAME="rd-ganesha-ha"
>>>      > >>             #
>>>      > >>             # The gluster server from which to mount the
>>>     shared data
>>>      > >>             volume.
>>>      > >>             HA_VOL_SERVER="iron"
>>>      > >>             #
>>>      > >>             # N.B. you may use short names or long names; you
>>>     may not
>>>      > >>             use IP addrs.
>>>      > >>             # Once you select one, stay with it as it will be
>>>     mildly
>>>      > >>             unpleasant to
>>>      > >>             # clean up if you switch later on. Ensure that all
>>>     names
>>>      > >>             - short and/or
>>>      > >>             # long - are in DNS or /etc/hosts on all machines
>>>     in the
>>>      > >>             cluster.
>>>      > >>             #
>>>      > >>             # The subset of nodes of the Gluster Trusted Pool
>>> that
>>>      > >>             form the ganesha
>>>      > >>             # HA cluster. Hostname is specified.
>>>      > >>             HA_CLUSTER_NODES="cobalt,iron"
>>>      > >>             #HA_CLUSTER_NODES="server1.lab.redhat.com,server2.lab.redhat.com,..."
>>>      > >>             #
>>>      > >>             # Virtual IPs for each of the nodes specified
>>> above.
>>>      > >>             VIP_server1="10.100.30.101"
>>>      > >>             VIP_server2="10.100.30.102"
>>>      > >>             #VIP_server1_lab_redhat_com="10.0.2.1"
>>>      > >>             #VIP_server2_lab_redhat_com="10.0.2.2"
>>>      > >>
>>>      > >>             hosts cobalt & iron are the data nodes, the arbiter
>>>      > >>             ip/hostname (neon)
>>>      > >>             isn't mentioned anywhere in this config file.
>>>      > >>
>>>      > >>
>>>      > >>             On 18 September 2015 at 15:56, Kaleb S. KEITHLEY <[email protected]> wrote:
>>>      > >>
>>>      > >>                 On 09/18/2015 09:46 AM, Tiemen Ruiten wrote:
>>>      > >>                 > Hello,
>>>      > >>                 >
>>>      > >>                 > I have a Gluster cluster with a single
>>>     replica 3,
>>>      > >>             arbiter 1 volume (so
>>>      > >>                 > two nodes with actual data, one arbiter
>>> node). I
>>>      > >>             would like to setup
>>>      > >>                 > NFS-Ganesha HA for this volume but I'm
>>>     having some
>>>      > >>             difficulties.
>>>      > >>                 >
>>>      > >>                 > - I needed to create a directory /var/run/gluster/shared_storage
>>>      > >>                 > manually on all nodes, or the command 'gluster nfs-ganesha enable'
>>>      > >>                 > would fail with the following error:
>>>      > >>                 > [2015-09-18 13:13:34.690416] E [MSGID:
>>> 106032]
>>>      > >>                 > [glusterd-ganesha.c:708:pre_setup]
>>> 0-THIS->name:
>>>      > >>             mkdir() failed on path
>>>      > >>                 > /var/run/gluster/shared_storage/nfs-ganesha,
>>> [No
>>>      > >>             such file or directory]
>>>      > >>                 >
>>>      > >>                 > - Then I found out that the command connects
>>>     to the
>>>      > >>             arbiter node as
>>>      > >>                 > well, but obviously I don't want to set up
>>>      > >>             NFS-Ganesha there. Is it
>>>      > >>                 > actually possible to setup NFS-Ganesha HA
>>>     with an
>>>      > >>             arbiter node? If it's
>>>      > >>                 > possible, is there any documentation on how
>>>     to do that?
>>>      > >>                 >
>>>      > >>
>>>      > >>                 Please send the /etc/ganesha/ganesha-ha.conf
>>> file
>>>      > >>             you're using.
>>>      > >>
>>>      > >>                 Probably you have included the arbiter in your
>>> HA
>>>      > >>             config; that would be
>>>      > >>                 a mistake.
>>>      > >>
>>>      > >>                 --
>>>      > >>
>>>      > >>                 Kaleb
>>>      > >>
>>>      > >>
>>>      > >>
>>>      > >>
>>>      > >>             --
>>>      > >>             Tiemen Ruiten
>>>      > >>             Systems Engineer
>>>      > >>             R&D Media
>>>      > >>
>>>      > >>
>>>      > >>             _______________________________________________
>>>      > >>             Gluster-users mailing list
>>>      > >> [email protected] <mailto:[email protected]>
>>>     <mailto:[email protected] <mailto:[email protected]
>>> >>
>>>     > >>http://www.gluster.org/mailman/listinfo/gluster-users
>>>     > >>
>>>     > >>
>>>     > >>
>>>     > >>
>>>     > >>     --
>>>     > >>     Tiemen Ruiten
>>>     > >>     Systems Engineer
>>>     > >>     R&D Media
>>>     > >>
>>>     > >>
>>>     > >>     _______________________________________________
>>>     > >>     Gluster-users mailing list
>>>      > >> [email protected] <mailto:[email protected]>
>>>     <mailto:[email protected] <mailto:[email protected]
>>> >>
>>>     > >>http://www.gluster.org/mailman/listinfo/gluster-users
>>>     > >
>>>     > >
>>>     > >     _______________________________________________
>>>     > >     Gluster-users mailing list
>>>      > > [email protected] <mailto:[email protected]>
>>>     <mailto:[email protected] <mailto:[email protected]
>>> >>
>>>
>>>     > >http://www.gluster.org/mailman/listinfo/gluster-users
>>>     > >
>>>     > >
>>>     > >
>>>     > >
>>>     > > --
>>>     > > Tiemen Ruiten
>>>     > > Systems Engineer
>>>     > > R&D Media
>>>     > >
>>>     > >
>>>     > >
>>>     > > --
>>>     > > Tiemen Ruiten
>>>     > > Systems Engineer
>>>     > > R&D Media
>>>     > >
>>>     > >
>>>     > > _______________________________________________
>>>     > > Gluster-users mailing list
>>>     > >[email protected] <mailto:[email protected]>
>>>     > >http://www.gluster.org/mailman/listinfo/gluster-users
>>>     >
>>>     >
>>>     _______________________________________________
>>>     Gluster-users mailing list
>>>     [email protected] <mailto:[email protected]>
>>>     http://www.gluster.org/mailman/listinfo/gluster-users
>>>
>>>
>>>
>>>
>>> --
>>> Tiemen Ruiten
>>> Systems Engineer
>>> R&D Media
>>>
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> [email protected]
>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>
>>>
>
>
> --
> Tiemen Ruiten
> Systems Engineer
> R&D Media
>



-- 
Tiemen Ruiten
Systems Engineer
R&D Media

_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
