Using the same two-node configuration I described in an earlier post to this
forum, I'm having problems getting a gfs2 resource started on one of the nodes.
The resource in question:

 Resource: clusterfs (class=ocf provider=heartbeat type=Filesystem)
  Attributes: device=/dev/vg_cluster/ha_lv directory=/mnt/gfs2-demo fstype=gfs2 options=noatime
  Operations: start interval=0s timeout=60 (clusterfs-start-timeout-60)
              stop interval=0s timeout=60 (clusterfs-stop-timeout-60)
              monitor interval=10s on-fail=fence (clusterfs-monitor-interval-10s)

pcs status shows:

 Clone Set: dlm-clone [dlm]
     Started: [ rh7cn1.devlab.sinenomine.net rh7cn2.devlab.sinenomine.net ]
 Clone Set: clvmd-clone [clvmd]
     Started: [ rh7cn1.devlab.sinenomine.net rh7cn2.devlab.sinenomine.net ]
 Clone Set: clusterfs-clone [clusterfs]
     Started: [ rh7cn1.devlab.sinenomine.net ]
     Stopped: [ rh7cn2.devlab.sinenomine.net ]

Failed actions:
    clusterfs_start_0 on rh7cn2.devlab.sinenomine.net 'unknown error' (1): call=46, status=complete, last-rc-change='Fri Oct  3 14:41:26 2014', queued=4702ms, exec=0ms

Using pcs resource debug-start I see:

Operation start for clusterfs:0 (ocf:heartbeat:Filesystem) returned 1
 >  stderr: INFO: Running start for /dev/vg_cluster/ha_lv on /mnt/gfs2-demo
 >  stderr: mount: permission denied
 >  stderr: ERROR: Couldn't mount filesystem /dev/vg_cluster/ha_lv on /mnt/gfs2-demo

The log on the failing node (rh7cn2) shows:

Oct  3 14:57:37 rh7cn2 kernel: GFS2: fsid=rh7cluster:vol1: Trying to join cluster "lock_dlm", "rh7cluster:vol1"
Oct  3 14:57:38 rh7cn2 kernel: GFS2: fsid=rh7cluster:vol1: Joined cluster. Now mounting FS...
Oct  3 14:57:38 rh7cn2 dlm_controld[5857]: 1564 cpg_dispatch error 9

On the other node (rh7cn1):

Oct  3 15:09:47 rh7cn1 kernel: GFS2: fsid=rh7cluster:vol1.0: recover generation 14 done
Oct  3 15:09:48 rh7cn1 kernel: GFS2: fsid=rh7cluster:vol1.0: recover generation 15 done

I'm assuming I didn't define the gfs2 resource such that it could be used 
concurrently by both nodes. Here's the cib.xml definition for it:

      <clone id="clusterfs-clone">
        <primitive class="ocf" id="clusterfs" provider="heartbeat" type="Filesystem">
          <instance_attributes id="clusterfs-instance_attributes">
            <nvpair id="clusterfs-instance_attributes-device" name="device" value="/dev/vg_cluster/ha_lv"/>
            <nvpair id="clusterfs-instance_attributes-directory" name="directory" value="/mnt/gfs2-demo"/>
            <nvpair id="clusterfs-instance_attributes-fstype" name="fstype" value="gfs2"/>
            <nvpair id="clusterfs-instance_attributes-options" name="options" value="noatime"/>
          </instance_attributes>
          <operations>
            <op id="clusterfs-start-timeout-60" interval="0s" name="start" timeout="60"/>
            <op id="clusterfs-stop-timeout-60" interval="0s" name="stop" timeout="60"/>
            <op id="clusterfs-monitor-interval-10s" interval="10s" name="monitor" on-fail="fence"/>
          </operations>
        </primitive>
        <meta_attributes id="clusterfs-clone-meta">
          <nvpair id="clusterfs-interleave" name="interleave" value="true"/>
        </meta_attributes>
      </clone>
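
To double-check what the clone actually carries, the definition above can be
parsed outside the cluster. This is only a minimal sketch in Python, assuming
a trimmed copy of the fragment is pasted into a string; the helper names
clone_meta and primitive_attrs are my own and not part of pacemaker or pcs.
It lists what is defined and says nothing about which attributes are required.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the cib.xml fragment above (the <operations> block and two
# nvpairs are omitted for brevity; ids and values are unchanged).
CIB_FRAGMENT = """
<clone id="clusterfs-clone">
  <primitive class="ocf" id="clusterfs" provider="heartbeat" type="Filesystem">
    <instance_attributes id="clusterfs-instance_attributes">
      <nvpair id="clusterfs-instance_attributes-device" name="device" value="/dev/vg_cluster/ha_lv"/>
      <nvpair id="clusterfs-instance_attributes-fstype" name="fstype" value="gfs2"/>
    </instance_attributes>
  </primitive>
  <meta_attributes id="clusterfs-clone-meta">
    <nvpair id="clusterfs-interleave" name="interleave" value="true"/>
  </meta_attributes>
</clone>
"""


def clone_meta(fragment):
    """Return the clone-level meta attributes as a name -> value dict."""
    clone = ET.fromstring(fragment)
    # Match only direct children of <clone>, so that attributes set on the
    # primitive are not mixed in with the clone's own meta attributes.
    return {
        nv.get("name"): nv.get("value")
        for nv in clone.findall("./meta_attributes/nvpair")
    }


def primitive_attrs(fragment):
    """Return the primitive's instance attributes as a name -> value dict."""
    clone = ET.fromstring(fragment)
    return {
        nv.get("name"): nv.get("value")
        for nv in clone.findall("./primitive/instance_attributes/nvpair")
    }


if __name__ == "__main__":
    print("clone meta attributes:", clone_meta(CIB_FRAGMENT))
    print("primitive attributes: ", primitive_attrs(CIB_FRAGMENT))
```

For the definition above, the only clone-level meta attribute this reports is
interleave.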

-------------------------------

Unrelated (I believe) to the above, I also note the following messages in 
/var/log/messages which appear to be related to pacemaker and http (another 
resource I have defined):

Oct  3 15:05:06 rh7cn2 systemd: pacemaker.service: Got notification message from PID 6036, but reception only permitted for PID 5575

I'm running systemd-208-11.el7_0.2. A Bugzilla search turns up one matching
report, but its fix was already included in -11.

Neale
