Hi,
I have the same problem as you, on Debian jessie with kernel 4.7.
It works fine with kernel 3.16.


It seems that on kernel 3.16
the sysfs threshold key was

/sys/kernel/config/cluster/ocfs2/heartbeat/dead_threshold

and now, on 4.7, it is
/sys/kernel/config/cluster/ocfs2/heartbeat/threshold

The o2cb init script still writes the value to
/sys/kernel/config/cluster/ocfs2/heartbeat/dead_threshold

(I checked the ocfs2-tools git repository; it still uses the same old sysfs key.)

Changing the script helps, but sometimes it still doesn't work. Maybe other
keys have changed too.
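
As a rough illustration of the workaround described above, here is a minimal Python sketch that writes the threshold to whichever of the two key names exists. The two names are the ones mentioned in this message; the function names are hypothetical, and the real o2cb init script is a shell script that does not work this way.

```python
import os

# The two attribute names mentioned in this thread, old name first.
CANDIDATE_KEYS = ("dead_threshold", "threshold")


def find_threshold_key(heartbeat_dir):
    """Return the path of the first candidate key that exists, or None."""
    for name in CANDIDATE_KEYS:
        path = os.path.join(heartbeat_dir, name)
        if os.path.exists(path):
            return path
    return None


def set_heartbeat_threshold(heartbeat_dir, value):
    """Write value to whichever threshold key this kernel exposes."""
    path = find_threshold_key(heartbeat_dir)
    if path is None:
        raise FileNotFoundError(
            "no heartbeat threshold key found in %s" % heartbeat_dir
        )
    with open(path, "w") as f:
        f.write("%d\n" % value)
    return path
```

On a real node this would be called with /sys/kernel/config/cluster/ocfs2/heartbeat as the directory (as root, with the cluster configured). It only covers the threshold key, so it would not help if other keys have been renamed as well.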

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to ocfs2-tools in Ubuntu.
https://bugs.launchpad.net/bugs/1614038

Title:
  o2cb configuration options ignored in 16.04

Status in ocfs2-tools package in Ubuntu:
  Triaged

Bug description:
  We've been trying to add a 16.04 node (ocfs2-tools 1.6.4-3.1) to our
  existing OCFS2 filesystem based on Ubuntu 13.04 (ocfs2-tools
  1.6.4-2ubuntu1) and Ubuntu 14.04 (ocfs2-tools 1.6.4-3ubuntu1).

   * Node1: Ubuntu 16.04, Slot 1, 10.22.44.21
   * Node2: Ubuntu 13.04, Slot 2, 10.22.44.22
   * Node3: Ubuntu 14.04, Slot 6, 10.22.44.23
   * Node4: Ubuntu 14.04, Slot 7, 10.22.44.24

  
  The existing system has an O2CB_HEARTBEAT_THRESHOLD=61 setting, but when
  adding the new system these tweaks seem to be ignored.  Here's the syslog
  section:

  
  Aug 16 15:58:02 node1 kernel: [  936.294820] (o2hb-37AAEB0304,5741,7):o2hb_check_slot:895 ERROR: Node 2 on device sdc has a dead count of 122000 ms, but our count is 62000 ms.
  Aug 16 15:58:02 node1 kernel: [  936.294820] Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
  Aug 16 15:58:02 node1 kernel: [  936.294949] (o2hb-37AAEB0304,5741,7):o2hb_check_slot:895 ERROR: Node 6 on device sdc has a dead count of 122000 ms, but our count is 62000 ms.
  Aug 16 15:58:02 node1 kernel: [  936.294949] Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
  Aug 16 15:58:02 node1 kernel: [  936.295071] (o2hb-37AAEB0304,5741,7):o2hb_check_slot:895 ERROR: Node 7 on device sdc has a dead count of 122000 ms, but our count is 62000 ms.
  Aug 16 15:58:02 node1 kernel: [  936.295071] Please double check your configuration values for 'O2CB_HEARTBEAT_THRESHOLD'
  Aug 16 15:58:03 node1 kernel: [  937.123350] o2net: node node3 (num 6) at 10.22.44.23:7777 uses a heartbeat timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
  Aug 16 15:58:03 node1 kernel: [  937.393608] o2net: node node2 (num 2) at 10.22.44.22:7777 uses a heartbeat timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
  Aug 16 15:58:04 node1 kernel: [  938.055983] o2net: node node4 (num 7) at 10.22.44.24:7777 uses a heartbeat timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
  Aug 16 15:58:29 node1 kernel: [  963.213554] o2net: node node3 (num 6) at 10.22.44.23:7777 uses a heartbeat timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
  Aug 16 15:58:30 node1 kernel: [  964.057995] o2net: node node4 (num 7) at 10.22.44.24:7777 uses a heartbeat timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
  Aug 16 15:58:32 node1 kernel: [  966.404380] o2net: No connection established with node 2 after 30.0 seconds, check network and cluster configuration.
  Aug 16 15:58:32 node1 kernel: [  966.404390] o2net: No connection established with node 6 after 30.0 seconds, check network and cluster configuration.
  Aug 16 15:58:32 node1 kernel: [  966.404393] o2net: No connection established with node 7 after 30.0 seconds, check network and cluster configuration.
  Aug 16 15:58:59 node1 kernel: [  993.296012] o2net: node node3 (num 6) at 10.22.44.23:7777 uses a heartbeat timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
  Aug 16 15:59:00 node1 kernel: [  994.060435] o2net: node node4 (num 7) at 10.22.44.24:7777 uses a heartbeat timeout of 120000 ms, but we use 60000 ms locally. Disconnecting.
  Aug 16 15:59:02 node1 kernel: [  996.486396] o2net: No connection established with node 2 after 30.0 seconds, check network and cluster configuration.
  Aug 16 15:59:02 node1 kernel: [  996.486405] o2net: No connection established with node 6 after 30.0 seconds, check network and cluster configuration.
  Aug 16 15:59:02 node1 kernel: [  996.486409] o2net: No connection established with node 7 after 30.0 seconds, check network and cluster configuration.
  Aug 16 15:59:05 node1 kernel: [  999.582560] o2cb: This node could not connect to nodes: 2 6 7.
  Aug 16 15:59:05 node1 kernel: [  999.582607] o2cb: Cluster check failed. Fix errors before retrying.
  Aug 16 15:59:05 node1 kernel: [  999.582647] (mount.ocfs2,5740,1):ocfs2_dlm_init:3025 ERROR: status = -107
  Aug 16 15:59:05 node1 kernel: [  999.582814] (mount.ocfs2,5740,1):ocfs2_mount_volume:1863 ERROR: status = -107
  Aug 16 15:59:05 node1 kernel: [  999.582895] ocfs2: Unmounting device (8,32) on (node 0)
  Aug 16 15:59:05 node1 kernel: [  999.582905] (mount.ocfs2,5740,1):ocfs2_fill_super:1219 ERROR: status = -107
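
  The millisecond values in the log above are consistent with a simple
  relation to O2CB_HEARTBEAT_THRESHOLD. The sketch below works through the
  arithmetic. The 2000 ms interval is an assumption inferred from the log
  (122000 ms / 61), and the function names are made up for illustration.

```python
# Assumed heartbeat interval, inferred from the log: 122000 ms / 61 = 2000 ms.
REGION_TIMEOUT_MS = 2000


def dead_count_ms(threshold):
    """Dead count as reported by o2hb_check_slot, assuming threshold * interval."""
    return threshold * REGION_TIMEOUT_MS


def net_heartbeat_timeout_ms(threshold):
    """Heartbeat timeout as reported by o2net, assuming (threshold - 1) * interval."""
    return (threshold - 1) * REGION_TIMEOUT_MS


def threshold_from_dead_count(ms):
    """Invert dead_count_ms to recover the threshold a node is really using."""
    return ms // REGION_TIMEOUT_MS


# Existing nodes, configured with O2CB_HEARTBEAT_THRESHOLD=61:
assert dead_count_ms(61) == 122000
assert net_heartbeat_timeout_ms(61) == 120000

# The new node reports 62000 ms and 60000 ms, which corresponds to 31:
assert threshold_from_dead_count(62000) == 31
assert net_heartbeat_timeout_ms(31) == 60000
```

  Under these assumptions the new node is running with a threshold of 31
  rather than the configured 61, which fits the report that the setting is
  being ignored.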

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ocfs2-tools/+bug/1614038/+subscriptions

_______________________________________________
Mailing list: https://launchpad.net/~ubuntu-ha
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp
