Public bug reported:
Environment:
Oracle Cloud Infrastructure (OCI)
2 Nodes + ISCSI block volume with OCFS2
Operating System: Ubuntu 20.04.3 LTS
Kernel: Linux 5.11.0-1027-oracle
ocfs2-tools: 1.8.6-2ubuntu1
Problem:
OCFS2 is intermittently not mountable on a second Node.
Sometimes it works for hours and many reboots.
Sometimes it hangs for hours, even after reboot.
In case of error: If OCFS2 is umounted from first node, mount on second node
works.
But sometimes unmount from first node is not possible (hangs and sometimes
whole current SSH shell hangs).
ISCSI volume is attached and visible.
O2CB runs without errors.
In older Ubuntu/OCFS2 version this does not happen, e. g.:
Operating System: Ubuntu 18.04.6 LTS
Kernel: Linux 5.4.0-1061-oracle
ocfs2-tools: 1.8.5-3ubuntu1
Some error and log messages:
Boot screen on no error:
# console (good, mounted after reboot)
[ OK ] Started Disk Manager.
[ OK ] Finished Load o2cb Modules.
Starting Mount ocfs2 Filesystems...
[ OK ] Found device BlockVolume ubuntu20-ocfs.
Mounting /ocfs2volume...
Boot screen on error:
[ OK ] Found device BlockVolume ubuntu20-ocfs.
Mounting /ocfs2volume...
[FAILED] Failed to mount /ocfs2volume.
See 'systemctl status ocfs2volume.mount' for details.
[DEPEND] Dependency failed for Remote File Systems.
...
node2 login: [ 242.574617] INFO: task mount.ocfs2:1000 blocked for more than
120 seconds.
[ 242.576764] Not tainted 5.11.0-1027-oracle #30~20.04.1-Ubuntu
[ 242.578017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
[ 363.405225] INFO: task mount.ocfs2:1000 blocked for more than 241 seconds.
[ 363.407645] Not tainted 5.11.0-1027-oracle #30~20.04.1-Ubuntu
[ 363.410559] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
[ 484.230345] INFO: task mount.ocfs2:1000 blocked for more than 362 seconds.
[ 484.232416] Not tainted 5.11.0-1027-oracle #30~20.04.1-Ubuntu
[ 484.233670] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
# systemctl status ocfs2volume.mount
● ocfs2volume.mount - /ocfs2volume
Loaded: loaded (/etc/fstab; generated)
Active: failed (Result: timeout) since Wed 2022-01-19 13:02:04 CET; 8min
ago
Where: /ocfs2volume
What: /dev/disk/by-uuid/7da60327-fd56-4abd-8f96-4c96260bb2c7
Docs: man:fstab(5)
man:systemd-fstab-generator(8)
Tasks: 1 (limit: 19138)
Memory: 756.0K
CGroup: /system.slice/ocfs2volume.mount
└─1000 /sbin/mount.ocfs2 /dev/sdb /ocfs2volume -o rw,acl,_netdev
Jan 19 13:00:34 node2 systemd[1]: Mounting /ocfs2volume...
Jan 19 13:02:04 node2 systemd[1]: ocfs2volume.mount: Mounting timed out.
Terminating.
Jan 19 13:02:04 node2 systemd[1]: ocfs2volume.mount: Mount process exited,
code=killed, status=15/TERM
Jan 19 13:02:04 node2 systemd[1]: ocfs2volume.mount: Failed with result
'timeout'.
Jan 19 13:02:04 node2 systemd[1]: Failed to mount /ocfs2volume.
# systemctl status ocfs2
● ocfs2.service - Mount ocfs2 Filesystems
Loaded: loaded (/lib/systemd/system/ocfs2.service; enabled; vendor preset:
enabled)
Active: active (exited) since Wed 2022-01-19 13:00:31 CET; 11min ago
Docs: man:ocfs2(7)
man:mount.ocfs2(8)
Process: 934 ExecStart=/usr/lib/ocfs2-tools/ocfs2 start (code=exited,
status=0/SUCCESS)
Main PID: 934 (code=exited, status=0/SUCCESS)
Jan 19 13:00:31 node2 systemd[1]: Starting Mount ocfs2 Filesystems...
Jan 19 13:00:31 node2 ocfs2[934]: Starting Oracle Cluster File System (OCFS2)
Jan 19 13:00:31 node2 ocfs2[939]: mount: /ocfs2volume: can't find
UUID=7da60327-fd56-4abd-8f96-4c96260bb2c7.
Jan 19 13:00:31 node2 systemd[1]: Finished Mount ocfs2 Filesystems.
Jan 19 13:00:31 node2 ocfs2[934]: Failed
# INFO: UUID=7da60327-fd56-4abd-8f96-4c96260bb2c7 was present
# But errors differ in differnent runs
# iscsiadm -m session -P 3
************************
Attached SCSI devices:
************************
Host Number: 3 State: running
scsi3 Channel 00 Id 0 Lun: 0
scsi3 Channel 00 Id 0 Lun: 2
Attached scsi disk sdb State: running
# dmesg
[ 1.337891] scsi 2:0:0:1: Direct-Access ORACLE BlockVolume 1.0
PQ: 0 ANSI: 6
...
[ 7.886896] scsi host3: scsi_eh_3: sleeping
[ 7.897819] scsi host3: iSCSI Initiator over TCP/IP
[ 8.198102] ocfs2: Registered cluster interface o2cb
[ 8.244150] OCFS2 User DLM kernel interface loaded
[ 8.272364] o2hb: Heartbeat mode set to local
[ 8.355866] scsi 3:0:0:0: RAID IET Controller 0001
PQ: 0 ANSI: 5
[ 8.358478] scsi 3:0:0:0: Attached scsi generic sg1 type 12
[ 8.360633] scsi 3:0:0:2: Direct-Access ORACLE BlockVolume 1.0
PQ: 0 ANSI: 6
[ 8.361620] sd 3:0:0:2: Attached scsi generic sg2 type 0
[ 8.373937] sd 3:0:0:2: [sdb] 2147483648 512-byte logical blocks: (1.10
TB/1.00 TiB)
[ 8.373941] sd 3:0:0:2: [sdb] 4096-byte physical blocks
[ 8.374114] sd 3:0:0:2: [sdb] Write Protect is off
[ 8.374117] sd 3:0:0:2: [sdb] Mode Sense: 2b 00 10 08
[ 8.374552] sd 3:0:0:2: [sdb] Write cache: disabled, read cache: enabled,
supports DPO and FUA
[ 8.374925] sd 3:0:0:2: [sdb] Optimal transfer size 1048576 bytes
[ 8.436917] sd 3:0:0:2: [sdb] Attached SCSI disk
...
[ 14.820725] o2net: Connected to node node1 (num 0) at 192.168.14.50:7777
[ 18.832591] o2dlm: Joining domain 7DA60327FD564ABD8F964C96260BB2C7
[ 18.832595] (
[ 18.832595] 0
[ 18.832596] 1
[ 18.832597] ) 2 nodes
[ 242.574617] INFO: task mount.ocfs2:1000 blocked for more than 120 seconds.
[ 242.576764] Not tainted 5.11.0-1027-oracle #30~20.04.1-Ubuntu
[ 242.578017] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
[ 242.579678] task:mount.ocfs2 state:D stack: 0 pid: 1000 ppid: 1
flags:0x00004000
[ 242.579684] Call Trace:
[ 242.579689] __schedule+0x44c/0x8a0
[ 242.579696] schedule+0x4f/0xc0
[ 242.579697] schedule_timeout+0x202/0x290
[ 242.579700] wait_for_completion+0x94/0x100
[ 242.579705] __ocfs2_cluster_lock.isra.0+0x54b/0x820 [ocfs2]
[ 242.579747] ocfs2_inode_lock_full_nested+0x116/0x470 [ocfs2]
[ 242.579788] ? ocfs2_inode_lock_full_nested+0x116/0x470 [ocfs2]
[ 242.579825] ocfs2_journal_init+0x98/0x340 [ocfs2]
[ 242.579863] ? ocfs2_get_system_file_inode+0x14e/0x310 [ocfs2]
[ 242.579906] ocfs2_check_volume+0x3b/0x4e0 [ocfs2]
[ 242.579942] ? iput+0x7a/0x200
[ 242.579947] ocfs2_mount_volume.isra.0+0x108/0x430 [ocfs2]
[ 242.579986] ocfs2_fill_super+0x961/0xda0 [ocfs2]
[ 242.580026] mount_bdev+0x18d/0x1c0
[ 242.580030] ? ocfs2_initialize_super.isra.0+0x1000/0x1000 [ocfs2]
[ 242.580070] ocfs2_mount+0x15/0x20 [ocfs2]
[ 242.580108] legacy_get_tree+0x2b/0x50
[ 242.580113] vfs_get_tree+0x2a/0xc0
[ 242.580116] ? capable+0x19/0x20
[ 242.580120] path_mount+0x460/0xa50
[ 242.580124] do_mount+0x7c/0xa0
[ 242.580126] __x64_sys_mount+0x8b/0xe0
[ 242.580129] do_syscall_64+0x38/0x90
[ 242.580133] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 242.580136] RIP: 0033:0x7f5d2b9ecdde
[ 242.580138] RSP: 002b:00007ffe1754ab38 EFLAGS: 00000246 ORIG_RAX:
00000000000000a5
[ 242.580141] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f5d2b9ecdde
[ 242.580143] RDX: 000055fb738f40ae RSI: 000055fb74188340 RDI: 000055fb7418f430
[ 242.580145] RBP: 00007ffe1754ace0 R08: 000055fb7418f410 R09: 00007ffe17548570
[ 242.580146] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe1754abd0
[ 242.580147] R13: 00007ffe1754ab50 R14: 000055fb7418f410 R15: 0000000000000000
# @shutdown:
[*** ] A stop job is running for Login to …lt iSCSI targets (2min 47s / 3min)
[ *] A stop job is running for Login to …lt iSCSI targets (5min 24s / 6min)
** Affects: ocfs2-tools (Ubuntu)
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1959581
Title:
OCFS2 intermittently not mountable on a second Node in Ubuntu 20.04.3
LTS
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ocfs2-tools/+bug/1959581/+subscriptions
--
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs