I'm happy to file a new bug report for this, but I wanted to follow up
here first.  I've been working on a bionic VM template this week and the
issue has resurfaced.  The client (18.04) reboots daily at 3:00 a.m., and
somewhere between 30 minutes and 2 hours later the CIFS mount point stops
responding.  Meanwhile, other clients (16.04 and Windows) continue
chugging along merrily.  A reboot sometimes fixes the problem, and
sometimes the problem has fixed itself by 8 a.m. when I arrive.  Below is
some syslog debug output from after the machine finished booting.

Yesterday it cleared up on its own.  Today the server is still down 10
hours later.

I would blame Java, except that the whole mount point becomes
non-responsive when this happens, not just for that one process.

Jun  6 03:00:37 localhost systemd[1]: Reached target Multi-User System.
Jun  6 03:00:37 localhost systemd[1]: Starting Execute cloud user/final scripts...
Jun  6 03:00:37 localhost systemd[1]: Reached target Graphical Interface.
Jun  6 03:00:37 localhost systemd[1]: Starting Update UTMP about System Runlevel Changes...
Jun  6 03:00:38 localhost systemd[1]: Started Update UTMP about System Runlevel Changes.
Jun  6 03:00:38 localhost cloud-init[1531]: Cloud-init v. 18.2 running 'modules:final' at Wed, 06 Jun 2018 03:00:38 +0000. Up 23.72 seconds.
Jun  6 03:00:38 localhost cloud-init[1531]: Cloud-init v. 18.2 finished at Wed, 06 Jun 2018 03:00:38 +0000. Datasource DataSourceNoCloud [seed=/var/lib/cloud/seed/nocloud-net][dsmode=net].  Up 23.84 seconds
Jun  6 03:00:38 localhost systemd[1]: Started Execute cloud user/final scripts.
Jun  6 03:00:38 localhost systemd[1]: Reached target Cloud-init target.
Jun  6 03:00:38 localhost systemd[1]: Startup finished in 2.806s (kernel) + 21.078s (userspace) = 23.885s.
Jun  6 03:00:39 localhost kernel: [   24.927412] TCP: ens160: Driver has suspect GRO implementation, TCP performance may be compromised.
Jun  6 03:00:51 localhost systemd-timesyncd[574]: Synchronized to time server 91.189.91.157:123 (ntp.ubuntu.com).
Jun  6 03:14:28 localhost systemd[1]: Starting Message of the Day...
Jun  6 03:14:30 localhost 50-motd-news[1699]:  * Meltdown, Spectre and Ubuntu: What are the attack vectors,
Jun  6 03:14:30 localhost 50-motd-news[1699]:    how the fixes work, and everything else you need to know
Jun  6 03:14:30 localhost 50-motd-news[1699]:    - https://ubu.one/u2Know
Jun  6 03:14:30 localhost systemd[1]: Started Message of the Day.
Jun  6 03:15:48 localhost systemd[1]: Starting Cleanup of Temporary Directories...
Jun  6 03:15:48 localhost systemd[1]: Started Cleanup of Temporary Directories.
Jun  6 03:17:01 localhost CRON[1770]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Jun  6 04:00:01 localhost CRON[1878]: (root) CMD (/mnt/www/config/backup_config.sh)
Jun  6 04:17:01 localhost CRON[1916]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Jun  6 04:17:33 localhost nslcd[1438]: [b141f2] <group/member="root"> failed to bind to LDAP server ldap://dc01.example.com: Can't contact LDAP server
Jun  6 04:17:33 localhost nslcd[1438]: [b141f2] <group/member="root"> connected to LDAP server ldap://dc02.example.com
Jun  6 04:30:01 localhost CRON[1938]: (root) CMD (/usr/sbin/ntpdate time-a.nist.gov time-b.nist.gov 0.pool.ntp.org 1.pool.ntp.org)
Jun  6 04:30:01 localhost CRON[1937]: (CRON) info (No MTA installed, discarding output)
Jun  6 05:12:02 localhost kernel: [ 7906.798052] CIFS VFS: Server cifshost.example.com has not responded in 120 seconds. Reconnecting...
Jun  6 05:12:02 localhost kernel: [ 7906.800726] CIFS VFS: Free previous auth_key.response = 000000003e802799
Jun  6 05:13:10 localhost kernel: [ 7975.657719] INFO: task java:1672 blocked for more than 120 seconds.
Jun  6 05:13:10 localhost kernel: [ 7975.657741]       Not tainted 4.15.0-22-generic #24-Ubuntu
Jun  6 05:13:10 localhost kernel: [ 7975.657757] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun  6 05:13:10 localhost kernel: [ 7975.657779] java            D    0  1672      1 0x80000000
Jun  6 05:13:10 localhost kernel: [ 7975.657781] Call Trace:
Jun  6 05:13:10 localhost kernel: [ 7975.657788]  __schedule+0x297/0x8b0
Jun  6 05:13:10 localhost kernel: [ 7975.657791]  ? __wake_up+0x13/0x20
Jun  6 05:13:10 localhost kernel: [ 7975.657792]  schedule+0x2c/0x80
Jun  6 05:13:10 localhost kernel: [ 7975.657795]  io_schedule+0x16/0x40
Jun  6 05:13:10 localhost kernel: [ 7975.657798]  wait_on_page_bit_common+0xd8/0x160
Jun  6 05:13:10 localhost kernel: [ 7975.657800]  ? page_cache_tree_insert+0xe0/0xe0
Jun  6 05:13:10 localhost kernel: [ 7975.657801]  __filemap_fdatawait_range+0xfa/0x160
Jun  6 05:13:10 localhost kernel: [ 7975.657803]  filemap_write_and_wait+0x4d/0x90
Jun  6 05:13:10 localhost kernel: [ 7975.657826]  cifs_flush+0x43/0x90 [cifs]
Jun  6 05:13:10 localhost kernel: [ 7975.657830]  filp_close+0x2f/0x80
Jun  6 05:13:10 localhost kernel: [ 7975.657832]  __close_fd+0x85/0xa0
Jun  6 05:13:10 localhost kernel: [ 7975.657834]  SyS_close+0x23/0x50
Jun  6 05:13:10 localhost kernel: [ 7975.657836]  do_syscall_64+0x73/0x130
Jun  6 05:13:10 localhost kernel: [ 7975.657838]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Jun  6 05:13:10 localhost kernel: [ 7975.657839] RIP: 0033:0x7fd159dbd447
Jun  6 05:13:10 localhost kernel: [ 7975.657840] RSP: 002b:00007fd12880c440 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
Jun  6 05:13:10 localhost kernel: [ 7975.657841] RAX: ffffffffffffffda RBX: 0000000000000175 RCX: 00007fd159dbd447
Jun  6 05:13:10 localhost kernel: [ 7975.657842] RDX: 0000000000000000 RSI: 00000007c0023318 RDI: 0000000000000175
Jun  6 05:13:10 localhost kernel: [ 7975.657842] RBP: 00007fd12880c490 R08: 0000000000000000 R09: 0000000781601e70
Jun  6 05:13:10 localhost kernel: [ 7975.657843] R10: 0000000000002288 R11: 0000000000000293 R12: 00007fd158ccab40
Jun  6 05:13:10 localhost kernel: [ 7975.657844] R13: 00007fd0f000f1e8 R14: 0000000000000042 R15: 00007fd12880c4a0
Jun  6 05:14:03 localhost kernel: [ 8028.649934] CIFS VFS: Server cifshost.example.com has not responded in 120 seconds. Reconnecting...
Jun  6 05:14:03 localhost kernel: [ 8028.652603] CIFS VFS: Free previous auth_key.response = 000000003e802799
Jun  6 05:15:11 localhost kernel: [ 8096.485698] INFO: task java:1672 blocked for more than 120 seconds.
Jun  6 05:15:11 localhost kernel: [ 8096.485721]       Not tainted 4.15.0-22-generic #24-Ubuntu
Jun  6 05:15:11 localhost kernel: [ 8096.485737] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun  6 05:15:11 localhost kernel: [ 8096.485758] java            D    0  1672      1 0x80000000
Jun  6 05:15:11 localhost kernel: [ 8096.485760] Call Trace:
Jun  6 05:15:11 localhost kernel: [ 8096.485769]  __schedule+0x297/0x8b0
Jun  6 05:15:11 localhost kernel: [ 8096.485772]  ? __wake_up+0x13/0x20
Jun  6 05:15:11 localhost kernel: [ 8096.485774]  schedule+0x2c/0x80
Jun  6 05:15:11 localhost kernel: [ 8096.485776]  io_schedule+0x16/0x40
Jun  6 05:15:11 localhost kernel: [ 8096.485779]  wait_on_page_bit_common+0xd8/0x160
Jun  6 05:15:11 localhost kernel: [ 8096.485781]  ? page_cache_tree_insert+0xe0/0xe0
Jun  6 05:15:11 localhost kernel: [ 8096.485782]  __filemap_fdatawait_range+0xfa/0x160
Jun  6 05:15:11 localhost kernel: [ 8096.485785]  filemap_write_and_wait+0x4d/0x90
Jun  6 05:15:11 localhost kernel: [ 8096.485818]  cifs_flush+0x43/0x90 [cifs]
Jun  6 05:15:11 localhost kernel: [ 8096.485821]  filp_close+0x2f/0x80
Jun  6 05:15:11 localhost kernel: [ 8096.485823]  __close_fd+0x85/0xa0
Jun  6 05:15:11 localhost kernel: [ 8096.485824]  SyS_close+0x23/0x50
Jun  6 05:15:11 localhost kernel: [ 8096.485827]  do_syscall_64+0x73/0x130
Jun  6 05:15:11 localhost kernel: [ 8096.485829]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Jun  6 05:15:11 localhost kernel: [ 8096.485830] RIP: 0033:0x7fd159dbd447
Jun  6 05:15:11 localhost kernel: [ 8096.485831] RSP: 002b:00007fd12880c440 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
Jun  6 05:15:11 localhost kernel: [ 8096.485832] RAX: ffffffffffffffda RBX: 0000000000000175 RCX: 00007fd159dbd447
Jun  6 05:15:11 localhost kernel: [ 8096.485833] RDX: 0000000000000000 RSI: 00000007c0023318 RDI: 0000000000000175
Jun  6 05:15:11 localhost kernel: [ 8096.485834] RBP: 00007fd12880c490 R08: 0000000000000000 R09: 0000000781601e70
Jun  6 05:15:11 localhost kernel: [ 8096.485834] R10: 0000000000002288 R11: 0000000000000293 R12: 00007fd158ccab40
Jun  6 05:15:11 localhost kernel: [ 8096.485835] R13: 00007fd0f000f1e8 R14: 0000000000000042 R15: 00007fd12880c4a0
Jun  6 05:16:05 localhost kernel: [ 8150.501876] CIFS VFS: Server cifshost.example.com has not responded in 120 seconds. Reconnecting...
Jun  6 05:16:05 localhost kernel: [ 8150.504330] CIFS VFS: Free previous auth_key.response = 000000000551ece7
Jun  6 05:17:01 localhost CRON[2077]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
Jun  6 05:17:12 localhost kernel: [ 8217.313824] INFO: task java:1672 blocked for more than 120 seconds.
Jun  6 05:17:12 localhost kernel: [ 8217.313849]       Not tainted 4.15.0-22-generic #24-Ubuntu
Jun  6 05:17:12 localhost kernel: [ 8217.313864] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun  6 05:17:12 localhost kernel: [ 8217.313884] java            D    0  1672      1 0x80000000
Jun  6 05:17:12 localhost kernel: [ 8217.313887] Call Trace:
Jun  6 05:17:12 localhost kernel: [ 8217.313894]  __schedule+0x297/0x8b0
Jun  6 05:17:12 localhost kernel: [ 8217.313897]  ? __wake_up+0x13/0x20
Jun  6 05:17:12 localhost kernel: [ 8217.313899]  schedule+0x2c/0x80
Jun  6 05:17:12 localhost kernel: [ 8217.313901]  io_schedule+0x16/0x40
Jun  6 05:17:12 localhost kernel: [ 8217.313904]  wait_on_page_bit_common+0xd8/0x160
Jun  6 05:17:12 localhost kernel: [ 8217.313906]  ? page_cache_tree_insert+0xe0/0xe0
Jun  6 05:17:12 localhost kernel: [ 8217.313908]  __filemap_fdatawait_range+0xfa/0x160
Jun  6 05:17:12 localhost kernel: [ 8217.313910]  filemap_write_and_wait+0x4d/0x90
Jun  6 05:17:12 localhost kernel: [ 8217.313931]  cifs_flush+0x43/0x90 [cifs]
Jun  6 05:17:12 localhost kernel: [ 8217.313934]  filp_close+0x2f/0x80
Jun  6 05:17:12 localhost kernel: [ 8217.313936]  __close_fd+0x85/0xa0
Jun  6 05:17:12 localhost kernel: [ 8217.313938]  SyS_close+0x23/0x50
Jun  6 05:17:12 localhost kernel: [ 8217.313940]  do_syscall_64+0x73/0x130
Jun  6 05:17:12 localhost kernel: [ 8217.313941]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Jun  6 05:17:12 localhost kernel: [ 8217.313943] RIP: 0033:0x7fd159dbd447
Jun  6 05:17:12 localhost kernel: [ 8217.313944] RSP: 002b:00007fd12880c440 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
Jun  6 05:17:12 localhost kernel: [ 8217.313945] RAX: ffffffffffffffda RBX: 0000000000000175 RCX: 00007fd159dbd447
Jun  6 05:17:12 localhost kernel: [ 8217.313946] RDX: 0000000000000000 RSI: 00000007c0023318 RDI: 0000000000000175
Jun  6 05:17:12 localhost kernel: [ 8217.313946] RBP: 00007fd12880c490 R08: 0000000000000000 R09: 0000000781601e70
Jun  6 05:17:12 localhost kernel: [ 8217.313947] R10: 0000000000002288 R11: 0000000000000293 R12: 00007fd158ccab40
Jun  6 05:17:12 localhost kernel: [ 8217.313948] R13: 00007fd0f000f1e8 R14: 0000000000000042 R15: 00007fd12880c4a0
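
For anyone comparing their own logs against this one, here is a minimal
sketch of how the CIFS timeouts and hung-task reports could be pulled out
of a syslog excerpt, keyed by the kernel uptime stamp in square brackets.
The function name summarize and the regular expression are illustrative
assumptions, not part of any existing tool, and the sample lines are
copied (truncated) from the output above.

```python
import re

# Matches syslog kernel lines of the form seen above, e.g.
#   Jun  6 05:12:02 localhost kernel: [ 7906.798052] CIFS VFS: Server ...
# (hypothetical pattern, written only for this log layout)
KERNEL_RE = re.compile(
    r"^\w{3}\s+\d+ \d\d:\d\d:\d\d \S+ kernel: "
    r"\[\s*(?P<uptime>\d+\.\d+)\] (?P<msg>.*)$"
)


def summarize(lines):
    """Return (uptime_seconds, kind) for each CIFS timeout or hung-task line."""
    events = []
    for line in lines:
        m = KERNEL_RE.match(line)
        if not m:
            continue  # not a kernel line (systemd, CRON, wrapped text, ...)
        msg = m.group("msg")
        if msg.startswith("CIFS VFS: Server"):
            kind = "cifs_timeout"
        elif msg.startswith("INFO: task"):
            kind = "hung_task"
        else:
            continue  # other kernel messages are ignored
        events.append((float(m.group("uptime")), kind))
    return events


sample = [
    "Jun  6 03:00:39 localhost kernel: [   24.927412] TCP: ens160: Driver has",
    "Jun  6 05:12:02 localhost kernel: [ 7906.798052] CIFS VFS: Server",
    "Jun  6 05:13:10 localhost kernel: [ 7975.657719] INFO: task java:1672 blocked",
]
events = summarize(sample)
first_uptime, first_kind = events[0]
print(f"first event: {first_kind} at {first_uptime / 60:.1f} minutes of uptime")
```

Because only the start of each message is checked, the sketch should work
on both hard-wrapped and unwrapped log lines.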

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1729337

Title:
  CIFS errors on 4.4.0-98, but not on 4.4.0-97 with same config
