[Samba] smbd panic due to negative exclusive oplock count
Hi, I'm facing an issue with at least two Samba servers both showing important number of log entries like: smbd[27355]: PANIC: assert failed at smbd/oplock.c(149): exclusive_oplocks_open=0 Looking at the source code, this panic is triggered when the number of exclusive oplocks for a given file becomes negative. Once triggered for one smbd process, this process can generate more than a million log entries per day, causing both perf and space issues on server end, and poor perf on client end. When dying, the smbd process will log this final entry: smbd[27355]: respond_to_all_remaining_ local_messages: PANIC : we have -1 exclusive oplocks. This issue has been reproduced on both servers with versions 3.3.9 and 3.3.10 on RHEL 5.4 64bits, not tested with older ones. A level 10 log extract around a PANIC is like that: ... [2010/02/08 17:18:14, 5] smbd/uid.c:change_to_user(355) change_to_user uid=(0,6195) gid=(0,101) [2010/02/08 17:18:14, 3] smbd/reply.c:reply_close(4385) close fd=61 fnum=8800 (numopen=3) [2010/02/08 17:18:14, 6] smbd/close.c:set_close_write_time(450) close_write_time: Sun Feb 7 07:28:15 2106 [2010/02/08 17:18:14, 10] lib/dbwrap_tdb.c:db_tdb_fetch_locked(100) Locking key 0BFD3E85 [2010/02/08 17:18:14, 10] lib/dbwrap_tdb.c:db_tdb_fetch_locked(129) Allocated locked data 0x0x2b62c70b6450 [2010/02/08 17:18:14, 10] locking/locking.c:parse_share_modes(519) parse_share_modes: delete_on_close: 0, owrt: Sun Jan 10 18:14:29 2010 CET, cwrt: Thu Jan 1 01:00:00 1970 CET, tok: 0, num_share_modes: 1 [2010/02/08 17:18:14, 10] locking/locking.c:parse_share_modes(616) parse_share_modes: share_mode_entry[0]: pid = 5575, share_access = 0x3, private_options = 0x40, access_mask = 0x20089, mid = 0x0, type= 0x10, gen_id = 23508270, uid = 6195, flags = 0, file_id fd0b:5e853e [2010/02/08 17:18:14, 10] lib/dbwrap_tdb.c:db_tdb_record_destr(42) Unlocking key 0BFD3E85 [2010/02/08 17:18:14, 0] smbd/oplock.c:release_file_oplock(149) PANIC: assert failed at smbd/oplock.c(149): exclusive_oplocks_open=0 [2010/02/08 17:18:14, 10] locking/posix.c:get_windows_lock_ref_count(495) get_windows_lock_count for file XXX = 0 [2010/02/08 17:18:14, 10] locking/posix.c:delete_windows_lock_ref_count(521) delete_windows_lock_ref_count for file [2010/02/08 17:18:14, 2] smbd/close.c:close_normal_file(606) padcube closed file (numopen=2) NT_STATUS_OK [2010/02/08 17:18:14, 5] smbd/files.c:file_free(476) freed files structure 8800 (33 used) [2010/02/08 17:18:14, 5] lib/util.c:show_msg(645) ... This pattern looks the same for all PANICs, whatever the 'XXX' file the client is using. Any help would be greatly appreciated in order to better diagnose this issue and find a solution to it. Regards, Georges -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
Re: [Samba] smbd panic due to negative exclusive oplock count
On Sat, Feb 13, 2010 at 12:19:41PM +0100, Georges Brom wrote: This issue has been reproduced on both servers with versions 3.3.9 and 3.3.10 on RHEL 5.4 64bits, not tested with older ones. A level 10 log extract around a PANIC is like that: We need a *lot* more context to find this. Please open a bug at bugzilla.samba.org and upload a full debug level 10 log of smbd leading to this bug. Please note that the default max log size is far too small for this kind of debugging, please set it to 0 for this test. And also note that smbd logfiles due to their repetitive nature shrink a lot with bzip2 -9 :-) If you have privacy concerns, feel free to close this bug to be readable by the Samba-Team only, or alternatively send the logfile to me directly. If that is not enough, we need information how to reproduce the bug ourselves. Thanks, Volker signature.asc Description: Digital signature -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
[Samba] smbd panic due to negative exclusive oplock count
Hi, I'm facing an issue with at least two Samba servers in production both showing important number of log entries like: smbd[27355]: PANIC: assert failed at smbd/oplock.c(149): exclusive_oplocks_open=0 Looking at the source code, this panic is triggered when the number of exclusive oplocks for a given file becomes negative. Once triggered for one smbd process, this process can generate more than a million log entries per day, causing both perf and space issues on server end, and poor perf on client end. When dying, the smbd process will log this final entry: smbd[27355]: respond_to_all_remaining_local_messages: PANIC : we have -1 exclusive oplocks. This issue has been reproduced on both servers with versions 3.3.9 and 3.3.10 on RHEL 5.4 64bits, not tested with older ones. A level 10 log extract around a PANIC is like that: ... [2010/02/08 17:18:14, 5] smbd/uid.c:change_to_user(355) change_to_user uid=(0,6195) gid=(0,101) [2010/02/08 17:18:14, 3] smbd/reply.c:reply_close(4385) close fd=61 fnum=8800 (numopen=3) [2010/02/08 17:18:14, 6] smbd/close.c:set_close_write_time(450) close_write_time: Sun Feb 7 07:28:15 2106 [2010/02/08 17:18:14, 10] lib/dbwrap_tdb.c:db_tdb_fetch_locked(100) Locking key 0BFD3E85 [2010/02/08 17:18:14, 10] lib/dbwrap_tdb.c:db_tdb_fetch_locked(129) Allocated locked data 0x0x2b62c70b6450 [2010/02/08 17:18:14, 10] locking/locking.c:parse_share_modes(519) parse_share_modes: delete_on_close: 0, owrt: Sun Jan 10 18:14:29 2010 CET, cwrt: Thu Jan 1 01:00:00 1970 CET, tok: 0, num_share_modes: 1 [2010/02/08 17:18:14, 10] locking/locking.c:parse_share_modes(616) parse_share_modes: share_mode_entry[0]: pid = 5575, share_access = 0x3, private_options = 0x40, access_mask = 0x20089, mid = 0x0, type= 0x10, gen_id = 23508270, uid = 6195, flags = 0, file_id fd0b:5e853e [2010/02/08 17:18:14, 10] lib/dbwrap_tdb.c:db_tdb_record_destr(42) Unlocking key 0BFD3E85 [2010/02/08 17:18:14, 0] smbd/oplock.c:release_file_oplock(149) PANIC: assert failed at smbd/oplock.c(149): exclusive_oplocks_open=0 [2010/02/08 17:18:14, 10] locking/posix.c:get_windows_lock_ref_count(495) get_windows_lock_count for file XXX = 0 [2010/02/08 17:18:14, 10] locking/posix.c:delete_windows_lock_ref_count(521) delete_windows_lock_ref_count for file [2010/02/08 17:18:14, 2] smbd/close.c:close_normal_file(606) padcube closed file (numopen=2) NT_STATUS_OK [2010/02/08 17:18:14, 5] smbd/files.c:file_free(476) freed files structure 8800 (33 used) [2010/02/08 17:18:14, 5] lib/util.c:show_msg(645) ... This pattern looks the same for all PANICs, whatever the 'XXX' file the client is using. Any help would be greatly appreciated in order to better diagnose this issue and find a solution to it. Regards, Georges -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba