Hello!

I have my FreeBSD-server dump nightly backups onto an entertainment device running embedded Linux.

The device has no NFS server, but it does run Samba (3.0.30). That gives access to its internal hard drive, which my server mounts as:

   //dune/hdd750_..._32 /dune smbfs rw,noauto,-N,-Ekoi8-u:utf-8

Two nightly cron jobs use dump(8), xz(1), and ccrypt(1) to dump two "important" filesystems (/var/spool/imap and /home). The imap job kicks off at 3:11am and the home job at 3:31am.
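
For reference, each job has roughly the following shape. This is only a sketch: the helper names (backup_name, run_backup), the dump flags, and the key file path are illustrative assumptions rather than a copy of the actual crontab. Only the tool chain and the output file naming come from this message.

```sh
#!/bin/sh
# Sketch of one nightly backup job. Everything marked "assumed" is
# hypothetical and not taken from the real cron job.

# Build the output file name: <host>.<label>.<level>.<weekday>.dump.xz.cpt
# $1 = host, $2 = label (e.g. imap), $3 = dump level, $4 = weekday
backup_name() {
    printf '%s.%s.%s.%s.dump.xz.cpt\n' "$1" "$2" "$3" "$4"
}

# Dump one filesystem through xz and ccrypt onto the SMB mount.
# $1 = filesystem, $2 = label, $3 = dump level
run_backup() {
    out="/dune/backups/$(backup_name "$(hostname -s)" "$2" "$3" "$(date +%A)")"
    # Assumed flags: -L snapshot of a live filesystem, -a no tape-size
    # estimate limit, -u update dumpdates, -C 16 cache size in MB,
    # -f - write the dump to standard output.
    # Assumed key file: /root/backup.key
    dump -"$3" -L -a -u -C 16 -f - "$1" \
        | xz \
        | ccrypt -e -k /root/backup.key > "$out"
}

# Example invocation (not executed here):
#   run_backup /var/spool/imap imap 1
```

With a pipeline of this shape, an unreachable /dune would make the shell fail to open the output file for the last stage, while dump itself would still start and only later hit a broken pipe.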

This normally works perfectly fine every night, except when somebody accidentally sits on the remote control of the entertainment device in the living room, or otherwise manages to turn the box off. When this happens, the first dump simply fails, as one would expect:

   cannot create /dune/backups/narawntapu.imap.1.Tuesday.dump.xz.cpt: No such file or directory
      DUMP: Date of this level 1 dump: Tue Mar 12 03:11:07 2013
      DUMP: Date of last level 0 dump: Wed Mar  6 01:31:07 2013
      DUMP: Dumping snapshot of /dev/da0a (/var/spool/imap) to standard output
      DUMP: mapping (Pass I) [regular files]
      DUMP: Cache 16 MB, blocksize = 65536
      DUMP: mapping (Pass II) [directories]
      DUMP: estimated 169895 tape blocks.
      DUMP: dumping (Pass III) [directories]
      DUMP: Broken pipe
      DUMP: The ENTIRE dump is aborted.

However, when the second job tries to do the same twenty minutes later, the machine panics. This morning I was able to get a kernel coredump:

   ...
   #6  0xffffffff80750f2f in calltrap () at
   /cache/src/sys/amd64/amd64/exception.S:228
   No locals.
   #7  0xffffffff805a46ca in turnstile_broadcast (ts=0x0, queue=0) at
   /cache/src/sys/kern/subr_turnstile.c:838
            _tid = <value optimized out>
            ts1 = <value optimized out>
            td = <value optimized out>
   #8  0xffffffff80550e52 in _mtx_unlock_sleep (m=0xfffffe0105ecd8f0,
   opts=<value optimized out>, file=<value optimized out>, line=<value
   optimized out>) at /cache/src/sys/kern/kern_mutex.c:715
            ts = (struct turnstile *) 0x0
   #9  0xffffffff8101a0cd in smb_iod_invrq (iod=<value optimized out>) at
   /cache/src/sys/modules/smbfs/../../netsmb/smb_iod.c:91
            rqp = (struct smb_rq *) 0xfffffe0105ecd800
   #10 0xffffffff8101b172 in smb_iod_addrq (rqp=0xfffffe0105ecd800) at
   /cache/src/sys/modules/smbfs/../../netsmb/smb_iod.c:418
            vcp = <value optimized out>
            iod = (struct smbiod *) 0xfffffe009483b800
            error = <value optimized out>
            __func__ = "uЪ", '\220' <repeats 12 times>
   #11 0xffffffff81017da2 in smb_rq_simple (rqp=0xfffffe0105ecd800) at
   /cache/src/sys/modules/smbfs/../../netsmb/smb_rq.c:168
            vcp = (struct smb_vc *) 0xfffffe011f957000
            error = <value optimized out>
            i = 0
   #12 0xffffffff81016202 in smb_smb_treeconnect (ssp=0xfffffe015f069200,
   scred=0xfffffe009483b868) at
   /cache/src/sys/modules/smbfs/../../netsmb/smb_smb.c:574
            vcp = (struct smb_vc *) 0xfffffe011f957000
            rq = {sr_state = 1720810032, sr_vc = 0xfffffe0002a8c490, sr_share =
   0xffffff8366917a90, sr_mid = 40352, sr_seqno = 4294967295, sr_rseqno =
   1720810112, sr_rq = {mb_top = 0xffffffff80574fea, mb_cur = 0x100000001,
   mb_mleft = 1458488464, mb_count = -512, mb_copy = 0xffffff8366917a80,
   mb_udata = 0xffffffff80755149}, sr_rqflags = 0 '\0', sr_rqflags2 = 0,
   sr_wcount = 0x0, sr_bcount = 0xffffff8366917ac0, sr_rp = {md_top =
   0xffffffff8057546d, md_cur = 0x0, md_pos = 0xfffffe0056eec490
   "\2005л\200ЪЪЪЪ"}, sr_rpgen = -1803307004, sr_rplast = -512, sr_flags =
   1458488464, sr_rpsize = -512, sr_cred = 0xfffffe009483b804, sr_timo =
   1458488464, sr_rexmit = -512, sr_sendcnt = 1720810208, sr_timesent = {tv_sec
   = 582, tv_nsec = -2196531595260}, sr_lerror = 0, sr_rqsig =
   0xffffff8366917b10
   "\200{\221f\203ЪЪЪ\206╚V\200ЪЪЪЪ\200{\221f\203ЪЪЪ\035є\001\201п\a", sr_rqtid
   = 0xffffffff805a0e97, sr_rquid = 0xffffff8366917b10, sr_errclass = 1 '\001',
   sr_serror = 0, sr_error = 0, sr_rpflags = 208 'п', sr_rpflags2 = 0, sr_rptid
   = 0, sr_rppid = 0, sr_rpuid = 0, sr_rpmid = 0, sr_slock = {lock_object =
   {lo_name = 0xffffff8366917b80
   "Ю{\221f\203ЪЪЪ\032ґ\001\201ЪЪЪЪП{\221f\203ЪЪЪ\230╦\203\224", lo_flags =
   2153163654, lo_data = 4294967295, lo_witness = 0xffffff8366917b80}, mtx_lock
   = 8592098960413}, sr_t2 = 0xffffffff8102517c, sr_link = {tqe_next =
   0x9483b820, tqe_prev = 0x0}}
            rqp = (struct smb_rq *) 0xfffffe0105ecd800
            mbp = (struct mbchain *) 0xfffffe0105ecd828
            pp = <value optimized out>
            pbuf = 0x0
            encpass = 0x0
            error = <value optimized out>
            plen = 1
            upper = 0
   #13 0xffffffff8101ad1a in smb_iod_thread (arg=<value optimized out>) at
   /cache/src/sys/modules/smbfs/../../netsmb/smb_iod.c:206
            iod = (struct smbiod *) 0xfffffe009483b800
   #14 0xffffffff805365df in fork_exit (callout=0xffffffff8101aa83
   <smb_iod_thread>, arg=0xfffffe009483b800, frame=0xffffff8366917c40) at
   /cache/src/sys/kern/kern_fork.c:992
            p = (struct proc *) 0xfffffe0181104000
            td = (struct thread *) 0xfffffe0056eec490
   #15 0xffffffff8075145e in fork_trampoline () at
   /cache/src/sys/amd64/amd64/exception.S:602

Looking at smb_iod_invrq() (smb_iod.c:91), I wonder whether an attempt is being made to invalidate or release something twice, which would cause turnstile_broadcast() to be invoked with ts being NULL the second time. That would explain why the first attempt to use the absent server errors out as normal, and only the second attempt panics.

My kernel is 9.1-PRERELEASE as of Dec 19. Any ideas? Thanks! Yours,

   -mi

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
