On Tue, May 03, 2022 at 07:52:16PM +0200, Alexander Bluhm wrote:
> On Mon, May 02, 2022 at 06:53:08AM +0200, Sebastien Marie wrote:
> > New diff, with new iteration on vnode_history_*() functions. I added a 
> > label in
> > the record function. I also changed when showing the stacktrace. powerpc has
> > poor backtrace support, but now it will at least print some infos even if no
> > stacktrace recorded (pid, ps_comm, label, v_usecount).
> >
> > I added a vnode_history_record() entry inside vclean(). We should have it in
> > both case:
> > - if vnode is already inactive (vp->v_usecount == 0) [trace already 
> > recorded]
> > - if vnode is still active (vp->v_usecount != 0) [trace previously not 
> > recorded]
> 
> http://bluhm.genua.de/release/results/2022-05-02T12%3A23%3A22Z/bsdcons-ot26.txt
> 
> [-- MARK -- Tue May  3 16:45:00 2022]
> uvn_io: start: 0x23b5a9c0, type VREG, use 0, write 0, hold 0, flags 
> (VBIOONFREELIST)
>       tag VT_UFS, ino 365327, on dev 0, 10 flags 0x100, effnlink 1, nlink 1
>       mode 0100660, owner 21, group 21, size 13647873
> ==> vnode_history_print 0x23b5a9c0, next=6
>  [0] c++[83314] vget, usecount 1>2
> #0  mtx_enter_try+0x5c
>  [1] c++[83314] vget, usecount 2>3
> #0  mtx_enter_try+0x5c
>  [2] c++[83314] vrele, usecount 3>2
> <empty stack trace>
>  [3] c++[83314] vput, usecount 2>1
> <empty stack trace>
>  [4] reaper[690] uvn_detach (UVM_VNODE_CANPERSIST), usecount 1>1
> #0  0xfffffffc
> #1  umidi_quirklist+0x148
> #2  uvn_detach+0x13c
> #3  uvm_unmap_detach+0x1a4
> #4  uvm_map_teardown+0x184
> #5  uvmspace_free+0x60
> #6  uvm_exit+0x30
> #7  reaper+0x138
> #8  fork_trampoline+0x14
>  [5] reaper[690] vrele, usecount 1>0
> <empty stack trace>
>  [6>] c++[73867] vget, usecount 1>2
> #0  mtx_enter_try+0x5c
>  [7] c++[73867] vrele, usecount 2>1
> <empty stack trace>
>  [8] c++[73867] vref, usecount 1>2
> <empty stack trace>
>  [9] c++[73867] vput, usecount 2>1
> <empty stack trace>

We have a bit more elements in the trace (because we don't rely anymore on 
stacktrace to show elements or not, and on powerpc stacktrace seems somehow 
creative).

we are recording inside a ring. the next entry would be 6 (next=6), so 6 is the 
older entry in the ring.

Reordered history is:
[6] c++[73867] vget, usecount 1>2
[7] c++[73867] vrele, usecount 2>1
[8] c++[73867] vref, usecount 1>2
[9] c++[73867] vput, usecount 2>1
[0] c++[83314] vget, usecount 1>2
[1] c++[83314] vget, usecount 2>3
[2] c++[83314] vrele, usecount 3>2
[3] c++[83314] vput, usecount 2>1
[4] reaper[690] uvn_detach (UVM_VNODE_CANPERSIST), usecount 1>1
[5] reaper[690] vrele, usecount 1>0

and next, we paniced in pagedaemon:

(from ps)
*93511  246215      0      0  7     0x14200                pagedaemon

neither pid 73867 and 83314 are still alive at panic time (no entries in ps).

we don't have any vclean label ("vclean (inactive)" or "vclean (active)"), so 
vclean() was not called in this timeframe.

the two last entries seems to be from uvn_detach(), the code path is:

        void
        uvn_detach(struct uvm_object *uobj)
        {
                rw_enter(uobj->vmobjlock, RW_WRITE);
                uobj->uo_refs--;

                vp->v_flag &= ~VTEXT;

                if (uvn->u_flags & UVM_VNODE_CANPERSIST) {
                        vnode_history_record("uvn_detach 
(UVM_VNODE_CANPERSIST)", vp, 0);
                                /* v_usecount=1 */

                        /* won't block */
                        uvn_flush(uobj, 0, 0, PGO_DEACTIVATE|PGO_ALLPAGES);
                        goto out;
                }

        out:
                rw_exit(uobj->vmobjlock);
                vrele(vp);
        }

        int
        vrele(struct vnode *vp)
        {
                vp->v_usecount--;
                vnode_history_record("vrele", vp, VH_DECR);
                        /* v_usecount: 1 -> 0 */

                /* vp->v_usecount == 0 */
                /* vp->v_writecount == 0 */
                
                if (vn_lock(vp, LK_EXCLUSIVE | LK_IOP)) {
                        /* not here: we don't have vprint("vrele: cannot lock") 
in console */
                }

                VOP_INACTIVE(vp, p);

                if (vp->v_usecount == 0 && !(vp->v_bioflag & VBIOONFREELIST))
                        vputonfreelist(vp);

                return (1);
        }

VOP_INACTIVE() will unlock the vnode. the vnode is VT_UFS, vop_inactive is 
ufs_inactive().

        int
        ufs_inactive(void *v)
        {
                if (ip->i_din1 == NULL || DIP(ip, mode) == 0)
                        /* branch not taken: we would ends in vclean() [via 
vrecycle() in out]
                           and it isn't the case */
                        goto out;

                if (DIP(ip, nlink) <= 0 && (vp->v_mount->mnt_flag & MNT_RDONLY) 
== 0) {
                        /* branch not taken: the branch would set
                           DIP_ASSIGN(ip, mode, 0)
                           and next we would ends in vclean() [via vrecycle() 
below]
                           and it isn't the case. */

                        /* additionnally, at panic-time, nlink == 1 */
                }

                if (ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_MODIFIED | 
IN_UPDATE)) {
                        /* unknow if the branch has been executed or not */
                        UFS_UPDATE(ip, 0);
                }

        out:
                VOP_UNLOCK(vp);

                if (ip->i_din1 == NULL || DIP(ip, mode) == 0)
                        /* branch not taken: we would ends in vclean() and it 
isn't the case */
                        vrecycle(vp, ap->a_p);

                return (0);
        }


at panic-time, the vnode has VBIOONFREELIST flag, it could have been set by 
vputonfreelist() call inside vrele().


> ddb{0}> trace
> db_enter() at db_enter+0x20
> panic(9466dc) at panic+0x158
> vn_lock(7c564249,2b7000) at vn_lock+0x1c4
> uvn_io(ae78a0,1,e7ebbc50,8650f8,e4010000) at uvn_io+0x254
> uvn_put(e5274130,e7ebbdd4,23b952a0,4f1f9f0) at uvn_put+0x64
> uvm_pager_put(0,0,e7ebbd70,267900,2000000,80000000,0) at uvm_pager_put+0x15c
> uvmpd_scan_inactive(0) at uvmpd_scan_inactive+0x224
> uvmpd_scan() at uvmpd_scan+0x15c
> uvm_pageout(19350f3) at uvm_pageout+0x398
> fork_trampoline() at fork_trampoline+0x14
> end trace frame: 0x0, count: -10
> 
> ddb{0}> ps
>    PID     TID   PPID    UID  S       FLAGS  WAIT          COMMAND
>  89786  339124  18956     21  7         0x2                c++
>  18956  132327  16031     21  3    0x10008a  sigsusp       sh
>  69984  204913  18196     21  2         0x2                c++
>  18196   68139  16031     21  3    0x10008a  sigsusp       sh
>  16031  461704  55284     21  3    0x10008a  sigsusp       make
>  55284  387860  26131     21  3    0x10008a  sigsusp       sh
>  26131  129701   8638     21  3    0x10008a  sigsusp       make
>   8638    5723  93131     21  3    0x10008a  sigsusp       sh
>  93131  150437  70656     21  3    0x10008a  sigsusp       make
>  70656  356318  77649     21  3    0x10008a  sigsusp       sh
>  77649  417086  69384     21  3    0x10008a  sigsusp       make
>  69384  106821  41029     21  3    0x10008a  sigsusp       sh
>  41029   33801  47303     21  3    0x10008a  sigsusp       make
>  47303  456448  54946      0  3    0x10008a  sigsusp       sh
>  54946   79169  65610      0  3    0x10008a  sigsusp       make
>  65610  197678   3781      0  3    0x10008a  sigsusp       make
>   3781   34391  24626      0  3    0x10008a  sigsusp       sh
>  24626   44883  83499      0  3        0x82  piperd        perl
>  83499  489078  31556      0  3    0x10008a  sigsusp       ksh
>  31556  300547  31500      0  3        0x9a  kqread        sshd
>  30342  257360      1      0  3    0x100083  ttyin         getty
>  22119   70750      1      0  3    0x100098  kqread        cron
>  18055  372758      1     99  3   0x1100090  kqread        sndiod
>  15008  447360      1    110  3    0x100090  kqread        sndiod
>  13426   88175  71372     95  3   0x1100092  kqread        smtpd
>  69885  195544  71372    103  3   0x1100092  kqread        smtpd
>  21127  121140  71372     95  3   0x1100092  kqread        smtpd
>  45089  378581  71372     95  3    0x100092  kqread        smtpd
>  25644   44788  71372     95  3   0x1100092  kqread        smtpd
>  91476  174006  71372     95  3   0x1100092  kqread        smtpd
>  71372  398158      1      0  3    0x100080  kqread        smtpd
>  42333  286401      1      0  3    0x100080  kqread        snmpd
>  96410  279792      1     91  3   0x1000092  kqread        snmpd
>  31500  218012      1      0  3        0x88  kqread        sshd
>  93289  359805      0      0  3     0x14280  nfsidl        nfsio
>  97871  419910      0      0  3     0x14280  nfsidl        nfsio
>  42572  230238      0      0  3     0x14280  nfsidl        nfsio
>  76808  507862      0      0  3     0x14280  nfsidl        nfsio
>  71633  434281      1      0  3    0x100080  kqread        ntpd
>  91354   47535  29153     83  3    0x100092  kqread        ntpd
>  29153  124045      1     83  3   0x1100092  kqread        ntpd
>  79695   44418  51744     74  3   0x1100092  bpf           pflogd
>  51744  381692      1      0  3        0x80  netio         pflogd
>  55907  151738  60426     73  3   0x1100090  kqread        syslogd
>  60426   76092      1      0  3    0x100082  netio         syslogd
>  88148  388608      1      0  3    0x100080  kqread        resolvd
>  27342  343514  35925     77  3    0x100092  kqread        dhcpleased
>  91544  385441  35925     77  3    0x100092  kqread        dhcpleased
>  35925  410197      1      0  3        0x80  kqread        dhcpleased
>  69974  505453  29200    115  3    0x100092  kqread        slaacd
>  21402  354493  29200    115  3    0x100092  kqread        slaacd
>  29200  404409      1      0  3    0x100080  kqread        slaacd
>  75650  317973      0      0  3     0x14200  bored         smr
>  13921   79526      0      0  3  0x40014200                idle1
>  89472  223474      0      0  2     0x14200                zerothread
>  29989  442095      0      0  3     0x14200  aiodoned      aiodoned
>  80478  197690      0      0  3     0x14200  syncer        update
>  93421  417382      0      0  3     0x14200  cleaner       cleaner
>    690  224430      0      0  3     0x14200  reaper        reaper
> *93511  246215      0      0  7     0x14200                pagedaemon
>   4197  277498      0      0  3     0x14200  usbtsk        usbtask
>  36475  273892      0      0  3     0x14200  usbatsk       usbatsk
>  13924   79914      0      0  3     0x14200  bored         sensors
>  36337   12438      0      0  3     0x14200  bored         softnet
>  76457  523170      0      0  3     0x14200  bored         systqmp
>   1779   97134      0      0  2     0x14200                systq
>  49928  343720      0      0  3  0x40014200  bored         softclock
>  26289  252465      0      0  3  0x40014200                idle0
>      1  472436      0      0  3        0x82  wait          init
>      0       0     -1      0  3     0x10200  scheduler     swapper
> 


-- 
Sebastien Marie

Reply via email to