On 2017-01-13 at 10:52 Barret Rhoden <[email protected]> wrote:
> Given all this, here was the bug that caused this.
for those curious about the debugging steps (including false trails)
that i took figuring this out, i attached my 12-monkeys-style scribbles.
barret
--
You received this message because you are subscribed to the Google Groups
"Akaros" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
For more options, visit https://groups.google.com/d/optout.
Possibly related, while running a VM and doing weird shit with ssh:
Type 'help' for a list of commands.
HW TRAP frame at 0xfffffff00021ee60 on core 13
rax 0xfffffff00021ef28
rbx 0xffffffffc54d9248
rcx 0x0000000000000001
rdx 0x0000000092b24000
rbp 0xfffffff00021ef58
rsi 0xffff800092b24000
rdi 0x0000000000000000
r8 0x0000000000001d40
r9 0xfffffff00021eec0
r10 0x0000000000000018
r11 0xfffffff00021eee0
r12 0xffff8000061ca520
r13 0xfffffff00021ef28
r14 0x00000000ffffffff
r15 0x0000000000000000
trap 0x00000006 Invalid Opcode
gsbs 0xffffffffc5620e00
fsbs 0x0000000000000000
err 0x--------00000000
rip 0xffffffffc2135dab
cs 0x------------0008
flag 0x0000000000010003
rsp 0xfffffff00021ef28
ss 0x------------0010
ROS(Core 13)>
ROS(Core 13)> bt
Stack Backtrace on Core 13:
#01 [<0xffffffffc201ecd4>] in mon_backtrace
#02 [<0xffffffffc201fcd7>] in monitor
#03 [<0xffffffffc2130d11>] in trap
ROS(Core 13)> bt 0xffffffffc2135dab 0xfffffff00021ef58
Backtrace from instruction 0xffffffffc2135dab, with frame pointer
0xfffffff00021ef58
Backtrace of kernel context on Core 13:
#01 [<0xffffffffc2135dab>] in vmx_clear_vmcs
#02 [<0xffffffffc212e4c0>] in __abandon_core
#03 [<0xffffffffc2051532>] in proc_restartcore
that's this ud2:
static inline void __invept(int ext, uint64_t eptp, gpa_t gpa)
{
	struct {
		uint64_t eptp, gpa;
	} operand = {eptp, gpa};

	asm volatile (ASM_VMX_INVEPT
		      /* CF==1 or ZF==1 --> rc = -1 */
		      "; ja 1f ; ud2 ; 1:\n"
		      : : "a" (&operand), "c" (ext) : "cc", "memory");
}
ffffffffc213609c: b9 01 00 00 00 mov $0x1,%ecx
ffffffffc21360a1: 4c 89 e8 mov %r13,%rax
ffffffffc21360a4: 66 0f 38 80 08 invept (%rax),%rcx
so both of these still happen (strace triggers the panic from unload guest
pcore), and i can get the ud2
strace/perf doesn't always trigger it.
ud2 was hard. seems more reliable if we're spinning in the VM and then
we do something (like ctrl-c it)
for i in `seq 1 99999999999999`; do echo foo > /dev/null; done
- this does it reliably
ud2:
vmx_clear_vmcs should only clear if it thinks it has a GPC. that's
probably the fucked up part.
that came from the PERCPU_VAR(gpc-to-clear-to)
the __invept triggers a UD2, probably when one of those flags is set.
rfl is 0x0000000000010003
CF, RSVD (bit 1), RF (resume)
CF is set, which means FailInvalid
rax 0xfffffff000113f28 (rcx == 1)
ROS(Core 12)> kfunc hexdump 0xfffffff000113f28 128
fffffff000113f28: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
fffffff000113f38: 0c 00 00 00 f0 ff ff ff 0c 00 00 00 00 00 00 00 ................
fffffff000113f48: 11 00 00 00 00 00 00 00 68 2f 00 00 00 10 00 00 ........h/......
fffffff000113f58: 78 3f 11 00 f0 ff ff ff 90 e7 12 c2 ff ff ff ff x?..............
fffffff000113f68: ff ff ff ff 00 00 00 00 c0 fb cc c7 ff ff ff ff ................
fffffff000113f78: 98 3f 11 00 f0 ff ff ff c2 16 05 c2 ff ff ff ff .?..............
fffffff000113f88: b0 e8 9f ff 7f 7f 00 00 b0 e8 9f ff 7f 7f 00 00 ................
fffffff000113f98: 00 00 00 00 00 00 00 00 b0 39 13 c2 ff ff ff ff .........9......
note that is a stack addr (0xfffffff000000000)
RSP is 0xfffffff000113f28
rax is 0xfffffff000113f28
invept passes eptp and gpa as a struct on the stack, addr is in
rax
first field in the struct is the eptp (the struct's addr is in rax in the
asm, so it's only in memory), then the gpa, which happens to be 0
based on the asm, it looks like we passed null eptp
null eptp, gpc was 0xffff800008d83c20
null eptp, proc was 0xffff800007ece200
null eptp, pid was 275
null eptp, PCPU was 0xffff800008d83c20
that proc is vmrunkernel. we're looking for p->pgdir.eptp.
pgdir is embedded. did we somehow set it to 0 when killing the process, but
before decrefs?
that happens in __proc_free, which shouldn't happen
until all refs are dropped
do we decref before flushing ept?
no - we hold our ref until after clearing vmcs. unless
some weird shit is going on.
refcounting bug?
maybe we're storing refs in weird places with
vms / gpcs, got confused, and decreffed too many times. if we hadn't hit the
ud2, we'd panic in kref
yes - the ref is already 0. so someone fucked
up and decreffed the proc too much.
- makes sense that this happens when
we're killing the VM/process
k, so we have someone calling __abandon_core(pcpui->cur_proc), but the
ref is 0.
BT said it was proc_restartcore, which calls it when we don't
have an owning proc.
could happen if we unblocked a thread on a core the
process wasn't running on, completed the kthread/syscall, then tried to idle.
is this a VM problem, or can any process (like an MCP spinning?)
trigger this?
pthread_test didn't easily trigger it
might need to use some other services that vms use?
that ctrl-c is sent to DB, which does the killing.
the core that panics is an MCP core - could be any of them.
probably was the core that was spinning in the guest
it traps, takes the __death message, which finalizes the
context, clears owning proc, then returns via restartcore. that abandons,
since no owning core.
finalize -> unload_guest_pcore
XXX hey - if this is what happened, then what did
unload_guest_pcore from __death do? remove the GPC?
- no, with lazy unload, we clear some MSRs, but
the VMCS isn't unloaded.
- the pcore->gpc = -1. the gpc->core_id() is
our core, our core's PCPU is set to GPC.
- later, we call __abandon_core. that sees the
GPC (from PCPU), clears the VMCS, and attempts to clear EPT, but our refcnt is
already too low.
proc_decref isn't called in VM code - hardly any arch-specific code
given that, there might be a refcnt bug elsewhere? like
something with cores coming and going, kthreads restarting on top of other
processes, etc
though having it spin seems critical to trigger it.
does that imply a clobber? if someone decrefs too much, they
usually decref their ref. but for something like cur_proc, maybe we overwrite
it?
cur_proc: written in proc64.c, kthread.c, proc.c.
also look for 'current' (mon.c too)
maybe some interaction with switch_to and other code, before
switch_back? kth_save_addr_space?
what if i'm told to abandon_core during a
switch_to/switch_back?
switch_to, old p is returned, ref for new p is
stored in cur_proc
block or something. (kth_save?)
kth should get a refcnt'd copy of new_p
new_p is cur
abandon, decref new_p. now the new_p ref is in
the kth
unblock. load as current, shouldn't incref,
since we transfer the ref from the kth.
switch_back
return new_p, restore old_p.
seems OK.
unless kth_save_addr wasn't on (for
non-ktask)
is it possible to have an errant kthread->proc set? if we have
an uncounted ref there, but treat it as counted, etc...
looks like no. we're pretty good about zeroing it
last time we had this, i did some trace_printks for each decref
source.
there was a shitload of stuff to wade through. prob
esp bad the longer a process runs.
maybe turn it on an off? like right before hitting
ctl-c?
maybe make sure we didn't lose the ref already? (pip is
racy btw)
maybe a different tool? one that pairs the ref with
its source?
does it happen if it dies from the shell?
No...
which makes #cons more suspicious
that shit looks OK though.
we aren't actually killing from there - we send sigint via
event...
the diff is the process tries to kill itself via
proc_destroy, possibly on another core. though also maybe on the core that was
spinning
wake up, vc ctx, see the handler, do a
sys_proc_destroy
doesn't happen if we kill from the kernel
race with the process killing?
you can trigger it with kill from bash (signal)
so it seems like the process killing itself,
possibly from a sig handler, matters
does the constant start/stop of vcores also
matter? (could be the racy part. but also we always fail, not just on
occasion).
maybe - pthread_test didn't trigger it.
the VM might matter, due to the VMCS
unload stuff, but OTOH, this is a refcnt problem, only triggered by the invept
#alarm uses refcnts, and we're often doing
things with #alarm with the VMM 2LS.
if you're in the monitor, (trace syscalls) you have to leave,
so ssh/DB can do its thing (obviously)
but doesn't die via ssh ctrl-c if verbose syscall
tracing is on
implies there is a race where printing saves it
sometimes it does - at least with a quick trace
syscall start.
this trace is from launching kill 276 (vmrunkernel,
which was on vcore 2, pcore 5 right before (might have moved))
E [5518933.526522648]-[ 0.000000000] Syscall 101 ( write):(0xf,
0x400001856e0f, 0x13, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 5 vcore: 2 da'
X [5518933.526522648]-[5518933.543441149] Syscall 101 ( write):(0xf,
0x400001856e0f, 0x13, 0x0, 0x0, 0x0) ret: 0x13 proc: 276 core: 5 vcore: 2 data:
'0x002f0'
resetting an alarm
E [5518933.560444592]-[ 0.000000000] Syscall 30 ( poke_ksched):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 5 vcore: 2 data: ''
X [5518933.560444592]-[5518933.573971249] Syscall 30 ( poke_ksched):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 5 vcore: 2 data: ''
probably asked for more cores
E [5518933.573993291]-[ 0.000000000] Syscall 38 (vmm_poke_guest):(0x0,
0x0, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 14 vcore: 0 data: ''
inject IRQ, on new core
E [5518933.587522703]-[ 0.000000000] Syscall 101 ( write):(0xf,
0x400001856eff, 0x13, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 5 vcore: 2 data:
'0x000000'
alarm reset, old core
X [5518933.573993291]-[5518933.601315825] Syscall 38 (vmm_poke_guest):(0x0,
0x0, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 14 vcore: 0 data: ''
irq inject completed (might have sent an IPI to pcore 5, or wherever
GPC 0 was)
X [5518933.587522703]-[5518933.618227163] Syscall 101 ( write):(0xf,
0x400001856eff, 0x13, 0x0, 0x0, 0x0) ret: 0x13 proc: 276 core: 5 vcore: 2 data:
'0x00000'
alarm reset completed
E [5518933.632019549]-[ 0.000000000] Syscall 2 ( block):(0x186a0,
0x0, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 14 vcore: 0 data: ''
some sort of timer sleep. would block and return to VC ctx
E [5518933.649028297]-[ 0.000000000] Syscall 37 (
pop_ctx):(0x7f7fffa03800, 0x0, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 5
vcore: 2 data: ''
trying to enter the VM (pcore 5)
E [5518933.662998606]-[ 0.000000000] Syscall 13 ( proc_yield):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 14 vcore: 0 data: ''
the timer sleep vc returned and now yields
X [5518933.649028297]-[5518933.677488263] Syscall 37 (
pop_ctx):(0x7f7fffa03800, 0x0, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 27e6 core: 5
vcore: 2 data: ''
X [5518933.662998606]-[5518933.691111915] Syscall 13 ( proc_yield):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 14 vcore: 0 data: ''
X [5518933.632019549]-[5518933.763017413] Syscall 2 ( block):(0x186a0,
0x0, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 14 vcore: 0 data: ''
the timer woke up.
E [5518933.776992600]-[ 0.000000000] Syscall 101 ( write):(0xf,
0x400001856e0f, 0x13, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 5 vcore: 2 data:
'0x002f07'
X [5518933.776992600]-[5518933.793907500] Syscall 101 ( write):(0xf,
0x400001856e0f, 0x13, 0x0, 0x0, 0x0) ret: 0x13 proc: 276 core: 5 vcore: 2 data:
'0x002f0'
PC 5 probably got an event for the completion of sys_block(), then
decided to set its alarm
E [5518933.810901923]-[ 0.000000000] Syscall 30 ( poke_ksched):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 5 vcore: 2 data: ''
X [5518933.810901923]-[5518933.824437943] Syscall 30 ( poke_ksched):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 5 vcore: 2 data: ''
then asked for another vcore
E [5518933.824460692]-[ 0.000000000] Syscall 38 (vmm_poke_guest):(0x0,
0x0, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 9 vcore: 0 data: ''
here's the other vcore - doing a poke. this is probably the timer
inject thread
E [5518933.837985832]-[ 0.000000000] Syscall 101 ( write):(0xf,
0x400001856eff, 0x13, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 5 vcore: 2 data:
'0x000000'
X [5518933.824460692]-[5518933.851696265] Syscall 38 (vmm_poke_guest):(0x0,
0x0, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 9 vcore: 0 data: ''
X [5518933.837985832]-[5518933.868614011] Syscall 101 ( write):(0xf,
0x400001856eff, 0x13, 0x0, 0x0, 0x0) ret: 0x13 proc: x276 core: 5 vcore: 2
data: '0x0000'
E [5518933.882320368]-[ 0.000000000] Syscall 2 ( block):(0x186a0,
0x0, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 9 vcore: 0 data: ''
E [5518933.899411003]-[ 0.000000000] Syscall 37 (
pop_ctx):(0x7f7fffa03800, 0x0, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 5
vcore: 2 data: ''
E [5518933.913292787]-[ 0.000000000] Syscall 13 ( proc_yield):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 9 vcore: 0 data: ''
X [5518933.899411003]-[5518933.927781679] Syscall 37 (
pop_ctx):(0x7f7fffa03800, 0x0, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 5
vcore: 2 data: ''
X [5518933.913292787]-[5518933.941317484] Syscall 13 ( proc_yield):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 9 vcore: 0 data: ''
timer thread blocked again, VC yielded, PC 5 went back to running the
guest
iX [5518933.882320368]-[5518934.013312910] Syscall 2 ( block):(0x186a0,
0x0, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 9 vcore: 0 data: ''
an errant 'i' in the output. i might have been trying to type "exit\n\n\n", to
get out of the monitor and finish the "kill 276". there's a t and \ns below.
E [5518934.027196521]-[ 0.000000000] Syscall 101 ( write):(0xf,
0x400001856e0f, 0x13, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 5 vcore: 2 data:
'0x002f07'
X [5518934.027196521]-[5518934.044111341] Syscall 101 ( write):(0xf,
0x400001856e0f, 0x13, 0x0, 0x0, 0x0) ret: 0x13 proc: 276 core: 5 vcore: 2 data:
'0x002f0'
E [5518934.061108604]-[ 0.000000000] Syscall 30 ( poke_ksched):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 5 vcore: 2 data: ''
X [5518934.061108604]-[5518934.074644789] Syscall 30 ( poke_ksched):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 5 vcore: 2 data: ''
E [5518934.074666776]-[ 0.000000000] Syscall 38 (vmm_poke_guest):(0x0,
0x0, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 6 vcore: 0 data: ''
tE [5518934.088183109]-[ 0.000000000] Syscall 101 ( write):(0xf,
0x400001856eff, 0x13, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 5 vcore: 2 data:
'0x00000'
X [5518934.074666776]-[5518934.101893311] Syscall 38 (vmm_poke_guest):(0x0,
0x0, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 6 vcore: 0 data: ''
X [5518934.088183109]-[5518934.118887777] Syscall 101 ( write):(0xf,
0x400001856eff, 0x13, 0x0, 0x0, 0x0) ret: 0x13 proc: 276 core: 5 vcore: 2 data:
'0x00000'
E [5518934.132586014]-[ 0.000000000] Syscall 2 ( block):(0x186a0,
0x0, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 6 vcore: 0 data: ''
E [5518934.149588418]-[ 0.000000000] Syscall 37 (
pop_ctx):(0x7f7fffa03800, 0x0, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 5
vcore: 2 data: ''
E [5518934.163473611]-[ 0.000000000] Syscall 13 ( proc_yield):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 6 vcore: 0 data: ''
X [5518934.149588418]-[5518934.177957474] Syscall 37 (
pop_ctx):(0x7f7fffa03800, 0x0, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 5
vcore: 2 data: ''
X [5518934.163473611]-[5518934.191487899] Syscall 13 ( proc_yield):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 6 vcore: 0 data: ''
X [5518934.132586014]-[5518934.263492588] Syscall 2 ( block):(0x186a0,
0x0, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 6 vcore: 0 data: ''
E [5518934.277377040]-[ 0.000000000] Syscall 101 ( write):(0xf,
0x400001856e0f, 0x13, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 5 vcore: 2 data:
'0x002f07'
X [5518934.277377040]-[5518934.294298482] Syscall 101 ( write):(0xf,
0x400001856e0f, 0x13, 0x0, 0x0, 0x0) ret: 0x13 proc: 276 core: 5 vcore: 2 data:
'0x002f0'
E [5518934.311901351]-[ 0.000000000] Syscall 30 ( poke_ksched):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 5 vcore: 2 data: ''
E [5518934.320178047]-[ 0.000000000] Syscall 38 (vmm_poke_guest):(0x0,
0x0, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 13 vcore: 0 data: ''
X [5518934.311901351]-[5518934.325430378] Syscall 30 ( poke_ksched):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 5 vcore: 2 data: ''
X [5518934.320178047]-[5518934.339217278] Syscall 38 (vmm_poke_guest):(0x0,
0x0, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 13 vcore: 0 data: ''
E [5518934.352745834]-[ 0.000000000] Syscall 101 ( write):(0xf,
0x400001856eff, 0x13, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 5 vcore: 2 data:
'0x000000'
E [5518934.366530580]-[ 0.000000000] Syscall 2 ( block):(0x186a0,
0x0, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 13 vcore: 0 data: ''
X [5518934.352745834]-[5518934.383440612] Syscall 101 ( write):(0xf,
0x400001856eff, 0x13, 0x0, 0x0, 0x0) ret: 0x13 proc: 276 core: 5 vcore: 2 data:
'0x00000'
E [5518934.397416981]-[ 0.000000000] Syscall 13 ( proc_yield):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 13 vcore: 0 data: ''
E [5518934.414410718]-[ 0.000000000] Syscall 37 (
pop_ctx):(0x7f7fffa03800, 0x0, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 5
vcore: 2 data: ''
X [5518934.397416981]-[5518934.428022642] Syscall 13 ( proc_yield):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 13 vcore: 0 data: ''
X [5518934.414410718]-[5518934.442506898] Syscall 37 (
pop_ctx):(0x7f7fffa03800, 0x0, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 5
vcore: 2 data: ''
another loop of timerthread done
E [5518934.467562753]-[ 0.000000000] Syscall 101 ( write):(0x2,
0x400000558000, 0x1, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data:
'\012'
X [5518934.467562753]-[5518934.486022838] Syscall 101 ( write):(0x2,
0x400000558000, 0x1, 0x0, 0x0, 0x0) ret: 0x1 proc: 270 core: 0 vcore: 0 data:
'\012'
PID 270 here. that's the shell that DB launched. see some TC shit
later.
X [5518934.366530580]-[5518934.497435556] Syscall 2 ( block):(0x186a0,
0x0, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 13 vcore: 0 data: ''
VM timerthread waking up
E [5518934.500859799]-[ 0.000000000] Syscall 142 ( tcsetattr):(0x0, 0x1,
0x6e08a0, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data: ''
E [5518934.514814507]-[ 0.000000000] Syscall 101 ( write):(0xf,
0x400001856e0f, 0x13, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 5 vcore: 2 data:
'0x002f07'
doing the usual alarm dance
X [5518934.500859799]-[5518934.528766984] Syscall 142 ( tcsetattr):(0x0, 0x1,
0x6e08a0, 0x0, 0x0, 0x0) ret: 0x0 proc: 270 core: 0 vcore: 0 data: ''
X [5518934.514814507]-[5518934.545676053] Syscall 101 ( write):(0xf,
0x400001856e0f, 0x13, 0x0, 0x0, 0x0) ret: 0x13 proc: 276 core: 5 vcore: 2 data:
'0x002f0'
E [5518934.559674711]-[ 0.000000000] Syscall 25 ( notify):(0x114,
0xc, 0x7f7fff9fea80, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data: ''
here's the kill 276 (0x114 == 276, 0xc = EV_POSIX_SIGNAL)
E [5518934.576631750]-[ 0.000000000] Syscall 30 ( poke_ksched):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 5 vcore: 2 data: ''
VMM 2LS asking for more cores
X [5518934.559674711]-[5518934.591290179] Syscall 25 ( notify):(0x114,
0xc, 0x7f7fff9fea80, 0x0, 0x0, 0x0) ret: 0x0 proc: 270 core: 0 vcore: 0 data: ''
at this point, the notify has been sent to the VM
E [5518934.604827950]-[ 0.000000000] Syscall 38 (vmm_poke_guest):(0x0,
0x0, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 2 vcore: 0 data: ''
the VMM is running on two pcores. we saw 5 poking, and then pcore 2
just started up. one of them got the POSIX_SIGNAL
signal events are a UCQ + INDIR + SPAM_INDIR + IPI, etc. so
either PC 2 or 5 got the IPI. probably PC 2 (VC 0), if it was noticed being up
yet (racy)
X [5518934.576631750]-[5518934.604807318] Syscall 30 ( poke_ksched):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 5 vcore: 2 data: ''
PC 5 returned from asking for more VCs
E [5518934.619485393]-[ 0.000000000] Syscall 17 (
waitpid):(0xffffffffffffffff, 0x7f7fff9fdd8c, 0x1, 0x0, 0x0, 0x0) ret: ---
proc: 270 core: 0 vcore: 0 d'
this is something bb/ash does from time to time.
E [5518934.646699446]-[ 0.000000000] Syscall 12 (proc_destroy):(0x114,
0xf, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 276 core: 5 vcore: 2 data: ''
k, this is the response to the sig handler. looks like PC 5 got the
event after all, and decided to kill the process.
X [5518934.604827950]-[5518934.633158490] Syscall 38 (vmm_poke_guest):(0x0,
0x0, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 2 vcore: 0 data: ''
oh, and PC 2 was still trying to IPI/inject to the guest
X [5518934.619485393]-[5518934.662916501] Syscall 17 (
waitpid):(0xffffffffffffffff, 0x7f7fff9fdd8c, 0x1, 0x0, 0x0, 0x0) ret:
0xffffffff proc: 270 core: 0 vco'
bb/ash finished waiting
X [5518934.646699446]-[5518934.676639849] Syscall 12 (proc_destroy):(0x114,
0xf, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 276 core: 5 vcore: 2 data: ''
proc_destroy finished. that means we should have closed all of our
files, sent RKMs to all of our cores (including ourselves), and started to
abort all syscs
E [5518934.707148418]-[ 0.000000000] Syscall 13 ( proc_yield):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: --- proc: 272 core: 0 vcore: 0 data: ''
X [5518934.707148418]-[5518934.734343251] Syscall 13 ( proc_yield):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 272 core: 0 vcore: 0 data: ''
note that this is process 272, the parent of VMM. it might have been
woken up (not shown) then decided to yield. that's all we see.
note it is on PC 0 - it's an SCP (which is correct)
null eptp, gpc was 0xffff800009933ba0
this is the beginning of panic prints, later we see them to be on core
5, when our refcnt is 0.
down below, we see more prints, interleaved with other syscalls, and
eventually the Invalid Opcode trap (which is the invept)
k, so what now?
got some refcnt prints in there. but they don't help much,
since we know PC2 should have gotten a death message.
refcnt was 13 (11 idle, which still seems like 1 too many). 2 more for
running (owning and cur_proc on PC 5)
11 is probably ok. i read the blocked syscalls wrong (or timed
it poorly). (caught two reads on eventfd)
5 syscalls, 4 alarms, parent, self
PID 44 killing self, refcnt pre destroy 14
13 is good. +1 for sys_proc_destroy
PID 44 killing self, refcnt post destroy 9
lost 5 refs when we closed FDs.
PID 44 killing self, refcnt fin destroy 8
dropped our ref. 8 out there still.
what do we think each PC was doing?
PC 5:
E [5518934.747888412]-[ 0.000000000] Syscall 141 ( tcgetattr):(0x0,
0x7f7fff9fea20, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 272 core: 0 vcore: 0 data: ''
null eptp, proc was 0xffff800007ec8d00
X [5518934.747888412]-[5518934.766173783] Syscall 141 ( tcgetattr):(0x0,
0x7f7fff9fea20, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 272 core: 0 vcore: 0 data: ''
null eptp, pid was 276
E [5518934.784566228]-[ 0.000000000] Syscall 142 ( tcsetattr):(0x0, 0x0,
0x7f7fff9fea5c, 0x0, 0x0, 0x0) ret: --- proc: 272 core: 0 vcore: 0 data: ''
null eptp, PCPU was 0xffff800009933ba0
X [5518934.784566228]-[5518934.801564915] Syscall 142 ( tcsetattr):(0x0, 0x0,
0x7f7fff9fea5c, 0x0, 0x0, 0x0) ret: 0x0 proc: 272 core: 0 vcore: 0 data: ''
null eptp, ref was 0
E [5518934.819973976]-[ 0.000000000] Syscall 141 ( tcgetaEtr):(0x0,
0x7f7fff9fdd20, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 27@ring Nanwan's Dungeon on
Core 5 (I:
type 'help' for a list of commands.
0x0, 0x0) ret: --- proc: 27T core: 0 vcore: 0 data: ''
mmands.
0 TRAP frame at 0xfffffff00H [5518934.819973976]-[5518934.851453553] SyscX rax
0xfffffff00010df28
8934.851453553] Syscall 141 ( tcgeta tr):(0x0, 0x7f7fff9fdd20,t rbx
0xffff800009933ba0
0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 27 core: 0 vcore: 0 data: '0 rcx
0x0000000000000001
'
[5518934.877005159]-[ E rdx 0x000000000000c253
0.000000000] Syscall 142 ( tcseta tr):(0x0, 0x1, 0x7f7fff9ft rbp
0xfffffff00010df58
dd20, 0x0, 0x0, 0x0) ret: --- proc: 27 core: 0 vcore: 0 data: '0 rsi
0x0000000000000780
'
[5518934.877005159]-[551X rdi 0x000000000000000a
8934.900222775] Syscall 142 ( tcseta tr):(0x0, 0x1, 0x7f7fff9ft r8
0xffffffffc7cc8d00
dd20, 0x0, 0x0, 0x0) ret: 0x0 proc: 27 core: 0 vcore: 0 data: '0 r9
0x00000000000137e0
'
[5518934.923459879]-[ E r10 0xffffffffc7cc8c60
0.000000000] Syscall 117 ( get wd):(0x1000000053f0, 0xc0c r11
0xffffffffc7cc8c20
, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 2 2 core: 0 vcore: 0 data: 7 r12
0xffffffffc7b87bc8
''
[5518934.923459879]-[551X r13 0xfffffff00010df28
8934.946767566] Syscall 117 ( get wd):(0x1000000053f0, 0xc0c r14
0x00000000ffffffff
, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 2 2 core: 0 vcore: 0 data: 7 r15
0x0000000000000000
''
r5518934.970098414]-[ E trap 0x00000006 Invalid Opcode
0000000] Syscall 101 ( wr te):(0x1, 0x400000353000, 0x4, 0i gsbs
0xffffffffc7ccec00
0x4, 0x0, 0x0, 0x0) ret: --- proc: 27 core: 0 vcore: 0 data: '2 fsbs
0x0000000000000000
/ $ '
[5518934.970098414]-[551X err 0x--------00000000
8934.994264093] Syscall 101 ( wr te):(0x1, 0x400000353000,i rip
0xffffffffc21360ab
0x4, 0x0, 0x0, 0x0) ret: 0x4 proc: 27 core: 0 vcore: 0 data: '2 cs
0x------------0008
/ $ '
c5518935.017852056]-[ E flag 0x0000000000010003
0.000000000] Syscall 104 ( fs at):(0x0, 0x7f7fff9fe600,t rsp
0xfffffff00010df28
0x0, 0x0, 0x0, 0x0) ret: --- proc: 27 core: 0 vcore: 0 data: '2 ss
0x------------0010
'
[5518935.017852056]-[551XOS(Core 5)> 7852056]-[5518935.041055433] Syscall 104
( fsRat):(0x0, 0xtat):(0x0, 0x7f7fff9fe600, 0x0, 0x0, 0x0, 0x0) ret: 0x0
proc7
2 core: 0 vcore: 0 data: ''
2 [5518935.059972424]-[ 0.000000000]E [5518935.059972424]-[
0.000000000] Syscall 101 ( write):(0xe, 0x4000019a8d80, 0x4, 0x0, 0x0,
0x0) ret: --- pr'
X [5518935.059972424]-[5518935.078783031] Syscall 101 ( wrXte):(0xe,
0x4000019a8d80, 0x4, 0x0, 0x0, 0x0) ret: 0x4 proc: 27@Core 5)> ite):(0xe,
0x4000019a8d80'
E [5518935.101344461]-[ 0.000000000] Syscall 100 ( read):(0x0,
0x7f7fff9fea11, 0x1, 0x0, 0x0, 0x0) ret: --- proc: 27
2 core: 0 vcore: 0 data: ''
2 c5518935.116454475]-[ 0.000000000]E [5518935.116454475]-[
0.000000000] Syscall 100 ( read):(0xa, 0x4000019a8d80, 0x200, 0x0, 0x0,
0x0) ret: --- '
E [5518935.135091954]-[ 0.000000000] Syscall 13 ( proc_yield):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: --- proc: 272 core: 0 v
core: 0 data: ''
co[5518935.135091954]-[5518935X [5518935.135091954]-[5518935.149246089] Syscall
13 ( proc_yield):(0x0, 0x0, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 272 core: 0
vcore:'
E [5518935.165805826]-[ 0.000000000] Syscall 31 (
abort_sysc):(0x7f7fff9fe760, 0x0, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 27
1 core: 0 vcore: 0 data: ''
1 [5518935.165805826]-[5518935.180912680]X
[5518935.165805826]-[5518935.180912680] Syscall 31 (
abort_sysc):(0x7f7fff9fe760, 0x0, 0x0, 0x0, 0x0, 0x0) ret: 0x1 pr'
E [5518935.199376625]-[ 0.000000000] Syscall 101 ( write):(0x2,
0x400000558000, 0xa, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data:
'bash-4.3$'
bash-4.3$ ROS(Core 5)> ROS(Core 5)> X [5518935.199376625]-[5518935.216480420]
Syscall 101 ( write):(0x2, 0x400000558000, 0xa, 0x0, 0x0, 0x0) ret: 0xa
proc: 2'
E [5518935.234680557]-[ 0.000000000] Syscall 100 ( read):(0x0,
0x7f7fff9fdd2f, 0x1, 0x0, 0x0, 0x0) ret: --- proc: 27
0 core: 0 vcore: 0 data: ''
0 [5518935.234680557]-[5518935.249791870]X
[5518935.234680557]-[5518935.249791870] Syscall 100 ( read):(0x0,
0x7f7fff9fdd2f, 0x1, 0x0, 0x0, 0x0) ret: 0x1 pr'
E [5518935.268608194]-[ 0.000000000] Syscall 100 ( read):(0x4,
0x7f7fff9fe98f, 0x1, 0x0, 0x0, 0x0) ret: --- proc: 27
1 core: 0 vcore: 0 data: ''
1 [5518935.268608194]-[5518935.283717811]X
[5518935.268608194]-[5518935.283717811] Syscall 100 ( read):(0x4,
0x7f7fff9fe98f, 0x1, 0x0, 0x0, 0x0) ret: 0xffff'
E [5518935.303484443]-[ 0.000000000] Syscall 101 ( write):(0x2,
0x400000558000, 0x1, 0x0, 0x0, 0x0) ret: --- proc: 27
0 core: 0 vcore: 0 data: '\012'
0 core: 0 vcore: 0 data: '\012'
ROS(Core 5)>
X [5518935.303484443]-[5518935.323968799] Syscall 101 ( write):(0x2,
0x400000558000, 0x1, 0x0, 0x0, 0x0) ret: 0x1 proc: 270 core: 0 vcore: 0 data:
'\012'
E [5518935.338810560]-[ 0.000000000] Syscall 142 ( tcsetattr):(0x0, 0x1,
0x6e08a0, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data: ''
X [5518935.338810560]-[5518935.352773778] Syscall 142 ( tcsetattr):(0x0, 0x1,
0x6e08a0, 0x0, 0x0, 0x0) ret: 0x0 proc: 270 core: 0 vcore: 0 data: ''
E [5518935.366748239]-[ 0.000000000] Syscall 100 ( read):(0x6,
0x1000000187a8, 0x10, 0x0, 0x0, 0x0) ret: --- proc: 271 core: 0 vcore: 0 data:
''
X [5518935.366748239]-[5518935.381328778] Syscall 100 ( read):(0x6,
0x1000000187a8, 0x10, 0x0, 0x0, 0x0) ret: 0xffffffffffffffff proc: 271 core: 0
vcore: 0 '
E [5518935.397648877]-[ 0.000000000] Syscall 17 (
waitpid):(0xffffffffffffffff, 0x7f7fff9fdd8c, 0x1, 0x0, 0x0, 0x0) ret: ---
proc: 270 core: 0 vcore: 0 d'
X [5518935.397648877]-[5518935.413874984] Syscall 17 (
waitpid):(0xffffffffffffffff, 0x7f7fff9fdd8c, 0x1, 0x0, 0x0, 0x0) ret:
0xffffffff proc: 270 core: 0 vco'
E [5518935.430723542]-[ 0.000000000] Syscall 141 ( tcgetattr):(0x0,
0x7f7fff9fdd20, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data: ''
X [5518935.430723542]-[5518935.445212738] Syscall 141 ( tcgetattr):(0x0,
0x7f7fff9fdd20, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 270 core: 0 vcore: 0 data: ''
E [5518935.459695904]-[ 0.000000000] Syscall 142 ( tcsetattr):(0x0, 0x1,
0x7f7fff9fdd20, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data: ''
X [5518935.459695904]-[5518935.474178990] Syscall 142 ( tcsetattr):(0x0, 0x1,
0x7f7fff9fdd20, 0x0, 0x0, 0x0) ret: 0x0 proc: 270 core: 0 vcore: 0 data: ''
E [5518935.488667105]-[ 0.000000000] Syscall 100 ( read):(0xd,
0x10000000da91, 0x3ff7, 0x0, 0x0, 0x0) ret: --- proc: 271 core: 0 vcore: 0
data: ''
X [5518935.488667105]-[5518935.503416746] Syscall 100 ( read):(0xd,
0x10000000da91, 0x3ff7, 0x0, 0x0, 0x0) ret: 0x4 proc: 271 core: 0 vcore: 0
data: '/ $ '
E [5518935.518959357]-[ 0.000000000] Syscall 101 ( write):(0x2,
0x400000558000, 0xa, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data:
'bash-4.3$'
bash-4.3$ X [5518935.518959357]-[5518935.535475271] Syscall 101 (
write):(0x2, 0x400000558000, 0xa, 0x0, 0x0, 0x0) ret: 0xa proc: 270 core: 0
vcore: 0 data: '
E [5518935.551424483]-[ 0.000000000] Syscall 100 ( read):(0x0,
0x7f7fff9fdd2f, 0x1, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data: ''
X [5518935.551424483]-[5518935.565917718] Syscall 100 ( read):(0x0,
0x7f7fff9fdd2f, 0x1, 0x0, 0x0, 0x0) ret: 0x1 proc: 270 core: 0 vcore: 0 data:
'\012'
E [5518935.580752716]-[ 0.000000000] Syscall 101 ( write):(0x2,
0x400000558000, 0x1, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data:
'\012'
X [5518935.580752716]-[5518935.596200654] Syscall 101 ( write):(0x2,
0x400000558000, 0x1, 0x0, 0x0, 0x0) ret: 0x1 proc: 270 core: 0 vcore: 0 data:
'\012'
E [5518935.611036352]-[ 0.000000000] Syscall 142 ( tcsetattr):(0x0, 0x1,
0x6e08a0, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data: ''
X [5518935.611036352]-[5518935.625000363] Syscall 142 ( tcsetattr):(0x0, 0x1,
0x6e08a0, 0x0, 0x0, 0x0) ret: 0x0 proc: 270 core: 0 vcore: 0 data: ''
E [5518935.638974329]-[ 0.000000000] Syscall 17 (
waitpid):(0xffffffffffffffff, 0x7f7fff9fdd8c, 0x1, 0x0, 0x0, 0x0) ret: ---
proc: 270 core: 0 vcore: 0 d'
X [5518935.638974329]-[5518935.655195625] Syscall 17 (
waitpid):(0xffffffffffffffff, 0x7f7fff9fdd8c, 0x1, 0x0, 0x0, 0x0) ret:
0xffffffff proc: 270 core: 0 vco'
E [5518935.672036076]-[ 0.000000000] Syscall 141 ( tcgetattr):(0x0,
0x7f7fff9fdd20, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data: ''
X [5518935.672036076]-[5518935.686515827] Syscall 141 ( tcgetattr):(0x0,
0x7f7fff9fdd20, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 270 core: 0 vcore: 0 data: ''
E [5518935.701011106]-[ 0.000000000] Syscall 142 ( tcsetattr):(0x0, 0x1,
0x7f7fff9fdd20, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data: ''
X [5518935.701011106]-[5518935.715495080] Syscall 142 ( tcsetattr):(0x0, 0x1,
0x7f7fff9fdd20, 0x0, 0x0, 0x0) ret: 0x0 proc: 270 core: 0 vcore: 0 data: ''
E [5518935.730014422]-[ 0.000000000] Syscall 101 ( write):(0x6,
0x1000000188c8, 0x40, 0x0, 0x0, 0x0) ret: --- proc: 271 core: 0 vcore: 0 data:
'}\000YW\'
X [5518935.730014422]-[5518935.761607260] Syscall 101 ( write):(0x6,
0x1000000188c8, 0x40, 0x0, 0x0, 0x0) ret: 0x40 proc: 271 core: 0 vcore: 0 data:
'}\000YW'
E [5518935.793312557]-[ 0.000000000] Syscall 101 ( write):(0x2,
0x400000558000, 0xa, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data:
'bash-4.3$'
bash-4.3$ X [5518935.793312557]-[5518935.809826475] Syscall 101 (
write):(0x2, 0x400000558000, 0xa, 0x0, 0x0, 0x0) ret: 0xa proc: 270 core: 0
vcore: 0 data: '
E [5518935.825766138]-[ 0.000000000] Syscall 100 ( read):(0x0,
0x7f7fff9fdd2f, 0x1, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data: ''
X [5518935.825766138]-[5518935.840251689] Syscall 100 ( read):(0x0,
0x7f7fff9fdd2f, 0x1, 0x0, 0x0, 0x0) ret: 0x1 proc: 270 core: 0 vcore: 0 data:
'\012'
E [5518935.855087191]-[ 0.000000000] Syscall 101 ( write):(0x2,
0x400000558000, 0x1, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data:
'\012'
X [5518935.855087191]-[5518935.870532152] Syscall 101 ( write):(0x2,
0x400000558000, 0x1, 0x0, 0x0, 0x0) ret: 0x1 proc: 270 core: 0 vcore: 0 data:
'\012'
E [5518935.885367421]-[ 0.000000000] Syscall 104 ( fstat):(0x4,
0x7f7fff9fe8b0, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 271 core: 0 vcore: 0 data: ''
X [5518935.885367421]-[5518935.899855854] Syscall 104 ( fstat):(0x4,
0x7f7fff9fe8b0, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 271 core: 0 vcore: 0 data: ''
E [5518935.914346588]-[ 0.000000000] Syscall 104 ( fstat):(0x6,
0x7f7fff9fe8b0, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 271 core: 0 vcore: 0 data: ''
X [5518935.914346588]-[5518935.928839820] Syscall 104 ( fstat):(0x6,
0x7f7fff9fe8b0, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 271 core: 0 vcore: 0 data: ''
E [5518935.943337892]-[ 0.000000000] Syscall 104 ( fstat):(0xd,
0x7f7fff9fe8b0, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 271 core: 0 vcore: 0 data: ''
X [5518935.943337892]-[5518935.957827629] Syscall 104 ( fstat):(0xd,
0x7f7fff9fe8b0, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 271 core: 0 vcore: 0 data: ''
E [5518935.972310629]-[ 0.000000000] Syscall 142 ( tcsetattr):(0x0, 0x1,
0x6e08a0, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data: ''
X [5518935.972310629]-[5518935.986267744] Syscall 142 ( tcsetattr):(0x0, 0x1,
0x6e08a0, 0x0, 0x0, 0x0) ret: 0x0 proc: 270 core: 0 vcore: 0 data: ''
E [5518936.000254923]-[ 0.000000000] Syscall 2 (
block):(0x5f8df80, 0x100000002430, 0x7f7fff9fe760, 0x0, 0x7f7fff9fe760, 0x0)
ret: --- proc: 271 core:'
E [5518936.017614094]-[ 0.000000000] Syscall 17 (
waitpid):(0xffffffffffffffff, 0x7f7fff9fdd8c, 0x1, 0x0, 0x0, 0x0) ret: ---
proc: 270 core: 0 vcore: 0 d'
X [5518936.017614094]-[5518936.033850018] Syscall 17 (
waitpid):(0xffffffffffffffff, 0x7f7fff9fdd8c, 0x1, 0x0, 0x0, 0x0) ret:
0xffffffff proc: 270 core: 0 vco'
E [5518936.050696334]-[ 0.000000000] Syscall 141 ( tcgetattr):(0x0,
0x7f7fff9fdd20, 0x0, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data: ''
X [5518936.050696334]-[5518936.065183237] Syscall 141 ( tcgetattr):(0x0,
0x7f7fff9fdd20, 0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 270 core: 0 vcore: 0 data: ''
E [5518936.079670791]-[ 0.000000000] Syscall 142 ( tcsetattr):(0x0, 0x1,
0x7f7fff9fdd20, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data: ''
X [5518936.079670791]-[5518936.094152716] Syscall 142 ( tcsetattr):(0x0, 0x1,
0x7f7fff9fdd20, 0x0, 0x0, 0x0) ret: 0x0 proc: 270 core: 0 vcore: 0 data: ''
E [5518936.108641989]-[ 0.000000000] Syscall 13 ( proc_yield):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: --- proc: 271 core: 0 vcore: 0 data: ''
X [5518936.108641989]-[5518936.122171235] Syscall 13 ( proc_yield):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 271 core: 0 vcore: 0 data: ''
E [5518936.135710485]-[ 0.000000000] Syscall 101 ( write):(0x2,
0x400000558000, 0xa, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data:
'bash-4.3$'
bash-4.3$ X [5518936.135710485]-[5518936.152201610] Syscall 101 (
write):(0x2, 0x400000558000, 0xa, 0x0, 0x0, 0x0) ret: 0xa proc: 270 core: 0
vcore: 0 data: '
E [5518936.168133443]-[ 0.000000000] Syscall 100 ( read):(0x0,
0x7f7fff9fdd2f, 0x1, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data: ''
E [5518936.182623493]-[ 0.000000000] Syscall 13 ( proc_yield):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: --- proc: 270 core: 0 vcore: 0 data: ''
X [5518936.182623493]-[5518936.196158224] Syscall 13 ( proc_yield):(0x0, 0x0,
0x0, 0x0, 0x0, 0x0) ret: 0x0 proc: 270 core: 0 vcore: 0 data: ''
ROS(Core 5)>
refcnts:
Program name: vmrunkernel
PID: 274
PPID: 270
State: WAITING (0x0000000000000008)
Is an MCP
Refcnt: 11
Open Files:
FD: 00, Chan flags: 0x0000000000000001, pathname: #pipe./data1, ref: 8,
Dev: pipe, Devinfo: Qdata1, ID 5, tapped, rq len 0, wq len 0
FD: 01, Chan flags: 0x0000000000000001, pathname: #pipe./data1, ref: 8,
Dev: pipe, Devinfo: Qdata1, ID 5, tapped, rq len 0, wq len 0
FD: 02, Chan flags: 0x0000000000000001, pathname: #pipe./data1, ref: 8,
Dev: pipe, Devinfo: Qdata1, ID 5, tapped, rq len 0, wq len 0
FD: 03, Chan flags: 0x0000000000080001, pathname: #alarm./clone, ref:
1, Dev: alarm, Devinfo: Id 0, tapped, expires 12345, period 0, count 0
FD: 04, Chan flags: 0x0000000000080001, pathname: #alarm./a0/timer,
ref: 2, Dev: alarm, Devinfo: Id 0, tapped, expires 12345, period 0, count 0
FD: 05, Chan flags: 0x0000000000008001, pathname: #pipe., ref: 1, Dev:
pipe, Devinfo: Qdir, ID 12
FD: 06, Chan flags: 0x0000000000080001, pathname: #alarm./clone, ref:
1, Dev: alarm, Devinfo: Id 1, tapped, expires 13041036611853105, period 0,
count 0
FD: 07, Chan flags: 0x0000000000080001, pathname: #alarm./a1/timer,
ref: 2, Dev: alarm, Devinfo: Id 1, tapped, expires 13041036611853105, period 0,
count 0
FD: 08, Chan flags: 0x0000000000008001, pathname: #pipe., ref: 3, Dev:
pipe, Devinfo: Qdir, ID 5
FD: 09, Chan flags: 0x0000000000000801, pathname: #pipe./data, ref: 1,
Dev: pipe, Devinfo: Qdata0, ID 12, untapped, rq len 0, wq len 2804
FD: 10, Chan flags: 0x0000000000080001, pathname: #alarm./clone, ref:
1, Dev: alarm, Devinfo: Id 2, tapped, expires 13041002838964331, period 0,
count 0
FD: 11, Chan flags: 0x0000000000100000, pathname: #srv./snoop-274, ref:
1, Dev: srv, Devinfo: qid.path: 0x0000000000000002, qid.type: 01
FD: 12, Chan flags: 0x0000000000008001, pathname: #pipe., ref: 3, Dev:
pipe, Devinfo: Qdir, ID 6
FD: 13, Chan flags: 0x0000000000080001, pathname: #alarm./clone, ref:
1, Dev: alarm, Devinfo: Id 3, tapped, expires 13041036851838980, period 0,
count 0
FD: 14, Chan flags: 0x0000000000000801, pathname: /net/tcp/2/data, ref:
2, Dev: ip, Devinfo: Qdata, tapped, proto tcp, conv idx 2, rq len 0, wq len 0
FD: 15, Chan flags: 0x0000000000080001, pathname: #alarm./a2/timer,
ref: 2, Dev: alarm, Devinfo: Id 2, tapped, expires 13041002838964331, period 0,
count 0
FD: 16, Chan flags: 0x0000000000080001, pathname: #alarm./a3/timer,
ref: 2, Dev: alarm, Devinfo: Id 3, tapped, expires 13041036851838980, period 0,
count 0
FD: 17, Chan flags: 0x0000000000000001, pathname: #eventf/efd, ref: 1,
Dev: eventfd, Devinfo: QID type efd, flags 0x0000000000000000, counter
0x000000000003
FD: 18, Chan flags: 0x0000000000000001, pathname: #eventf/efd, ref: 2,
Dev: eventfd, Devinfo: QID type efd, flags 0x0000000000000000, counter
0x000000000000
FD: 19, Chan flags: 0x0000000000000001, pathname: #eventf/efd, ref: 1,
Dev: eventfd, Devinfo: QID type efd, flags 0x0000000000000000, counter
0x000000000000
FD: 20, Chan flags: 0x0000000000000001, pathname: #eventf/efd, ref: 2,
Dev: eventfd, Devinfo: QID type efd, flags 0x0000000000000000, counter
0x000000000000
Children: (PID (struct proc *))
which refs should we have?
4 syscalls: (refs)
WRONG - saw 5 syscalls
-----------
read on FD 0
read on FD 18
block 100 ms
block 30 sec
4 #alarms (ID 0, 1, 2, 3)
parent
ksched
find_first_kid in #cons
Who's doing the decrefs?
76 killing self, refcnt pre destroy 14
13 + 1 for proc_destroy
now we called proc_destroy.
Core 1 about to decref PID 276's ref, was 13
core 1, 13->12
Core 0 about to decref PID 276's ref, was 14
core 0 did it first, 14 -> 13
one was the alarm, the other the MCP ksched
destroy was closing FDs, should be core 1. so core 0 must have gotten
an RKM
k, so the ksched saw us. since it's an RKM and on core 0, it
looks like a timer tick went off while we were dying.
Stack Backtrace on Core 1:
#01 [<0xffffffffc204f940>] in proc_decref
#02 [<0xffffffffc20fc540>] in alarm_release
#03 [<0xffffffffc203cf58>] in chan_release
#04 [<0xffffffffc203c8ab>] in kref_put
#05 [<0xffffffffc200a5c5>] in tap_min_release
#06 [<0xffffffffc205f4ab>] in kref_put
#07 [<0xffffffffc2063e32>] in close_fdt
#08 [<0xffffffffc2051451>] in proc_destroy
#09 [<0xffffffffc205a27c>] in sys_proc_destroy
#10 [<0xffffffffc205c801>] in syscall
#11 [<0xffffffffc205caf8>] in run_local_syscall
#12 [<0xffffffffc2131b7a>] in sysenter_callwrapper
1 alarm
# manually constructed this, from core 0 (interleaved)
Stack Backtrace on Core 0:
#01 [<0xffffffffc204f940>] in proc_decref
#02 [<0xffffffffc2054892>] in __run_mcp_ksched
#03 [<0xffffffffc2003995>] in poke
#04 [<0xffffffffc2054e12>] in run_scheduler
#05 [<0xffffffffc2054e59>] in __ksched_tick
#06 [<0xffffffffc20000c4>] in __run_awaiter
#07 [<0xffffffffc205d8ef>] in process_routine_kmsg
#08 [<0xffffffffc2056e63>] in __smp_idle
Core 1 about to decref PID 276's ref, was 12
Stack Backtrace on Core 1:
#01 [<0xffffffffc204f940>] in proc_decref
#02 [<0xffffffffc20fc540>] in alarm_release
#03 [<0xffffffffc203cf58>] in chan_release
#04 [<0xffffffffc203c8ab>] in kref_put
#05 [<0xffffffffc200a5c5>] in tap_min_release
#06 [<0xffffffffc205f4ab>] in kref_put
#07 [<0xffffffffc2063e32>] in close_fdt
#08 [<0xffffffffc2051451>] in proc_destroy
#09 [<0xffffffffc205a27c>] in sys_proc_destroy
#10 [<0xffffffffc205c801>] in syscall
#11 [<0xffffffffc205caf8>] in run_local_syscall
#12 [<0xffffffffc2131b7a>] in sysenter_callwrapper
2 alarms
Core 1 about to decref PID 276's ref, was 11
Stack Backtrace on Core 1:
#01 [<0xffffffffc204f940>] in proc_decref
#02 [<0xffffffffc20fc540>] in alarm_release
#03 [<0xffffffffc203cf58>] in chan_release
#04 [<0xffffffffc203c8ab>] in kref_put
#05 [<0xffffffffc200a5c5>] in tap_min_release
tap - the alarm was tapped. closing the FD closes the tap, which
closes its chan. since closing the tap happens after the FD chan, tapped FDs
should close like this.
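the chained release above (close the FD's chan, which puts the tap, which puts the tap's chan, which drops a proc ref) can be sketched with a simplified kref. everything here is a stand-in (close_tapped_fd, tap_release, etc. are invented for illustration, not the real Akaros functions):

```c
#include <assert.h>

/* Simplified kref: a counter plus a release callback, fired at zero. */
struct kref {
	int cnt;
	void (*release)(struct kref *k);
};

static void kref_put(struct kref *k)
{
	if (--k->cnt == 0)
		k->release(k);
}

static int proc_refs = 14;	/* stand-in for the proc's refcount */

static void chan_release(struct kref *k)
{
	/* A chan on #alarm holds a proc ref (the alarm_release frames in
	 * the backtraces); releasing the chan drops it. */
	proc_refs--;
}

static struct kref tap_chan = { 1, chan_release };

static void tap_release(struct kref *k)
{
	/* Closing the tap drops its reference on the underlying chan. */
	kref_put(&tap_chan);
}

static struct kref tap = { 1, tap_release };

/* close_fdt for one tapped FD: after the FD's chan, put the tap,
 * which cascades down to the proc decref. */
static void close_tapped_fd(void)
{
	kref_put(&tap);
}
```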
#06 [<0xffffffffc205f4ab>] in kref_put
#07 [<0xffffffffc2063e32>] in close_fdt
#08 [<0xffffffffc2051451>] in proc_destroy
#09 [<0xffffffffc205a27c>] in sys_proc_destroy
#10 [<0xffffffffc205c801>] in syscall
#11 [<0xffffffffc205caf8>] in run_local_syscall
#12 [<0xffffffffc2131b7a>] in sysenter_callwrapper
3 alarms
Core 1 about to decref PID 276's ref, was 10
Stack Backtrace on Core 1:
#01 [<0xffffffffc204f940>] in proc_decref
#02 [<0xffffffffc20fc540>] in alarm_release
#03 [<0xffffffffc203cf58>] in chan_release
#04 [<0xffffffffc203c8ab>] in kref_put
#05 [<0xffffffffc200a5c5>] in tap_min_release
#06 [<0xffffffffc205f4ab>] in kref_put
#07 [<0xffffffffc2063e32>] in close_fdt
#08 [<0xffffffffc2051451>] in proc_destroy
#09 [<0xffffffffc205a27c>] in sys_proc_destroy
#10 [<0xffffffffc205c801>] in syscall
#11 [<0xffffffffc205caf8>] in run_local_syscall
#12 [<0xffffffffc2131b7a>] in sysenter_callwrapper
4 alarms closed. plus ksched. down to 9 (from 14).
Core 1 about to decref PID 276's ref, was 9
Stack Backtrace on Core 1:
#01 [<0xffffffffc204f940>] in proc_decref
#02 [<0xffffffffc2051474>] in proc_destroy
#03 [<0xffffffffc205a27c>] in sys_proc_destroy
#04 [<0xffffffffc205c801>] in syscall
#05 [<0xffffffffc205caf8>] in run_local_syscall
#06 [<0xffffffffc2131b7a>] in sysenter_callwrapper
wtf is this decref?
looks like sched_proc_destroy. (that #2 in proc_destroy is
right after __sched_proc_destroy, before signal_parent)
yeah, that calls proc_decref right at the end.
but we already closed the ksched ref...
looks like we have two decrefs in the ksched. the comments say that
shouldn't happen, but it looks like it does.
let's see the others real quick
for the race, you'd need to proc_destroy yourself as a running
MCP
why does the VM trigger it every time? is it because
of all of the FDs open? (the race window runs from setting DYING to locking the
ksched, and it widens with the number and complexity of open files (more tapped
files, etc.))
- that makes the race more likely
since this actually isn't the bug, it might just be
a red herring.
wait - this is OK. that decref we saw was from the MCP ksched, but the
ref wasn't the ksched's ref, it was the one we grabbed 10 lines up
which wasn't printed.
we didn't see the incref for that though.
PID 276 killing self, refcnt post destroy 8
Core 1 about to decref PID 276's ref, was 8
Stack Backtrace on Core 1:
#01 [<0xffffffffc204f940>] in proc_decref
#02 [<0xffffffffc205a291>] in sys_proc_destroy
#03 [<0xffffffffc205c801>] in syscall
#04 [<0xffffffffc205caf8>] in run_local_syscall
#05 [<0xffffffffc2131b7a>] in sysenter_callwrapper
this was the ref in sys_proc_destroy
PID 276 killing self, refcnt fin destroy 7
PID 276 killing self, refcnt wait destroy 7
PID 276 killing self, refcnt wait destroy 7
PID 276 killing self, refcnt wait destroy 7
PID 276 killing self, refcnt wait destroy 7
PID 276 killing self, refcnt wait destroy 7
2 for my core, 5 syscalls, parent. whoops.
Core 1 about to decref PID 276's ref, was 7
Stack Backtrace on Core 1:
#01 [<0xffffffffc204f940>] in proc_decref
#02 [<0xffffffffc205d8ef>] in process_routine_kmsg
#03 [<0xffffffffc2051735>] in proc_restartcore
in a kmsg. should be __death, clearing owning_proc
Core 1 about to decref PID 276's ref, was 6
Stack Backtrace on Core 1:
#01 [<0xffffffffc204f940>] in proc_decref
#02 [<0xffffffffc2018e89>] in restart_kthread
#03 [<0xffffffffc2018eef>] in __launch_kthread
#04 [<0xffffffffc205d8ef>] in process_routine_kmsg
#05 [<0xffffffffc2051735>] in proc_restartcore
this looks like an aborted syscall (5 of them)
Core 1 about to decref PID 276's ref, was 5
Stack Backtrace on Core 1:
#01 [<0xffffffffc204f940>] in proc_decref
#02 [<0xffffffffc2018e89>] in restart_kthread
#03 [<0xffffffffc2018eef>] in __launch_kthread
#04 [<0xffffffffc205d8ef>] in process_routine_kmsg
#05 [<0xffffffffc2051735>] in proc_restartcore
Core 1 about to decref PID 276's ref, was 4
Stack Backtrace on Core 1:
#01 [<0xffffffffc204f940>] in proc_decref
#02 [<0xffffffffc2018e89>] in restart_kthread
#03 [<0xffffffffc2018eef>] in __launch_kthread
#04 [<0xffffffffc205d8ef>] in process_routine_kmsg
#05 [<0xffffffffc2051735>] in proc_restartcore
Core 1 about to decref PID 276's ref, was 3
Stack Backtrace on Core 1:
#01 [<0xffffffffc204f940>] in proc_decref
#02 [<0xffffffffc2018e89>] in restart_kthread
#03 [<0xffffffffc2018eef>] in __launch_kthread
#04 [<0xffffffffc205d8ef>] in process_routine_kmsg
#05 [<0xffffffffc2051735>] in proc_restartcore
Core 1 about to decref PID 276's ref, was 2
Stack Backtrace on Core 1:
#01 [<0xffffffffc204f940>] in proc_decref
#02 [<0xffffffffc2018e38>] in restart_kthread
#03 [<0xffffffffc2018eef>] in __launch_kthread
#04 [<0xffffffffc205d8ef>] in process_routine_kmsg
#05 [<0xffffffffc2051735>] in proc_restartcore
Core 1 about to decref PID 276's ref, was 1
Stack Backtrace on Core 1:
#01 [<0xffffffffc204f940>] in proc_decref
#02 [<0xffffffffc2050134>] in __proc_disown_child
#03 [<0xffffffffc2059bba>] in try_wait.isra.4
#04 [<0xffffffffc2059c33>] in try_wait_any.isra.6
#05 [<0xffffffffc2059d9b>] in sys_waitpid
#06 [<0xffffffffc205c801>] in syscall
#07 [<0xffffffffc205caf8>] in run_local_syscall
#08 [<0xffffffffc2131b7a>] in sysenter_callwrapper
this is the parent.
ok, let's try again:
PID 272 core 6 killing self, refcnt pre destroy 14
bash-4.3$
14 refs, sounds right normally. 5 syscalls, 4 alarms, parent, ksched,
2 for core 6, one for sys_proc_destroy.
assuming the MCP sched isn't running.
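as a sanity check, that ledger adds up. a trivial sketch of the expected-ref accounting (the breakdown is from the notes above; the helper name is invented):

```c
#include <assert.h>

/* Expected refcount ledger for a dying MCP, per the notes above. */
static int expected_refs(void)
{
	int syscalls = 5;	/* blocked syscalls holding kthread refs */
	int alarms = 4;		/* tapped #alarm FDs (IDs 0-3) */
	int parent = 1;		/* parent's child ref */
	int ksched = 1;		/* MCP scheduler's ref */
	int owning_core = 2;	/* cur_proc + owning_proc on the core */
	int destroy = 1;	/* sys_proc_destroy's own ref */

	return syscalls + alarms + parent + ksched + owning_core + destroy;
}
```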
Core 6 about to decref PID 272's ref, was 14
Stack Backtrace on Core 6:
#01 [<0xffffffffc204f9d2>] in proc_decref
#02 [<0xffffffffc20fc5d0>] in alarm_release
#03 [<0xffffffffc203cf58>] in chan_release
#04 [<0xffffffffc203c8ab>] in kref_put
#05 [<0xffffffffc200a5c5>] in tap_min_release
#06 [<0xffffffffc205f53b>] in kref_put
#07 [<0xffffffffc2063ec2>] in close_fdt
#08 [<0xffffffffc20514d1>] in proc_destroy
#09 [<0xffffffffc205a2fc>] in sys_proc_destroy
#10 [<0xffffffffc205c891>] in syscall
#11 [<0xffffffffc205cb88>] in run_local_syscall
#12 [<0xffffffffc2131c0a>] in sysenter_callwrapper
1 alarm
Core 0 about to incref PID 272's ref, was 13
Stack Backtrace on Core 0:
#01 [<0xffffffffc204f01f>] in proc_incref
#02 [<0xffffffffc2054613>] in __run_mcp_ksched
#03 [<0xffffffffc2003995>] in poke
#04 [<0xffffffffc2054e92>] in run_scheduler
#05 [<0xffffffffc2054ed9>] in __ksched_tick
#06 [<0xffffffffc20000c4>] in __run_awaiter
#07 [<0xffffffffc205d97f>] in process_routine_kmsg
#08 [<0xffffffffc2056ee3>] in __smp_idle
incref in mcp ksched. back up to 14.
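that incref/decref pair fits the usual grab-a-ref pattern: the poked scheduler takes its own reference before operating on the proc, so the proc can't be freed out from under it even while DYING drains the other refs. a hedged sketch of the pattern (sched_pass and these fields are stand-ins, not the real __run_mcp_ksched):

```c
#include <assert.h>

struct proc {
	int refcnt;
	int dying;
};

static void proc_incref(struct proc *p) { p->refcnt++; }
static void proc_decref(struct proc *p) { p->refcnt--; }

/* One scheduler pass over a proc: pin it with a ref for the duration. */
static void sched_pass(struct proc *p)
{
	proc_incref(p);		/* the "back up to 14" incref */
	/* ... allocate/revoke cores for p, skip it if p->dying ... */
	proc_decref(p);		/* the later __run_mcp_ksched decref */
}
```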
Core 6 about to decref PID 272's ref, was 14
Stack Backtrace on Core 6:
#01 [<0xffffffffc204f9d2>] in proc_decref
#02 [<0xffffffffc20fc5d0>] in alarm_release
#03 [<0xffffffffc203cf58>] in chan_release
#04 [<0xffffffffc203c8ab>] in kref_put
#05 [<0xffffffffc200a5c5>] in tap_min_release
#06 [<0xffffffffc205f53b>] in kref_put
#07 [<0xffffffffc2063ec2>] in close_fdt
#08 [<0xffffffffc20514d1>] in proc_destroy
#09 [<0xffffffffc205a2fc>] in sys_proc_destroy
#10 [<0xffffffffc205c891>] in syscall
#11 [<0xffffffffc205cb88>] in run_local_syscall
#12 [<0xffffffffc2131c0a>] in sysenter_callwrapper
2 alarm
Core 8 about to decref PID 272's ref, was 13
Stack Backtrace on Core 8:
#01 [<0xffffffffc204f9d2>] in proc_decref
#02 [<0xffffffffc212ea07>] in __abandon_core
#03 [<0xffffffffc2051812>] in proc_restartcore
core 8 decref, cur_proc! (unexpected)
Core 6 about to decref PID 272's ref, was 12
Stack Backtrace on Core 6:
#01 [<0xffffffffc204f9d2>] in proc_decref
#02 [<0xffffffffc20fc5d0>] in alarm_release
#03 [<0xffffffffc203cf58>] in chan_release
#04 [<0xffffffffc203c8ab>] in kref_put
#05 [<0xffffffffc200a5c5>] in tap_min_release
#06 [<0xffffffffc205f53b>] in kref_put
#07 [<0xffffffffc2063ec2>] in close_fdt
#08 [<0xffffffffc20514d1>] in proc_destroy
#09 [<0xffffffffc205a2fc>] in sys_proc_destroy
#10 [<0xffffffffc205c891>] in syscall
#11 [<0xffffffffc205cb88>] in run_local_syscall
#12 [<0xffffffffc2131c0a>] in sysenter_callwrapper
3 alarm
Core 0 about to decref PID 272's ref, was 11
Stack Backtrace on Core 0:
#01 [<0xffffffffc204f9d2>] in proc_decref
#02 [<0xffffffffc2054912>] in __run_mcp_ksched
#03 [<0xffffffffc2003995>] in poke
#04 [<0xffffffffc2054e92>] in run_scheduler
#05 [<0xffffffffc2054ed9>] in __ksched_tick
#06 [<0xffffffffc20000c4>] in __run_awaiter
#07 [<0xffffffffc205d97f>] in process_routine_kmsg
#08 [<0xffffffffc2056ee3>] in __smp_idle
mcp ksched, decreffing the incref from above
Core 6 about to decref PID 272's ref, was 10
Stack Backtrace on Core 6:
#01 [<0xffffffffc204f9d2>] in proc_decref
#02 [<0xffffffffc20fc5d0>] in alarm_release
#03 [<0xffffffffc203cf58>] in chan_release
#04 [<0xffffffffc203c8ab>] in kref_put
#05 [<0xffffffffc200a5c5>] in tap_min_release
#06 [<0xffffffffc205f53b>] in kref_put
#07 [<0xffffffffc2063ec2>] in close_fdt
#08 [<0xffffffffc20514d1>] in proc_destroy
#09 [<0xffffffffc205a2fc>] in sys_proc_destroy
#10 [<0xffffffffc205c891>] in syscall
#11 [<0xffffffffc205cb88>] in run_local_syscall
#12 [<0xffffffffc2131c0a>] in sysenter_callwrapper
4 alarm
Core 6 about to decref PID 272's ref, was 9
Stack Backtrace on Core 6:
#01 [<0xffffffffc204f9d2>] in proc_decref
#02 [<0xffffffffc20514f4>] in proc_destroy
#03 [<0xffffffffc205a2fc>] in sys_proc_destroy
#04 [<0xffffffffc205c891>] in syscall
#05 [<0xffffffffc205cb88>] in run_local_syscall
#06 [<0xffffffffc2131c0a>] in sysenter_callwrapper
PID 272 killing self, refcnt post destroy 8
end of __sched_proc_destroy; this is the cradle-to-grave ref
Core 6 about to decref PID 272's ref, was 8
Stack Backtrace on Core 6:
#01 [<0xffffffffc204f9d2>] in proc_decref
#02 [<0xffffffffc205a311>] in sys_proc_destroy
#03 [<0xffffffffc205c891>] in syscall
#04 [<0xffffffffc205cb88>] in run_local_syscall
#05 [<0xffffffffc2131c0a>] in sysenter_callwrapper
sys_proc_destroy's ref
PID 272 killing self, refcnt fin destroy 7
PID 272 killing self, refcnt wait destroy 7
PID 272 killing self, refcnt wait destroy 7
PID 272 killing self, refcnt wait destroy 7
PID 272 killing self, refcnt wait destroy 7
PID 272 killing self, refcnt wait destroy 7
Core 6 about to decref PID 272's ref, was 7
Stack Backtrace on Core 6:
#01 [<0xffffffffc204f9d2>] in proc_decref
#02 [<0xffffffffc205d97f>] in process_routine_kmsg
#03 [<0xffffffffc20517b5>] in proc_restartcore
__death / clear_owning_proc (1 ref for core 6)
Core 6 about to decref PID 272's ref, was 6
Stack Backtrace on Core 6:
#01 [<0xffffffffc204f9d2>] in proc_decref
#02 [<0xffffffffc2018e89>] in restart_kthread
#03 [<0xffffffffc2018eef>] in __launch_kthread
#04 [<0xffffffffc205d97f>] in process_routine_kmsg
#05 [<0xffffffffc20517b5>] in proc_restartcore
Core 6 about to decref PID 272's ref, was 5
Stack Backtrace on Core 6:
#01 [<0xffffffffc204f9d2>] in proc_decref
#02 [<0xffffffffc2018e89>] in restart_kthread
#03 [<0xffffffffc2018eef>] in __launch_kthread
#04 [<0xffffffffc205d97f>] in process_routine_kmsg
#05 [<0xffffffffc20517b5>] in proc_restartcore
Core 6 about to decref PID 272's ref, was 4
Stack Backtrace on Core 6:
#01 [<0xffffffffc204f9d2>] in proc_decref
#02 [<0xffffffffc2018e89>] in restart_kthread
#03 [<0xffffffffc2018eef>] in __launch_kthread
#04 [<0xffffffffc205d97f>] in process_routine_kmsg
#05 [<0xffffffffc20517b5>] in proc_restartcore
Core 6 about to decref PID 272's ref, was 3
Stack Backtrace on Core 6:
#01 [<0xffffffffc204f9d2>] in proc_decref
#02 [<0xffffffffc2018e89>] in restart_kthread
#03 [<0xffffffffc2018eef>] in __launch_kthread
#04 [<0xffffffffc205d97f>] in process_routine_kmsg
#05 [<0xffffffffc20517b5>] in proc_restartcore
Core 6 about to decref PID 272's ref, was 2
Stack Backtrace on Core 6:
#01 [<0xffffffffc204f9d2>] in proc_decref
#02 [<0xffffffffc2018e38>] in restart_kthread
#03 [<0xffffffffc2018eef>] in __launch_kthread
#04 [<0xffffffffc205d97f>] in process_routine_kmsg
#05 [<0xffffffffc20517b5>] in proc_restartcore
5 syscalls
Core 6 about to decref PID 272's ref, was 1
Stack Backtrace on Core 6:
#01 [<0xffffffffc204f9d2>] in proc_decref
#02 [<0xffffffffc20501c4>] in __proc_disown_child
#03 [<0xffffffffc2059c3a>] in try_wait.isra.4
#04 [<0xffffffffc2059cb3>] in try_wait_any.isra.6
#05 [<0xffffffffc2059e1b>] in sys_waitpid
#06 [<0xffffffffc205c891>] in syscall
#07 [<0xffffffffc205cb88>] in run_local_syscall
#08 [<0xffffffffc2131c0a>] in sysenter_callwrapper
parent
null eptp, gpc was 0xffff800009939aa0
null eptp, proc was 0xffff800007ec8d00
null eptp, pid was 272
null eptp, PCPU was 0xffff800009939aa0
null eptp, ref was 0
k, so we didn't have enough refs. core 8 was actually current
ah, core 8 was running. this is dropping cur_proc. i didn't count
this one earlier, so we were actually off by one and should have started at 15.
why exactly did this run? this abandon is called after PRKM when we
see owning_proc is clear.
we should have received __death. but we don't see the
clear_owning_proc / PRKM that we see on core 6.
perhaps we already yielded, dropping owning_proc, then came out
and dropped cur_proc
no - since yield also abandons core
let's see the RKMs too?
i didn't see this decref the first time around, right? (above) (right)
see it all the time now (3 in a row)
since the count seems right at idle (11), it seems like the problem is
the other core that comes and goes.
meaning, the base 11: 5 syscalls, 4 alarms, parent, ksched,
never change, and probably are fine
there should be 2-4 refs for the two vcores that are running.
clearly 2 for the active core.
both depend on the state of the core (owning, cur, etc).
could spin for a while before destroying. then print
cur_proc/owning proc for all vcores (trace coretf -1)
might give clues if there's some corruption
k, printed the RKMs for non-core 0:
RKM, core 2, 0xffffffffc204fa60 (__startcore)
RKM, core 10, 0xffffffffc204d710 (__notify) (this is
the kill event)
PID 274 core 10 killing self, refcnt pre destroy 14
Core 10 about to decref PID 274's ref, was 14
Stack Backtrace on Core 10:
#01 [<0xffffffffc204f9d2>] in proc_decref
#02 [<0xffffffffc20fc660>] in alarm_release
#03 [<0xffffffffc203cf58>] in chan_release
#04 [<0xffffffffc203c8ab>] in kref_put
#05 [<0xffffffffc200a5c5>] in tap_min_release
#06 [<0xffffffffc205f5cb>] in kref_put
#07 [<0xffffffffc2063f52>] in close_fdt
#08 [<0xffffffffc20514d1>] in proc_destroy
#09 [<0xffffffffc205a33c>] in sys_proc_destroy
#10 [<0xffffffffc205c8d1>] in syscall
#11 [<0xffffffffc205cbc8>] in run_local_syscall
RKM, core 2, 0xffffffffc2018ea0
concurrent message, partway through the
decref backtrace! (this is __launch_kthread)
#12 [<0xffffffffc2131c9a>] in sysenter_callwrapper
Core 2 about to decref PID 274's ref, was 13
Stack Backtrace on Core 2:
#01 [<0xffffffffc204f9d2>] in proc_decref
#02 [<0xffffffffc212ea97>] in __abandon_core
#03 [<0xffffffffc2051812>] in proc_restartcore
so core 2 was likely running in userspace
(__startcore).
did it yield from lack of work yet?
probably, since we don't see a __death message
for it (not printed here)
if it yielded, it should have cleared
owning_proc and current
though it probably entered via userspace
(sys_proc_yield()), and hadn't completely idled yet.
reason: proc_restartcore was in the BT
launch_kthread: if it restored a thread for the proc,
it would load cur_proc, but not owning_proc
which would then get cleared in abandon_core.
given the timing, it looks like the kthread that woke
up might be related to the alarm closing.
not sure why that would run on core 2. qlocks
and CVs and shit in there. #alarm should run on core 0 though. (pcpui 0 tchain)
let's print a little more.
Core 12 about to decref PID 274's ref, was 14
Stack Backtrace on Core 12:
#01 [<0xffffffffc204fa62>] in proc_decref
#02 [<0xffffffffc20fc6f0>] in alarm_release
#03 [<0xffffffffc203cfe8>] in chan_release
#04 [<0xffffffffc203c93b>] in kref_put
#05 [<0xffffffffc200a5c5>] in tap_min_release
#06 [<0xffffffffc205f65b>] in kref_put
#07 [<0xffffffffc2063fe2>] in close_fdt
#08 [<0xffffffffc2051561>] in proc_destroy
RKM, core 13, 0xffffffffc2018ea0 (__launch_kthread)
#09 [<0xffffffffc205a3cc>] in sys_proc_destroy
core 13, launch kth, msg from 13, kth PID 274, cur_proc 0, sysc
0x0000400001870ec0, flags 0x2, name block for 100000 usec, rip
0xffffffffc20193ca
#10 [<0xffffffffc205c961>] in syscall
#11 [<0xffffffffc205cc58>] in run_local_syscall
#12 [<0xffffffffc2131d2a>] in sysenter_callwrapper
Core 13 about to decref PID 274's ref, was 13
Stack Backtrace on Core 13:
#01 [<0xffffffffc204fa62>] in proc_decref
#02 [<0xffffffffc212eb27>] in __abandon_core
#03 [<0xffffffffc20518a2>] in proc_restartcore
k, it was not #alarm, but a block syscall. this was actually the timer tick,
which is implemented with sys_block.
that sets a timer on the core it was on (not the best practice, btw),
but explains why core 13 would get it.
that rip is in sem_down btw, as expected for a sleeping kth.
the flag is KTH_SAVE_ADDR_SPACE
cur_proc == 0, which means PID 274 wasn't running there yet (probably
just yielded, but that's unconfirmed).
the kth clearly wants to be in 274's addr space.
we don't see an incref when that kth starts, since the kth's counted ref gets
transferred to cur_proc
what this means is that this ref being dropped was actually from a syscall -
one of our 5. but we had 5 other decrefs that looked like syscalls aborting...
did we have more than 5 syscalls? that'd throw off the count
our extra prints have some more info farther down in the trace
core 12, launch kth, msg from 12, kth PID 274, cur_proc 274, sysc
0x0000400000816e80, flags 0x2, name block for 30000000 usec, rip
0xffffffffc20193ca
Core 12 about to decref PID 274's ref, was 6
Stack Backtrace on Core 12:
#01 [<0xffffffffc204fa62>] in proc_decref
#02 [<0xffffffffc2018e89>] in restart_kthread
#03 [<0xffffffffc2018f69>] in __launch_kthread
#04 [<0xffffffffc205d9eb>] in process_routine_kmsg
#05 [<0xffffffffc2051845>] in proc_restartcore
btw, the reason this looks different than the other __launch_kth case
is because of where the decref happens. if cur_proc is already kth->proc, we
decref in restart_kthread
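that split can be sketched directly. in this simplified model (not the real restart_kthread), the kthread carries one counted ref on its proc: if cur_proc differs, the ref is silently transferred into cur_proc (no incref or decref appears in the trace); if cur_proc is already kth->proc, the kthread's ref is redundant and gets dropped, which is the decref the traces show:

```c
#include <assert.h>
#include <stddef.h>

struct proc {
	int refcnt;
};

struct kthread {
	struct proc *proc;
};

static void proc_decref(struct proc *p) { p->refcnt--; }

/* Sketch of the cur_proc handoff in restart_kthread. */
static void restart_kthread(struct kthread *kth, struct proc **cur_proc)
{
	if (*cur_proc != kth->proc) {
		/* Transfer the kthread's counted ref to cur_proc: no
		 * incref and no decref, so nothing shows in the trace.
		 * (Dropping any old *cur_proc ref is elided here.) */
		*cur_proc = kth->proc;
	} else {
		/* cur_proc already holds a ref; the kthread's is extra. */
		proc_decref(kth->proc);
	}
}
```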
RKM, core 12, 0xffffffffc2018ea0
core 12, launch kth, msg from 12, kth PID 274, cur_proc 274, sysc
0x0000400001872e20, flags 0x2, name read on fd 0, rip 0xffffffffc20193ca
Core 12 about to decref PID 274's ref, was 5
Stack Backtrace on Core 12:
#01 [<0xffffffffc204fa62>] in proc_decref
#02 [<0xffffffffc2018e89>] in restart_kthread
#03 [<0xffffffffc2018f69>] in __launch_kthread
#04 [<0xffffffffc205d9eb>] in process_routine_kmsg
#05 [<0xffffffffc2051845>] in proc_restartcore
RKM, core 12, 0xffffffffc2018ea0
core 12, launch kth, msg from 12, kth PID 274, cur_proc 274, sysc
0x0000400001874db0, flags 0x2, name read on fd 18, rip 0xffffffffc20193ca
Core 12 about to decref PID 274's ref, was 4
Stack Backtrace on Core 12:
#01 [<0xffffffffc204fa62>] in proc_decref
#02 [<0xffffffffc2018e89>] in restart_kthread
#03 [<0xffffffffc2018f69>] in __launch_kthread
#04 [<0xffffffffc205d9eb>] in process_routine_kmsg
#05 [<0xffffffffc2051845>] in proc_restartcore
RKM, core 12, 0xffffffffc2018ea0
core 12, launch kth, msg from 12, kth PID 274, cur_proc 274, sysc
0x0000400001878dc0, flags 0x2, name read on fd 20, rip 0xffffffffc20193ca
Core 12 about to decref PID 274's ref, was 3
Stack Backtrace on Core 12:
#01 [<0xffffffffc204fa62>] in proc_decref
#02 [<0xffffffffc2018e89>] in restart_kthread
#03 [<0xffffffffc2018f69>] in __launch_kthread
#04 [<0xffffffffc205d9eb>] in process_routine_kmsg
#05 [<0xffffffffc2051845>] in proc_restartcore
RKM, core 12, 0xffffffffc2018ea0
core 12, launch kth, msg from 12, kth PID 273, cur_proc 274, sysc
0x00007f7fff9fe8b0, flags 0x2, name waitpid on -1, rip 0xffffffffc20193ca
Core 12 about to decref PID 274's ref, was 2
Stack Backtrace on Core 12:
#01 [<0xffffffffc204fa62>] in proc_decref
#02 [<0xffffffffc2018e38>] in restart_kthread
#03 [<0xffffffffc2018f69>] in __launch_kthread
#04 [<0xffffffffc205d9eb>] in process_routine_kmsg
#05 [<0xffffffffc2051845>] in proc_restartcore
here's the 5 decrefs for unblocked syscalls:
core 12, launch kth, msg from 12, kth PID 274, cur_proc 274, sysc
0x0000400000816e80, flags 0x2, name block for 30000000 usec, rip
0xffffffffc20193ca
core 12, launch kth, msg from 12, kth PID 274, cur_proc 274, sysc
0x0000400001872e20, flags 0x2, name read on fd 0, rip 0xffffffffc20193ca
core 12, launch kth, msg from 12, kth PID 274, cur_proc 274, sysc
0x0000400001874db0, flags 0x2, name read on fd 18, rip 0xffffffffc20193ca
core 12, launch kth, msg from 12, kth PID 274, cur_proc 274, sysc
0x0000400001878dc0, flags 0x2, name read on fd 20, rip 0xffffffffc20193ca
core 12, launch kth, msg from 12, kth PID 273, cur_proc 274, sysc
0x00007f7fff9fe8b0, flags 0x2, name waitpid on -1, rip 0xffffffffc20193ca
ok, that last one isn't our PID's syscall... that's our parent, waiting on us.
it decreffed us in restart_kthread. that's a different part of
restart_kthread than the others (check the PCs)
and it's fine - that's us dropping PID 274's ref, since right after
this, PID 273 is cur_proc
this is all due to signalling our parent.
so we're actually OK. we had 5 syscalls, but not where we expected.
however, cur_proc should be 273 at this point. later in the trace, we dec'd
another ref for cur_proc.
so next what happens:
Core 12 about to decref PID 274's ref, was 1
Stack Backtrace on Core 12:
#01 [<0xffffffffc204fa62>] in proc_decref
#02 [<0xffffffffc2050254>] in __proc_disown_child
#03 [<0xffffffffc2059d0a>] in try_wait.isra.4
#04 [<0xffffffffc2059d83>] in try_wait_any.isra.6
#05 [<0xffffffffc2059eeb>] in sys_waitpid
#06 [<0xffffffffc205c961>] in syscall
#07 [<0xffffffffc205cc58>] in run_local_syscall
#08 [<0xffffffffc2131d2a>] in sysenter_callwrapper
that's our parent, dropping our ref. cur_proc should be 273, not 274.
but then:
null eptp, gpc was 0xffff8000098e39a0
null eptp, proc was 0xffff800007eca000
null eptp, pid was 274
null eptp, PCPU was 0xffff8000098e39a0
null eptp, ref was 0
(debugging prints)
Entering Nanwan's Dungeon on Core 12 (Ints on):
Type 'help' for a list of commands.
HW TRAP frame at 0xfffffff000107e60 on core 12
rax 0xfffffff000107f28
rbx 0xffff8000098e39a0
rcx 0x0000000000000001
rdx 0x000000000000ced8
rbp 0xfffffff000107f58
rsi 0x0000000000000780
rdi 0x000000000000000a
r8 0xffffffffc7cc8d00
r9 0x00000000000137e0
r10 0xffffffffc7cc8c60
r11 0xffffffffc7cc8c20
r12 0xffffffffc7b87bc8
r13 0xfffffff000107f28
r14 0x00000000ffffffff
r15 0x0000000000000000
trap 0x00000006 Invalid Opcode
gsbs 0xffffffffc7ccfbc0
fsbs 0x0000000000000000
err 0x--------00000000
rip 0xffffffffc213642b
cs 0x------------0008
flag 0x0000000000010003
rsp 0xfffffff000107f28
ss 0x------------0010
ROS(Core 12)> bt 0xffffffffc213642b 0xfffffff000107f58
Backtrace from instruction 0xffffffffc213642b, with frame pointer
0xfffffff000107f58
Backtrace of kernel context on Core 12:
#01 [<0xffffffffc213642b>] in vmx_clear_vmcs
#02 [<0xffffffffc212eb10>] in __abandon_core
#03 [<0xffffffffc20518a2>] in proc_restartcore
so proc_restartcore abandons, but for some reason it still thinks 274 is
cur_proc, even though PRKM changed it.
ah, warning sirens are going off.
	struct per_cpu_info *pcpui = &per_cpu_info[core_id()];

	assert(!pcpui->cur_kthread->sysc);
	process_routine_kmsg();
	/* If there is no owning process, just idle, since we don't know what to do.
	 * This could be because the process had been restarted a long time ago and
	 * has since left the core, or due to a KMSG like __preempt or __death. */
	if (!pcpui->owning_proc) {
		abandon_core();
		smp_idle();
	}
with aggressive inlining and whatnot, could this be reading cur_proc before
PRKM, then not noticing the change?
unlikely: those funcs are in another C file. confirmed by checking the asm.
also, wait a sec - when 273's syscall kth restarts and finishes, why does it go
back to proc_restartcore?
ah, it's the proc_restartcore from 273. it's walking back its
sysenter/syscall TF.
or is cur_proc still legit 274? (note that owning_proc should be 0, since we
saw __death earlier, which cleared owning_proc).
if so, who changed it back?
uh, a printk in abandon_core says it thinks cur_proc is 273...
ah, here's the issue
__abandon_core() is given 273. flushes PTs, decrefs ok, etc. it's
decreffing 273, not 274.
so our refcounting is actually fine.
the problem is when we abandon_core, we try to remove all the GPCs.
the GPCs have an uncounted ref to proc
that's why it looks like they correctly have a ref of 0 in the print
we don't get to the 'extra' decref of 274, which i figured would
happen, since we die in __invept
had i commented that out, we would have kept going, then decreffed 273
and not panicked.
overall, abandon_core() isn't "definitely remove the process", it's more like
"remove the process if it's still there". (it's listed as
stop-whatever-was-there)
there's an assumption in __abandon_core(), when it calls the VMCS stuff,
that it is getting called on the process whose VMCS / GPC is there.
the process was already removed before, and happened to already be
fully decreffed, leaving its GPC behind
but we could be in another process. (the parent, in this case)
because we never removed the VMM process; we just replaced the core's
cur_proc in restart_kthread
it might also be possible that there is no process on the core, perhaps
if we ran a ktask kthread?
not currently
in short - the process could have left the core (the core is no longer in
that process's context; it is not cur_proc; and that ref is gone) before
abandon_core
a couple choices:
refcnt gpc->proc's references. they should all get cleaned up when we
kill and ultimately abandon_core
are there situations where a GPC / process will not get reaped,
since we never abandon core?
i think we flush the EPT when we change owning_core or
something (check this)
wait a sec - abandon_core is removing the VMCS. the GPC should
have been unloaded already.
the GPC was unloaded when we finalized the context
when does that happen for dying processes? (in
__death)
remove the GPCs when we remove the addr space / cur_proc
and don't forget to remove the VMCS fully
overkill: a sysc briefly pops into another user's addr space
(think of switch_to / send_event), then we'd unload the GPC / finalize the
context
need to think a little more
the reason the VMCS cleanup was in abandon_core was that i didn't want
to leave any turds around when a process left a core.
that's probably the wrong place for it. it should be in __death
or __preempt: any time we change owning_proc.