destroying non-empty racct

2017-06-09 Thread Larry Rosenman
I know we had this a while back, and it was fixed, but it's back.


Dump header from device: /dev/mfid0p3
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 105984
  Blocksize: 512
  Dumptime: Fri Jun  9 13:25:25 2017
  Hostname: borg.lerctr.org
  Magic: FreeBSD Text Dump
  Version String: FreeBSD 12.0-CURRENT #29 r319458: Thu Jun  1 16:15:44 CDT 2017
r...@borg.lerctr.org:/usr/obj/usr/src/sys/VT-LER
  Panic String: destroying non-empty racct: 4124672 allocated for resource 4

  Dump Parity: 1629450792
  Bounds: 3
  Dump Status: good

I do NOT have a core due to insufficient swap (I do have a text dump).

Ideas?


-- 
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 E-Mail: l...@lerctr.org
US Mail: 17716 Limpia Crk, Round Rock, TX 78664-7281
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: panic: destroying non-empty racct: 2113536 allocated for resource 4

2016-06-06 Thread Andriy Gapon

I've just got the same panic again.
This time I didn't do anything unusual, just ran a poudriere build, and the
system panicked at the end of it:

Unread portion of the kernel message buffer:
panic: destroying non-empty racct: 2113536 allocated for resource 4

KDB: stack backtrace:
db_trace_self_wrapper() at 0x804131eb = db_trace_self_wrapper+0x2b/frame 0xfe051992a7f0
kdb_backtrace() at 0x806636d9 = kdb_backtrace+0x39/frame 0xfe051992a8a0
vpanic() at 0x8062dd9c = vpanic+0x14c/frame 0xfe051992a8e0
panic() at 0x8062dae3 = panic+0x43/frame 0xfe051992a940
racct_destroy_locked() at 0x8061eebc = racct_destroy_locked+0xac/frame 0xfe051992a960
racct_destroy() at 0x8061ede5 = racct_destroy+0x35/frame 0xfe051992a980
prison_racct_free_locked() at 0x805fdcdc = prison_racct_free_locked+0x4c/frame 0xfe051992a9a0
prison_racct_free() at 0x805fdc2d = prison_racct_free+0x6d/frame 0xfe051992a9c0
prison_racct_detach() at 0x805fdd8e = prison_racct_detach+0x3e/frame 0xfe051992a9e0
prison_deref() at 0x805fb26b = prison_deref+0x23b/frame 0xfe051992aa10
prison_remove_one() at 0x805fc9c5 = prison_remove_one+0x125/frame 0xfe051992aa40
sys_jail_remove() at 0x805fc884 = sys_jail_remove+0x204/frame 0xfe051992aa90
syscallenter() at 0x80820cdd = syscallenter+0x31d/frame 0xfe051992ab00
amd64_syscall() at 0x808208af = amd64_syscall+0x1f/frame 0xfe051992abf0
Xfast_syscall() at 0x80808d5b = Xfast_syscall+0xfb/frame 0xfe051992abf0

It's interesting that the resource and the value are exactly the same as in the
panic I reported on 17/05/2016 (quoted below).
I have a crash dump this time as well.


On 17/05/2016 09:22, Andriy Gapon wrote:
> 
> To be fair, I got this panic after a rather exotic sequence of events: running
> poudriere, sending SIGSTOP to one of the build processes, forgetting about it,
> seeing poudriere time out that job, sending SIGCONT...
> 
> This is amd64 head r297350.
> 
> Some details:
> (kgdb) bt
> #0  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:295
> #1  0x8062d7ef in kern_reboot (howto=) at /usr/src/sys/kern/kern_shutdown.c:363
> #2  0x8062de38 in vpanic (fmt=, ap=0xfe0519b73920) at /usr/src/sys/kern/kern_shutdown.c:639
> #3  0x8062db43 in panic (fmt=) at /usr/src/sys/kern/kern_shutdown.c:572
> #4  0x8061ef1c in racct_destroy_locked (racctp=) at /usr/src/sys/kern/kern_racct.c:478
> #5  0x8061ee45 in racct_destroy (racct=0xf802f6301518) at /usr/src/sys/kern/kern_racct.c:495
> #6  0x805fdd3c in prison_racct_free_locked (prr=0xf802f6301400) at /usr/src/sys/kern/kern_jail.c:4564
> #7  0x805fdc8d in prison_racct_free (prr=0xf802f6301400) at /usr/src/sys/kern/kern_jail.c:4583
> #8  0x805fddee in prison_racct_detach (pr=0xf802b073) at /usr/src/sys/kern/kern_jail.c:4658
> #9  0x805fb2cb in prison_deref (pr=, flags=3) at /usr/src/sys/kern/kern_jail.c:2663
> #10 0x805fca25 in prison_remove_one (pr=) at /usr/src/sys/kern/kern_jail.c:2358
> #11 0x805fc8e4 in sys_jail_remove (td=, uap=) at /usr/src/sys/kern/kern_jail.c:2313
> #12 0x80820ddd in syscallenter (td=0xf801146019e0, sa=0xfe0519b73b80) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:135
> #13 0x808209af in amd64_syscall (td=0xf801146019e0, traced=0) at /usr/src/sys/amd64/amd64/trap.c:943
> 
> RACCT_RSS is 4.
> 
> (kgdb) p *prr
> $5 = {
>   prr_next = {
>     le_next = 0xf80382fe4400,
>     le_prev = 0xf8017ac90600
>   },
>   prr_name = "basejail-default-job-03", '\000' ,
>   prr_refcount = 0,
>   prr_racct = 0xf802e3f520b0
> }
> (kgdb) p *prr->prr_racct
> $6 = {
>   r_resources = {13884177072, 0, 0, 0, 2113536, 0 ,
>     13611325009, 0},
>   r_rule_links = {
>     lh_first = 0x0
>   }
> }
> 
> Could it be that somehow the CONT'd process failed to deduct its resources from
> the jail's resources because the jail was already marked for destruction or
> something like that?
> 


-- 
Andriy Gapon


panic: destroying non-empty racct: 2113536 allocated for resource 4

2016-05-17 Thread Andriy Gapon

To be fair, I got this panic after a rather exotic sequence of events: running
poudriere, sending SIGSTOP to one of the build processes, forgetting about it,
seeing poudriere time out that job, sending SIGCONT...

This is amd64 head r297350.

Some details:
(kgdb) bt
#0  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:295
#1  0x8062d7ef in kern_reboot (howto=) at /usr/src/sys/kern/kern_shutdown.c:363
#2  0x8062de38 in vpanic (fmt=, ap=0xfe0519b73920) at /usr/src/sys/kern/kern_shutdown.c:639
#3  0x8062db43 in panic (fmt=) at /usr/src/sys/kern/kern_shutdown.c:572
#4  0x8061ef1c in racct_destroy_locked (racctp=) at /usr/src/sys/kern/kern_racct.c:478
#5  0x8061ee45 in racct_destroy (racct=0xf802f6301518) at /usr/src/sys/kern/kern_racct.c:495
#6  0x805fdd3c in prison_racct_free_locked (prr=0xf802f6301400) at /usr/src/sys/kern/kern_jail.c:4564
#7  0x805fdc8d in prison_racct_free (prr=0xf802f6301400) at /usr/src/sys/kern/kern_jail.c:4583
#8  0x805fddee in prison_racct_detach (pr=0xf802b073) at /usr/src/sys/kern/kern_jail.c:4658
#9  0x805fb2cb in prison_deref (pr=, flags=3) at /usr/src/sys/kern/kern_jail.c:2663
#10 0x805fca25 in prison_remove_one (pr=) at /usr/src/sys/kern/kern_jail.c:2358
#11 0x805fc8e4 in sys_jail_remove (td=, uap=) at /usr/src/sys/kern/kern_jail.c:2313
#12 0x80820ddd in syscallenter (td=0xf801146019e0, sa=0xfe0519b73b80) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:135
#13 0x808209af in amd64_syscall (td=0xf801146019e0, traced=0) at /usr/src/sys/amd64/amd64/trap.c:943

RACCT_RSS is 4.

RACCT_RSS is 4.

(kgdb) p *prr
$5 = {
  prr_next = {
    le_next = 0xf80382fe4400,
    le_prev = 0xf8017ac90600
  },
  prr_name = "basejail-default-job-03", '\000' ,
  prr_refcount = 0,
  prr_racct = 0xf802e3f520b0
}
(kgdb) p *prr->prr_racct
$6 = {
  r_resources = {13884177072, 0, 0, 0, 2113536, 0 ,
    13611325009, 0},
  r_rule_links = {
    lh_first = 0x0
  }
}

Could it be that somehow the CONT'd process failed to deduct its resources from
the jail's resources because the jail was already marked for destruction or
something like that?

-- 
Andriy Gapon