On 19.07.2017 at 14:03, Steven Williamson wrote:
> If the crash appears different every time, check for hardware issues?
We are pretty sure that the gm binary from GraphicsMagick causes this
issue; there is nothing else strange on this host.
>
> Any memory errors or such logged in the OOB management logs? Or see if
> anything is reported in "fmdump -ev"?
Nope, nothing in the OOB logs :| Already checked.
fmdump does show some entries:
[14:39:13][root@imgconvert-vmhost:/var/crash/volatile]$fmdump -ev
TIME                 CLASS                 ENA
Jul 13 09:40:13.4895 ereport.fm.fmd.module 0x80c1cdc3ff906801
Jul 13 10:58:13.8285 ereport.fm.fmd.module 0xc3e3d9ad20f06801
Jul 13 11:51:52.3469 ereport.fm.fmd.module 0xf2b9c8f7efc06801
Jul 13 12:40:21.9546 ereport.fm.fmd.module 0x1d10efdc5f106801
Jul 13 12:59:29.0118 ereport.fm.fmd.module 0x2ebc3a50d9a06801
Jul 14 17:52:27.2049 ereport.fm.fmd.module 0x16d81be693106801
Jul 14 22:49:44.1068 ereport.fm.fmd.module 0x19e9d25a42506801
Jul 15 14:22:07.9198 ereport.fm.fmd.module 0x487f6cbddb106801
Jul 15 23:01:59.0521 ereport.fm.fmd.module 0x0f666b8d2df06801
Jul 16 13:33:48.8724 ereport.fm.fmd.module 0x07925b1aafb06801
Jul 17 05:40:04.5401 ereport.fm.fmd.module 0x5342638f87106801
Jul 17 13:48:02.1728 ereport.fm.fmd.module 0xfd470998f1e06801
Jul 18 07:29:13.8224 ereport.fm.fmd.module 0x9ce55d18be106801
Jul 18 14:58:38.7127 ereport.fm.fmd.module 0x254ab8d50cf06801
Jul 19 09:11:04.6919 ereport.fm.fmd.module 0xdd8b5e6116506801
Jul 19 12:51:41.4838 ereport.fm.fmd.module 0x35e20f9af5d06801
[14:39:21][root@imgconvert-vmhost:/var/crash/volatile]$
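To check whether these ereports line up with the panic times, a rough Python sketch like the following could help. It only parses the text of the `fmdump -ev` listing above; the year (2017) and the English month names are assumptions, since fmdump does not print a year here, and the function names are made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# One "fmdump -ev" record: timestamp, class, ENA.
# \s+ also matches a line break, so wrapped records are handled too.
RECORD = re.compile(
    r"([A-Z][a-z]{2} \d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(ereport\.\S+)\s+(0x[0-9a-fA-F]+)"
)

def parse_fmdump_ev(text, year=2017):
    """Return (datetime, class, ena) tuples; the year is an assumption."""
    events = []
    for stamp, cls, ena in RECORD.findall(text):
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S.%f")
        events.append((when, cls, ena))
    return events

def summarize(events):
    """Count events per day and compute gaps (in hours) between them."""
    per_day = Counter(when.date().isoformat() for when, _, _ in events)
    gaps = [
        (later[0] - earlier[0]).total_seconds() / 3600
        for earlier, later in zip(events, events[1:])
    ]
    return per_day, gaps

# First three records from the listing above, in their wrapped form.
SAMPLE = """\
Jul 13 09:40:13.4895 ereport.fm.fmd.module
0x80c1cdc3ff906801
Jul 13 10:58:13.8285 ereport.fm.fmd.module
0xc3e3d9ad20f06801
Jul 13 11:51:52.3469 ereport.fm.fmd.module
0xf2b9c8f7efc06801
"""

events = parse_fmdump_ev(SAMPLE)
per_day, gaps = summarize(events)
for day, count in sorted(per_day.items()):
    print(day, count)
print([round(g, 2) for g in gaps])
```

Feeding it the complete listing and comparing the timestamps with the times of the vmcore files in /var/crash/volatile would show whether each ereport coincides with a panic. That they do is only a guess at this point.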
Jul 13 09:40:13.5637 6deea02a-d655-48d3-c39c-c38a6d1c5158 FMD-8000-2K Diagnosed
  100%  defect.sunos.fmd.module
        Problem in: -
           Affects: fmd:///module/software-diagnosis
               FRU: -
          Location: -
--------------- ------------------------------------ -------------- ---------
TIME            EVENT-ID                             MSG-ID         SEVERITY
--------------- ------------------------------------ -------------- ---------
Jul 13 09:40:13 6deea02a-d655-48d3-c39c-c38a6d1c5158 FMD-8000-2K    Minor
Platform    : X10DRi    Chassis_id  : 123456789
Product_sn  :
Fault class : defect.sunos.fmd.module
Affects     : fmd:///module/software-diagnosis
                  faulted and taken out of service
FRU         : None
                  faulty
Description : An illumos Fault Manager component has experienced an error that
              required the module to be disabled. Refer to
              http://illumos.org/msg/FMD-8000-2K for more information.
Response    : The module has been disabled. Events destined for the module
              will be saved for manual diagnosis.
Impact      : Automated diagnosis and response for subsequent events associated
              with this module will not occur.
Action      : Use fmdump -v -u <EVENT-ID> to locate the module. Use fmadm
              reset <module> to reset the module.
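For reference, the two commands from the "Action" text can be filled in from the values in the report above. The small Python sketch below only builds and prints the command lines and does not run anything. It assumes the module name is the last component of the "Affects" FMRI (software-diagnosis); the helper names are hypothetical.

```python
import shlex

def module_from_fmri(fmri):
    """Extract the module name from an fmd module FMRI (assumed layout)."""
    prefix = "fmd:///module/"
    if not fmri.startswith(prefix):
        raise ValueError(f"not an fmd module FMRI: {fmri!r}")
    return fmri[len(prefix):]

def fm_commands(event_id, fmri):
    """Build the fmdump/fmadm invocations named in the Action text."""
    return [
        ["fmdump", "-v", "-u", event_id],
        ["fmadm", "reset", module_from_fmri(fmri)],
    ]

# Values copied from the fault report above.
EVENT_ID = "6deea02a-d655-48d3-c39c-c38a6d1c5158"
AFFECTS = "fmd:///module/software-diagnosis"

commands = fm_commands(EVENT_ID, AFFECTS)
for argv in commands:
    print(shlex.join(argv))
```

Resetting the module would only re-enable the diagnosis engine; whether it is related to the panics at all is unclear.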
>
> Worth checking before digging into the dumps.
>
> On Wed, 19 Jul 2017 at 13:00 Jerry Jelinek <[email protected]> wrote:
>
> I see that you are running a recent platform build
> (joyent_20170710T035256Z) so this does not appear to be a known bug.
> I filed OS-6238 to track this issue. We will want to get a copy of
> your dump so that we can fully debug this. Please contact me
> directly so we can arrange that.
>
> Thanks for reporting this and sorry for the problem,
> Jerry
>
>
> On Wed, Jul 19, 2017 at 2:46 AM, InterNetX - Juergen Gotteswinter
> <[email protected]> wrote:
>
> Hello List,
>
> we are facing an issue with GraphicsMagick convert jobs inside a
> CentOS 7 LX branded zone.
>
> Inside the zone, tons of pictures get converted via GraphicsMagick in
> batch jobs (process count usually ~80). Every few hours the whole
> system panics; as far as my mdb skills tell me, it's not always for
> the same reason.
>
> Maybe someone can take a look at this; currently we are at a
> dead end. Full core dump files can be supplied if needed.
>
> Thanks!
>
> Juergen
>
> > ::status
> debugging crash dump vmcore.14 (64-bit) from
> operating system: 5.11 joyent_20170710T035256Z (i86pc)
> image uuid: (not set)
> panic message: mutex_enter: bad mutex, lp=ffffd0198b416018
> owner=ffffd01989e9a8c0 thread=ffffd01985f477c0
> dump content: kernel pages only
> >
>
>
> > ::stack
> vpanic()
> mutex_panic+0x58(fffffffffb952692, ffffd0198b416018)
> mutex_vector_enter+0x347(ffffd0198b416018)
> priv_proc_cred_perm+0x48(ffffd01971187008, ffffd0198b416000, 0, 40)
> lxpr_doaccess+0xe0(ffffd0196fa06378, 0, 40, 0, ffffd01971187008, 0)
> lxpr_access+0x31(ffffd0196cf5d200, 40, 0, ffffd01971187008, 0)
> lxpr_lookup+0x59(ffffd0196cf5d200, ffffd00080f92980,
> ffffd00080f92978,
> ffffd00080f92bd0, 0, ffffd019695dcc80)
> fop_lookup+0xa3(ffffd0196cf5d200, ffffd00080f92980,
> ffffd00080f92978,
> ffffd00080f92bd0, 0, ffffd019695dcc80)
> lookuppnvp+0x230(ffffd00080f92bd0, 0, 0, 0, ffffd00080f92de0,
> ffffd019695dcc80)
> lookuppnatcred+0x176(ffffd00080f92bd0, 0, 0, 0, ffffd00080f92de0, 0)
> lookupnameatcred+0xdd(7fffffeff730, 0, 0, 0, ffffd00080f92de0, 0)
> lookupnameat+0x39(7fffffeff730, 0, 0, 0, ffffd00080f92de0, 0)
> readlinkat+0x9e(ffd19553, 7fffffeff730, 7fffff012440, 7f)
> lx_readlink+0x2c(7fffffeff730, 7fffff012440, 7f)
> lx_syscall_enter+0x16f()
> sys_syscall+0x142()
> >
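One detail that can be read off the panic message and the stack above: the bad lock address sits a fixed distance into the structure passed as the second argument to priv_proc_cred_perm() (the same value shows up in r14 in ::panicinfo further down). The Python sketch below just does that arithmetic. Which structure and field this is remains my assumption (presumably the target process and its credential lock); I have not checked it against the source.

```python
import re

# Panic message as printed by ::status above.
PANIC = ("mutex_enter: bad mutex, lp=ffffd0198b416018 "
         "owner=ffffd01989e9a8c0 thread=ffffd01985f477c0")

# Second argument of priv_proc_cred_perm() in the ::stack output.
STRUCT_ARG = 0xffffd0198b416000

def parse_panic(msg):
    """Return the name=hexvalue pairs of the panic message as integers."""
    fields = dict(re.findall(r"(\w+)=([0-9a-f]+)", msg))
    return {name: int(value, 16) for name, value in fields.items()}

info = parse_panic(PANIC)
offset = info["lp"] - STRUCT_ARG

print(hex(offset))                      # 0x18
print(info["owner"] == info["thread"])  # False
```

So the lock is 0x18 bytes into that structure, and the recorded owner is not the panicking thread. If the structure had already been freed or reused by the time of the lookup, the owner value may well be garbage; that is speculation on my part.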
>
> ::msgbuf
>
> ....
> ....
>
> bpf0 is /pseudo/bpf@0
> pseudo-device: pm0
> pm0 is /pseudo/pm@0
> pseudo-device: nsmb0
> nsmb0 is /pseudo/nsmb@0
> pseudo-device: lx_systrace0
> lx_systrace0 is /pseudo/lx_systrace@0
> NOTICE: vnic1011 unregistered
>
> panic[cpu23]/thread=ffffd01985f477c0:
> mutex_enter: bad mutex, lp=ffffd0198b416018 owner=ffffd01989e9a8c0
> thread=ffffd01985f477c0
>
>
> ffffd00080f925a0 unix:mutex_panic+58 ()
> ffffd00080f92610 unix:mutex_vector_enter+347 ()
> ffffd00080f92680 genunix:priv_proc_cred_perm+48 ()
> ffffd00080f92710 lx_proc:lxpr_doaccess+e0 ()
> ffffd00080f92750 lx_proc:lxpr_access+31 ()
> ffffd00080f927d0 lx_proc:lxpr_lookup+59 ()
> ffffd00080f92880 genunix:fop_lookup+a3 ()
> ffffd00080f92af0 genunix:lookuppnvp+230 ()
> ffffd00080f92b90 genunix:lookuppnatcred+176 ()
> ffffd00080f92ca0 genunix:lookupnameatcred+dd ()
> ffffd00080f92cf0 genunix:lookupnameat+39 ()
> ffffd00080f92e40 genunix:readlinkat+9e ()
> ffffd00080f92e70 lx_brand:lx_readlink+2c ()
> ffffd00080f92ef0 lx_brand:lx_syscall_enter+16f ()
> ffffd00080f92f10 unix:brand_sys_syscall+1bd ()
>
>
> > ::panicinfo
> cpu 23
> thread ffffd01985f477c0
> message mutex_enter: bad mutex, lp=ffffd0198b416018
> owner=ffffd01989e9a8c0 thread=ffffd01985f477c0
> rdi fffffffffb95265f
> rsi ffffd00080f92520
> rdx ffffd0198b416018
> rcx ffffd01989e9a8c0
> r8 ffffd01985f477c0
> r9 ffffd00080f925a0
> rax ffffd00080f92540
> rbx ffffd0198b416018
> rbp ffffd00080f92580
> r10 ffffd01986eb87a0
> r11 ffffd01985f477c0
> r12 0
> r13 0
> r14 ffffd0198b416000
> r15 40
> fsbase 0
> gsbase ffffd01947944580
> ds 38
> es 0
> fs 0
> gs 0
> trapno 0
> err 0
> rip fffffffffb863660
> cs 30
> rflags 282
> rsp ffffd00080f92518
> ss 38
> gdt_hi 0
> gdt_lo b00001ef
> idt_hi 0
> idt_lo a0000fff
> ldt 0
> task 70
> cr0 8005003b
> cr2 7fff84009000
> cr3 6203d1000
> cr4 3426f8
> >
>
>
>
>
> > ::memstat
>
>
> Page Summary Pages MB %Tot
> ------------ ---------------- ---------------- ----
> Kernel 1533188 5989 9%
> Boot pages 136438 532 1%
> ZFS File Data 2425754 9475 14%
> Anon 156478 611 1%
> Exec and libs 9699 37 0%
> Page cache 227327 887 1%
> Free (cachelist) 322248 1258 2%
> Free (freelist) 11935332 46622 71%
>
> Total 16746464 65415
> Physical 16746463 65415
> >
>
>
> > ::findleaks
> CACHE LEAKED BUFFER CALLER
> ffffd01932e79008 19329 ffffd01933d211f8 ?
> ffffd01932e81008 27420 ffffd019476fff28 ?
> ffffd01932e85008 1 ffffd019767e0350 ?
> ffffd01932e89008 36 ffffd019443ddf30 ?
> ffffd01932e8d008 8424 ffffd019548a73c0 ?
> ffffd01932e91008 196 ffffd01954bec418 ?
> ffffd01932e95008 14 ffffd01962691258 ?
> ffffd01932e99008 31 ffffd0196089cbc8 ?
> ffffd01932e9d008 17 ffffd019550e0e80 ?
> ffffd01932eb6008 1 ffffd019572ce780 ?
> ffffd01932f52008 14 ffffd01933923d60 ?
> ffffd01932f5a008 14 ffffd01933911080 ?
> ffffd0195600f008 2 ffffd0196f9b7ef0 ?
>
> ------------------------------------------------------------------------
> Total 55499 buffers, 2128680 bytes
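As a quick sanity check on the ::findleaks table, the Python sketch below sums the LEAKED column and ranks the caches by leaked buffer count. The rows are copied from the output above; the variable and function names are only for illustration.

```python
# Rows copied from the ::findleaks output: CACHE LEAKED BUFFER CALLER
FINDLEAKS = """\
ffffd01932e79008 19329 ffffd01933d211f8 ?
ffffd01932e81008 27420 ffffd019476fff28 ?
ffffd01932e85008 1 ffffd019767e0350 ?
ffffd01932e89008 36 ffffd019443ddf30 ?
ffffd01932e8d008 8424 ffffd019548a73c0 ?
ffffd01932e91008 196 ffffd01954bec418 ?
ffffd01932e95008 14 ffffd01962691258 ?
ffffd01932e99008 31 ffffd0196089cbc8 ?
ffffd01932e9d008 17 ffffd019550e0e80 ?
ffffd01932eb6008 1 ffffd019572ce780 ?
ffffd01932f52008 14 ffffd01933923d60 ?
ffffd01932f5a008 14 ffffd01933911080 ?
ffffd0195600f008 2 ffffd0196f9b7ef0 ?
"""

def parse_findleaks(text):
    """Return (cache_address, leaked_count) pairs from the table rows."""
    rows = []
    for line in text.splitlines():
        cache, leaked, _buffer, _caller = line.split()
        rows.append((cache, int(leaked)))
    return rows

rows = parse_findleaks(FINDLEAKS)
total = sum(leaked for _, leaked in rows)
ranked = sorted(rows, key=lambda row: row[1], reverse=True)

print(total)  # 55499
for cache, leaked in ranked[:3]:
    print(cache, leaked, f"{100 * leaked / total:.1f}%")
```

The sum matches the reported total of 55499 buffers, and the top three caches account for nearly all of it. Whether these are real leaks or just an artifact of the state at panic time, I cannot tell.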
>
>
>
>
-------------------------------------------
smartos-discuss
Archives: https://www.listbox.com/member/archive/184463/=now
RSS Feed: https://www.listbox.com/member/archive/rss/184463/25769125-55cfbc00
Modify Your Subscription:
https://www.listbox.com/member/?member_id=25769125&id_secret=25769125-7688e9fb
Powered by Listbox: http://www.listbox.com