Re: r323412: Panic on boot (slab->us_keg == keg)
On Tue, Sep 12, 2017 at 10:34:00AM +0200, Raphael Kubo da Costa wrote:
> Mark Johnston writes:
>
> > I think the bug is that keg_large_init() doesn't take
> > sizeof(struct uma_slab) into account when setting uk_ppera for the keg.
> > In particular, the bug isn't specific to the bootup process; it only
> > affects internal zones with an item size in the range [4016, 4096].
> >
> > The patch below should fix this - could you give it a try?
>
> I've tried it and can confirm it fixed the panic here.

Thanks, committed as r323544.
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: r323412: Panic on boot (slab->us_keg == keg)
On 12.09.2017 06:35, Mark Johnston wrote:
>> [...]
>> FreeBSD/SMP: 2 package(s) x 14 core(s) x 2 hardware threads
>>
>> Also I determined that it can successfully boot with disabled
>> hyper-threading.
>
> After the change to CACHE_LINE_SIZE, we have
> sizeof(struct uma_zone) == 448 and sizeof(struct uma_cache) == 64. With
> 56 CPUs, we therefore need 4032 bytes per UMA zone, plus 80 bytes for
> the slab header - "internal" zones always keep the slab header in the
> slab itself. That's slightly larger than one page, but the UMA zone
> zone's keg will have uk_ppera == 1. So, when allocating slabzone,
> keg_alloc_slab() will call startup_alloc(uk_ppera * PAGE_SIZE), which
> will allocate 4096 bytes for a structure that is 4032 + 80 = 4112 bytes
> in size.
>
> I think the bug is that keg_large_init() doesn't take
> sizeof(struct uma_slab) into account when setting uk_ppera for the keg.
> In particular, the bug isn't specific to the bootup process; it only
> affects internal zones with an item size in the range [4016, 4096].
>
> The patch below should fix this - could you give it a try?

Hi Mark,

I can confirm that it fixes this panic. Thanks!

--
WBR, Andrey V. Elsukov
Re: r323412: Panic on boot (slab->us_keg == keg)
Mark Johnston writes:

> I think the bug is that keg_large_init() doesn't take
> sizeof(struct uma_slab) into account when setting uk_ppera for the keg.
> In particular, the bug isn't specific to the bootup process; it only
> affects internal zones with an item size in the range [4016, 4096].
>
> The patch below should fix this - could you give it a try?

I've tried it and can confirm it fixed the panic here.
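The decision the patch makes can be modelled outside the kernel. The following is only a sketch of that branch, not the kernel code: `struct keg_model`, `keg_large_init_model()` and the `MODEL_*` constants are hypothetical names, the 80-byte header and 4096-byte page are the figures quoted in this thread, and computing the initial page count by rounding the item size up to whole pages is an assumption about the part of keg_large_init() that the patch does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Figures quoted in the thread, hard-coded here as assumptions. */
#define MODEL_PAGE_SIZE   4096u
#define MODEL_SLAB_HEADER   80u
/* Pointer-alignment mask, standing in for UMA_ALIGN_PTR. */
#define MODEL_ALIGN_MASK  (sizeof(void *) - 1)

/* Hypothetical stand-in for the few keg fields the patch touches. */
struct keg_model {
	size_t   size;     /* item size in bytes (uk_size) */
	bool     internal; /* models UMA_ZFLAG_INTERNAL */
	bool     offpage;  /* models UMA_ZONE_OFFPAGE */
	unsigned ppera;    /* pages per allocation (uk_ppera) */
};

/* Models the patched branch: one item per slab, header in-slab if it fits. */
void keg_large_init_model(struct keg_model *k)
{
	size_t shsize = MODEL_SLAB_HEADER;

	assert(k != NULL && k->size > 0);

	/* Assumption: start with enough whole pages for the item alone. */
	k->ppera = (unsigned)((k->size + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE);

	if (k->offpage)
		return;

	/* Round the header size up to pointer alignment. */
	if (shsize & MODEL_ALIGN_MASK)
		shsize = (shsize & ~MODEL_ALIGN_MASK) + (MODEL_ALIGN_MASK + 1);

	/* Not enough room left in the slab for the header? */
	if ((size_t)MODEL_PAGE_SIZE * k->ppera - k->size < shsize) {
		if (!k->internal)
			k->offpage = true; /* keep the header elsewhere */
		else
			k->ppera++;        /* internal: add a page for the header */
	}
}
```

With these assumed figures, an internal keg with a 4032-byte item ends up with two pages per allocation, while a non-internal keg of the same size keeps one page and moves its header off-page.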
Re: r323412: Panic on boot (slab->us_keg == keg)
On Mon, Sep 11, 2017 at 09:15:51PM +0300, Andrey V. Elsukov wrote:
> On 11.09.2017 15:23, Andrey V. Elsukov wrote:
> > --- trap 0xc, rip = 0x80d84870, rsp = 0x82193970, rbp =
> > 0x821939b0 ---
> > zone_import() at zone_import+0x110/frame 0x821939b0
> > zone_alloc_item() at zone_alloc_item+0x36/frame 0x821939f0
> > uma_startup() at uma_startup+0x1d0/frame 0x82193ae0
> > vm_page_startup() at vm_page_startup+0x34e/frame 0x82193b30
> > vm_mem_init() at vm_mem_init+0x1a/frame 0x82193b50
> > mi_startup() at mi_startup+0x9c/frame 0x82193b70
> > btext() at btext+0x2c
> > Uptime: 1s
>
> I bisected revisions, and the last working is r322988.
> This machine is E5-2660 v4@ based.
>
> [...]
> FreeBSD/SMP: 2 package(s) x 14 core(s) x 2 hardware threads
>
> Also I determined that it can successfully boot with disabled
> hyper-threading.

After the change to CACHE_LINE_SIZE, we have
sizeof(struct uma_zone) == 448 and sizeof(struct uma_cache) == 64. With
56 CPUs, we therefore need 4032 bytes per UMA zone, plus 80 bytes for
the slab header - "internal" zones always keep the slab header in the
slab itself. That's slightly larger than one page, but the UMA zone
zone's keg will have uk_ppera == 1. So, when allocating slabzone,
keg_alloc_slab() will call startup_alloc(uk_ppera * PAGE_SIZE), which
will allocate 4096 bytes for a structure that is 4032 + 80 = 4112 bytes
in size.

I think the bug is that keg_large_init() doesn't take
sizeof(struct uma_slab) into account when setting uk_ppera for the keg.
In particular, the bug isn't specific to the bootup process; it only
affects internal zones with an item size in the range [4016, 4096].

The patch below should fix this - could you give it a try?

diff --git a/sys/vm/uma_core.c b/sys/vm/uma_core.c
index 44c91e66769a..48daeb18f9c3 100644
--- a/sys/vm/uma_core.c
+++ b/sys/vm/uma_core.c
@@ -1306,10 +1306,6 @@ keg_large_init(uma_keg_t keg)
 	keg->uk_ipers = 1;
 	keg->uk_rsize = keg->uk_size;
 
-	/* We can't do OFFPAGE if we're internal, bail out here. */
-	if (keg->uk_flags & UMA_ZFLAG_INTERNAL)
-		return;
-
 	/* Check whether we have enough space to not do OFFPAGE. */
 	if ((keg->uk_flags & UMA_ZONE_OFFPAGE) == 0) {
 		shsize = sizeof(struct uma_slab);
@@ -1317,8 +1313,17 @@ keg_large_init(uma_keg_t keg)
 			shsize = (shsize & ~UMA_ALIGN_PTR) +
 			    (UMA_ALIGN_PTR + 1);
 
-		if ((PAGE_SIZE * keg->uk_ppera) - keg->uk_rsize < shsize)
-			keg->uk_flags |= UMA_ZONE_OFFPAGE;
+		if ((PAGE_SIZE * keg->uk_ppera) - keg->uk_rsize < shsize) {
+			/*
+			 * We can't do offpage if we're internal, in which case
+			 * we need an extra page per allocation to contain the
+			 * slab header.
+			 */
+			if ((keg->uk_flags & UMA_ZFLAG_INTERNAL) == 0)
+				keg->uk_flags |= UMA_ZONE_OFFPAGE;
+			else
+				keg->uk_ppera++;
+		}
 	}
 
 	if ((keg->uk_flags & UMA_ZONE_OFFPAGE) &&
diff --git a/sys/vm/vm_page.c b/sys/vm/vm_page.c
index ee7b93bbd719..477a816b0bd2 100644
--- a/sys/vm/vm_page.c
+++ b/sys/vm/vm_page.c
@@ -475,7 +475,8 @@ vm_page_startup(vm_offset_t vaddr)
 	 * in proportion to the zone structure size.
 	 */
 	pages_per_zone = howmany(sizeof(struct uma_zone) +
-	    sizeof(struct uma_cache) * (mp_maxid + 1), UMA_SLAB_SIZE);
+	    sizeof(struct uma_slab) + sizeof(struct uma_cache) * (mp_maxid + 1),
+	    UMA_SLAB_SIZE);
 	if (pages_per_zone > 1) {
 		/* Reserve more pages so that we don't run out. */
 		boot_pages = UMA_BOOT_PAGES_ZONES * pages_per_zone;
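The size arithmetic in this message can be checked with a few lines of userland C. This is a sketch under stated assumptions: the constants are the values quoted in the message (448, 64 and 80 bytes, 4096-byte pages), not values read from kernel headers, and `howmany_sz()`, `zone_item_bytes()`, `pages_without_header()` and `pages_with_header()` are hypothetical helper names that mirror the old and new `pages_per_zone` expressions in the vm_page.c hunk.

```c
#include <assert.h>
#include <stddef.h>

/* Figures quoted in the message; not read from kernel headers. */
#define PAGE_SIZE_BYTES    4096u /* assumed page size */
#define ZONE_STRUCT_BYTES   448u /* sizeof(struct uma_zone) as quoted */
#define CACHE_BYTES          64u /* sizeof(struct uma_cache) as quoted */
#define SLAB_HEADER_BYTES    80u /* slab header size as quoted */

/* Round-up division, the same idea as the kernel's howmany() macro. */
size_t howmany_sz(size_t x, size_t y)
{
	assert(y > 0);
	return (x + y - 1) / y;
}

/* Bytes for one zone: the zone structure plus one cache per CPU. */
size_t zone_item_bytes(unsigned ncpus)
{
	assert(ncpus > 0);
	return ZONE_STRUCT_BYTES + (size_t)CACHE_BYTES * ncpus;
}

/* Page estimate that ignores the slab header (pre-patch expression). */
size_t pages_without_header(unsigned ncpus)
{
	return howmany_sz(zone_item_bytes(ncpus), PAGE_SIZE_BYTES);
}

/* Page estimate that includes the slab header (post-patch expression). */
size_t pages_with_header(unsigned ncpus)
{
	return howmany_sz(zone_item_bytes(ncpus) + SLAB_HEADER_BYTES,
	    PAGE_SIZE_BYTES);
}
```

For 56 CPUs, `zone_item_bytes(56)` is 4032, `pages_without_header(56)` is 1 and `pages_with_header(56)` is 2, which matches the 4112-byte structure being placed in a 4096-byte allocation.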
Re: r323412: Panic on boot (slab->us_keg == keg)
Raphael Kubo da Costa writes:

> "Andrey V. Elsukov" writes:
>
>> On 11.09.2017 15:23, Andrey V. Elsukov wrote:
>>
>>> --- trap 0xc, rip = 0x80d84870, rsp = 0x82193970, rbp =
>>> 0x821939b0 ---
>>> zone_import() at zone_import+0x110/frame 0x821939b0
>>> zone_alloc_item() at zone_alloc_item+0x36/frame 0x821939f0
>>> uma_startup() at uma_startup+0x1d0/frame 0x82193ae0
>>> vm_page_startup() at vm_page_startup+0x34e/frame 0x82193b30
>>> vm_mem_init() at vm_mem_init+0x1a/frame 0x82193b50
>>> mi_startup() at mi_startup+0x9c/frame 0x82193b70
>>> btext() at btext+0x2c
>>> Uptime: 1s
>>
>> I bisected revisions, and the last working is r322988.
>
> [...]
>
>> Also I determined that it can successfully boot with disabled
>> hyper-threading.
>
> Did you mistype the revision number? r322988 is "rtwn(4): some initial
> preparations for (basic) VHT support" by avos@.

Sorry for the brain fart. I can confirm that reverting r322989 ("Drop
CACHE_LINE_SIZE to 64 bytes on x86") here on top of r323412 allows the
boot to proceed here.
Re: r323412: Panic on boot (slab->us_keg == keg)
On 11.09.2017 21:38, John Baldwin wrote:
> On Monday, September 11, 2017 09:15:51 PM Andrey V. Elsukov wrote:
>> On 11.09.2017 15:23, Andrey V. Elsukov wrote:
>>> --- trap 0xc, rip = 0x80d84870, rsp = 0x82193970, rbp =
>>> 0x821939b0 ---
>>> zone_import() at zone_import+0x110/frame 0x821939b0
>>> zone_alloc_item() at zone_alloc_item+0x36/frame 0x821939f0
>>> uma_startup() at uma_startup+0x1d0/frame 0x82193ae0
>>> vm_page_startup() at vm_page_startup+0x34e/frame 0x82193b30
>>> vm_mem_init() at vm_mem_init+0x1a/frame 0x82193b50
>>> mi_startup() at mi_startup+0x9c/frame 0x82193b70
>>> btext() at btext+0x2c
>>> Uptime: 1s
>>
>> I bisected revisions, and the last working is r322988.
>> This machine is E5-2660 v4@ based.
>
> If you just revert r322988 on a newer tree does it work ok?

r322988 works, reverting r322989 (commit about CACHELINE) does help.

--
WBR, Andrey V. Elsukov
Re: r323412: Panic on boot (slab->us_keg == keg)
"Andrey V. Elsukov" writes:

> On 11.09.2017 15:23, Andrey V. Elsukov wrote:
>
>> --- trap 0xc, rip = 0x80d84870, rsp = 0x82193970, rbp =
>> 0x821939b0 ---
>> zone_import() at zone_import+0x110/frame 0x821939b0
>> zone_alloc_item() at zone_alloc_item+0x36/frame 0x821939f0
>> uma_startup() at uma_startup+0x1d0/frame 0x82193ae0
>> vm_page_startup() at vm_page_startup+0x34e/frame 0x82193b30
>> vm_mem_init() at vm_mem_init+0x1a/frame 0x82193b50
>> mi_startup() at mi_startup+0x9c/frame 0x82193b70
>> btext() at btext+0x2c
>> Uptime: 1s
>
> I bisected revisions, and the last working is r322988.

[...]

> Also I determined that it can successfully boot with disabled
> hyper-threading.

Did you mistype the revision number? r322988 is "rtwn(4): some initial
preparations for (basic) VHT support" by avos@.
Re: r323412: Panic on boot (slab->us_keg == keg)
On Monday, September 11, 2017 09:15:51 PM Andrey V. Elsukov wrote:
> On 11.09.2017 15:23, Andrey V. Elsukov wrote:
> > --- trap 0xc, rip = 0x80d84870, rsp = 0x82193970, rbp =
> > 0x821939b0 ---
> > zone_import() at zone_import+0x110/frame 0x821939b0
> > zone_alloc_item() at zone_alloc_item+0x36/frame 0x821939f0
> > uma_startup() at uma_startup+0x1d0/frame 0x82193ae0
> > vm_page_startup() at vm_page_startup+0x34e/frame 0x82193b30
> > vm_mem_init() at vm_mem_init+0x1a/frame 0x82193b50
> > mi_startup() at mi_startup+0x9c/frame 0x82193b70
> > btext() at btext+0x2c
> > Uptime: 1s
>
> I bisected revisions, and the last working is r322988.
> This machine is E5-2660 v4@ based.

If you just revert r322988 on a newer tree does it work ok?

--
John Baldwin
Re: r323412: Panic on boot (slab->us_keg == keg)
On 11.09.2017 15:23, Andrey V. Elsukov wrote:
> --- trap 0xc, rip = 0x80d84870, rsp = 0x82193970, rbp =
> 0x821939b0 ---
> zone_import() at zone_import+0x110/frame 0x821939b0
> zone_alloc_item() at zone_alloc_item+0x36/frame 0x821939f0
> uma_startup() at uma_startup+0x1d0/frame 0x82193ae0
> vm_page_startup() at vm_page_startup+0x34e/frame 0x82193b30
> vm_mem_init() at vm_mem_init+0x1a/frame 0x82193b50
> mi_startup() at mi_startup+0x9c/frame 0x82193b70
> btext() at btext+0x2c
> Uptime: 1s

I bisected revisions, and the last working is r322988.
This machine is E5-2660 v4@ based.

CPU: Intel(R) Xeon(R) CPU E5-2660 v4@ 2.00GHz (2000.04-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x406f1  Family=0x6  Model=0x4f  Stepping=1
  Features=0xbfebfbff
  Features2=0x7ffefbff
  AMD Features=0x2c100800
  AMD Features2=0x121
  Structured Extended Features=0x21cbfbb
  XSAVE Features=0x1
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
  TSC: P-state invariant, performance statistics
real memory  = 68719476736 (65536 MB)
avail memory = 66562076672 (63478 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 56 CPUs
FreeBSD/SMP: 2 package(s) x 14 core(s) x 2 hardware threads

Also I determined that it can successfully boot with disabled
hyper-threading.

--
WBR, Andrey V. Elsukov
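One possible reading of why this 56-CPU machine boots with hyper-threading disabled is that the per-zone structure then fits in a single page again. The sketch below checks that with the sizes quoted elsewhere in this thread (448-byte zone, 64 bytes per CPU cache, 80-byte slab header, 4096-byte page). It is only an illustration: `zone_fits_one_page()` and `max_cpus_one_page()` are hypothetical helpers, and it assumes that disabling hyper-threading leaves 28 CPUs visible to the allocator, which the thread does not state. It also does not explain the separate QEMU report, whose CPU count is not given.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Sizes quoted in the thread, hard-coded here as assumptions. */
#define PAGE_BYTES           4096u
#define ZONE_BYTES            448u
#define PER_CPU_CACHE_BYTES    64u
#define SLAB_HDR_BYTES         80u

/* True if a zone for ncpus CPUs plus its in-slab header fits in one page. */
bool zone_fits_one_page(unsigned ncpus)
{
	size_t need = ZONE_BYTES + (size_t)PER_CPU_CACHE_BYTES * ncpus +
	    SLAB_HDR_BYTES;

	return need <= PAGE_BYTES;
}

/* Largest CPU count for which the zone and header still share one page. */
unsigned max_cpus_one_page(void)
{
	unsigned n = (PAGE_BYTES - ZONE_BYTES - SLAB_HDR_BYTES) /
	    PER_CPU_CACHE_BYTES;

	/* Sanity check: n fits and n + 1 does not. */
	assert(zone_fits_one_page(n) && !zone_fits_one_page(n + 1));
	return n;
}
```

Under these assumptions 28 CPUs need 448 + 28 * 64 + 80 = 2320 bytes, which fits, and the largest count that still fits is 55, so 56 CPUs is the first configuration that overflows the page.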
Re: r323412: Panic on boot (slab->us_keg == keg)
On 11.09.2017 11:31, Raphael Kubo da Costa wrote:
> I've recently tried to upgrade a HEAD VM (running on a Linux host with
> QEMU) from r321082 to r323412.
>
> The new kernel panics right after I try to boot into it with:
>
> panic: Assertion slab->us_keg == keg failed at /usr/src/sys/vm/uma_core.c:2285
> cpuid = 0
> time = 1
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0x81c4d780
> vpanic() at vpanic+0x19c/frame 0x81c4d800
> kassert_panic() at kassert_panic+0x126/frame 0x81c4d870
> keg_fetch_slab() at keg_fetch_slab+0x2a9/frame 0x81c4d8c0
> zone_fetch_slab() at zone_fetch_slab+0x51/frame 0x81c4d8f0
> zone_import() at zone_import+0x4f/frame 0x81c4d960
> zone_alloc_item() at zone_alloc_item+0x36/frame 0x81c4d9a0
> uma_zcreate() at uma_zcreate+0x3d3/frame 0x81c4da40
> uma_startup() at uma_startup+0x147/frame 0x81c4dae0
> vm_page_startup() at vm_page_startup+0x34e/frame 0x81c4db30
> vm_mem_init() at vm_mem_init+0x1a/frame 0x81c4db50
> mi_startup() at mi_startup+0x9c/frame 0x81c4db70
> btext() at btext+0x2c
> KDB: enter: panic
> [ thread 0 pid 0 tid 0 ]

I have r323177 based system without INVARIANTS that panics at netboot
with similar trace:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x84
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0x80d84870
stack pointer           = 0x28:0x82193970
frame pointer           = 0x28:0x821939b0
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 0 ()
trap number             = 12
panic: page fault
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0x82193550
vpanic() at vpanic+0x19c/frame 0x821935d0
panic() at panic+0x43/frame 0x82193630
trap_fatal() at trap_fatal+0x34d/frame 0x82193680
trap_pfault() at trap_pfault+0x49/frame 0x821936e0
trap() at trap+0x2a9/frame 0x821938a0
calltrap() at calltrap+0x8/frame 0x821938a0
--- trap 0xc, rip = 0x80d84870, rsp = 0x82193970, rbp = 0x821939b0 ---
zone_import() at zone_import+0x110/frame 0x821939b0
zone_alloc_item() at zone_alloc_item+0x36/frame 0x821939f0
uma_startup() at uma_startup+0x1d0/frame 0x82193ae0
vm_page_startup() at vm_page_startup+0x34e/frame 0x82193b30
vm_mem_init() at vm_mem_init+0x1a/frame 0x82193b50
mi_startup() at mi_startup+0x9c/frame 0x82193b70
btext() at btext+0x2c
Uptime: 1s

--
WBR, Andrey V. Elsukov