On 11/24/15 03:31, Steven Chamberlain wrote:
Hi!

Would anyone like to try this change?  It's early to say if this
definitely fixed the issue for me, but it looks promising:

--- sys/kern/subr_pool.c
+++ sys/kern/subr_pool.c
@@ -259,5 +259,5 @@ pool_init(struct pool *pp, size_t size,
        if (pgsize - (size * items) > sizeof(struct pool_item_header)) {
                off = pgsize - sizeof(struct pool_item_header);
-       } else if (sizeof(struct pool_item_header) * 2 >= size) {
+       } else if (sizeof(struct pool_item_header) * 8 >= size) {
                off = pgsize - sizeof(struct pool_item_header);
                items = off / size;

Prior to v1.149, there was a threshold of I think PAGE_SIZE/16=512
on sparc64;  pools for an item size greater than that would use an in-
page header:

         * Decide whether to put the page header off page to avoid      
         * wasting too large a part of the page. Off-page page headers  
         * go into an RB tree, so we can match a returned item with     
         * its header based on the page address.        
         * We use 1/16 of the page size as the threshold (XXX: tune)    
         */     
        if (pp->pr_size < palloc->pa_pagesz/16 && pp->pr_size < PAGE_SIZE) {    
 
                /* Use the end of the page for the page header */

In v1.149 the threshold became sizeof(struct pool_item_header)*2=224 on
sparc64, so dma256 and dma512 pools would no longer use an in-page
header, but be able to accommodate more items per page as a result.

The adjustment above simply reverts that behavioural change.  It
probably never should have broken anything, other than slight
performance change, but it seems like it triggered some maybe pre-
existing bug elsewhere.

I've already ruled out the unsigned int arithmetic I've mentioned thus
far, with KASSERT()s that didn't trigger even when the crash happens.

And I've already tried to rule out cache colouring by forcing
pp->pr_maxcolors=0 to no avail.  (Since it was only used in pools
with an in-page header, it could have been related).

p.s. I would maybe even test if this helps with tmpfs issues seen on
armv7 and such, as I think that was first mentioned around the time of
this change, and since it uses pool(9) for its file metadata.

Regards,


Thanks.

Currently building a kernel on:
OpenBSD 5.8-current (GENERIC) #762: Tue Oct 20 15:52:00 MDT 2015
    [email protected]:/usr/src/sys/arch/sparc64/compile/GENERIC
real mem = 134217728 (128MB)
avail mem = 117440512 (112MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root: Sun Ultra 5/10 UPA/PCI (UltraSPARC-IIi 270MHz)
cpu0 at mainbus0: SUNW,UltraSPARC-IIi (rev 1.3) @ 270 MHz
cpu0: physical 16K instruction (32 b/l), 16K data (32 b/l), 256K external (64 b/l)

for my V100 which has all the issues:

OpenBSD 5.8-current (GENERIC) #801: Wed Nov 18 16:37:51 MST 2015
    [email protected]:/usr/src/sys/arch/sparc64/compile/GENERIC
real mem = 268435456 (256MB)
avail mem = 248193024 (236MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root: Sun Fire V100 (UltraSPARC-IIe 500MHz)
cpu0 at mainbus0: SUNW,UltraSPARC-IIe (rev 1.4) @ 500 MHz
cpu0: physical 16K instruction (32 b/l), 16K data (32 b/l), 256K external (64 b/l)

Will let you know how it goes.

Cheers

Fred


Reply via email to