Hi Steven,
I don't want to jump the gun, but after ~3 hours of heavy network i/o (ping -f
to and from, cvs checkout, ftp ... the stuff that crashed the box previously),
it is stable.
Thank you very much!
I will try to torture the box some more to see if it would behave ...
Jan
On Tue, Nov 24, 2015 at 03:31:53AM +0000, Steven Chamberlain wrote:
> Hi!
>
> Would anyone like to try this change? It's early to say if this
> definitely fixed the issue for me, but it looks promising:
>
> --- sys/kern/subr_pool.c
> +++ sys/kern/subr_pool.c
> @@ -259,5 +259,5 @@ pool_init(struct pool *pp, size_t size,
>  	if (pgsize - (size * items) > sizeof(struct pool_item_header)) {
>  		off = pgsize - sizeof(struct pool_item_header);
> -	} else if (sizeof(struct pool_item_header) * 2 >= size) {
> +	} else if (sizeof(struct pool_item_header) * 8 >= size) {
>  		off = pgsize - sizeof(struct pool_item_header);
>  		items = off / size;
>
> Prior to v1.149, the threshold was, I think, PAGE_SIZE/16=512 on
> sparc64; pools with an item size smaller than that would use an
> in-page header:
>
> * Decide whether to put the page header off page to avoid
> * wasting too large a part of the page. Off-page page headers
> * go into an RB tree, so we can match a returned item with
> * its header based on the page address.
> * We use 1/16 of the page size as the threshold (XXX: tune)
> */
> if (pp->pr_size < palloc->pa_pagesz/16 && pp->pr_size < PAGE_SIZE) {
>
> /* Use the end of the page for the page header */
>
> In v1.149 the threshold became sizeof(struct pool_item_header)*2=224 on
> sparc64, so the dma256 and dma512 pools would no longer use an in-page
> header, but would be able to fit more items per page as a result.
>
> The adjustment above simply reverts that behavioural change. The
> v1.149 change probably should never have broken anything, beyond a
> slight performance difference, but it seems to have triggered some
> possibly pre-existing bug elsewhere.
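
To make the threshold arithmetic concrete, here is a rough userland sketch of the branch touched by the diff. It is not the kernel code: header_in_page() and print_layout() are hypothetical helpers, PGSIZE=8192 and PIH_SIZE=112 are only inferred from the figures quoted above (512*16 and 224/2), and the final "off-page" fall-through is an assumption about the part of pool_init() that the diff does not show.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed sparc64 values, inferred from the numbers in this thread. */
#define PGSIZE   8192u	/* PAGE_SIZE: 512 * 16 */
#define PIH_SIZE 112u	/* sizeof(struct pool_item_header): 224 / 2 */

/*
 * Hypothetical model of the quoted branch. "mult" is the multiplier
 * in the else-if (2 in v1.149, 8 with the proposed change).
 * Returns 1 if the page header lives in the page, 0 otherwise, and
 * stores the resulting number of items per page in *itemsp.
 */
int
header_in_page(unsigned int size, unsigned int mult, unsigned int *itemsp)
{
	unsigned int items, off;

	assert(size > 0 && size <= PGSIZE);
	items = PGSIZE / size;

	if (PGSIZE - (size * items) > PIH_SIZE) {
		/* Enough slack after the last item to hold the header. */
		*itemsp = items;
		return 1;
	} else if (PIH_SIZE * mult >= size) {
		/* Small item: give up some item space for the header. */
		off = PGSIZE - PIH_SIZE;
		*itemsp = off / size;
		return 1;
	}

	/* Assumed fall-through: header kept off page, full page for items. */
	*itemsp = items;
	return 0;
}

/* Print the layout of one item size under both multipliers. */
void
print_layout(unsigned int size)
{
	unsigned int mult, n;
	int inpage;

	for (mult = 2; mult <= 8; mult += 6) {
		inpage = header_in_page(size, mult, &n);
		printf("size %u, *%u: %s header, %u items/page\n",
		    size, mult, inpage ? "in-page" : "off-page", n);
	}
}
```

Under these assumptions, calling print_layout(256) and print_layout(512) from a small main() shows both sizes moving from an off-page header with *2 back to an in-page header with *8, at the cost of one item per page.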
>
> I've already ruled out the unsigned int arithmetic I've mentioned thus
> far, with KASSERT()s that didn't trigger even when the crash happened.
>
> And I've already tried to rule out cache colouring by forcing
> pp->pr_maxcolors=0 to no avail. (Since it was only used in pools
> with an in-page header, it could have been related).
>
> P.S. It may even be worth testing whether this helps with the tmpfs
> issues seen on armv7 and such: I think those were first mentioned
> around the time of this change, and tmpfs uses pool(9) for its file
> metadata.
>
> Regards,
> --
> Steven Chamberlain
--
Be the change you want to see in the world.