kmem pool for half pagesize is very wasteful
hi folks.

while we were debugging some memory starvation issues i noticed that the "kmem-02048" pool only has 1 item per page on a system with 4KiB pages, and similarly "kmem-04096" on 8KiB page systems. i assume this also occurs on 16KiB page systems for the "kmem-08192" pool.

this happens because the pool redzone increases the item size from 2048 bytes to 2048+CACHE_LINE_SIZE bytes. this feels extremely wasteful to me. for the common 4K page size case, that's 2048+64 bytes of redzone and useful data, plus the 64 bytes lost to redzone alignment, and thus 1920 bytes of lost space per page. the lost space is similarly just under 1/2 for large page size systems.

for the smaller kmem pools the lost space is also not great, but it's significantly smaller than the above: 3 items fit per 4KiB page (1024*3 + 2*64*3 = 3456 bytes of useful info plus redzone), losing only 640 bytes. going from 46.8% lost to 15.6% lost, while still not great, seems like a reasonable compromise that almost halves the memory required for the kmem-02048 pool.

this patch avoids this problem:

https://www.netbsd.org/~mrg/poolwaste.diff

comments?

.mrg.
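ps: to make the arithmetic concrete, here's a tiny user-space sketch of the numbers above (assuming, as those numbers imply, 64 bytes of redzone plus 64 bytes of alignment padding per item -- this is not the actual pool code, just the bookkeeping):

/*
 * back-of-the-envelope reproduction of the figures above, assuming
 * 64 bytes of redzone plus 64 bytes of alignment padding per item.
 * not the kernel's pool code.
 */
#include <stdio.h>

int
main(void)
{
	const unsigned int page = 4096;		/* 4KiB page */
	const unsigned int overhead = 64 + 64;	/* redzone + alignment */
	const unsigned int sizes[] = { 1024, 2048 };

	for (int i = 0; i < 2; i++) {
		unsigned int item = sizes[i] + overhead;	/* padded item */
		unsigned int n = page / item;			/* items per page */
		unsigned int lost = page - n * item;		/* unusable slack */

		printf("kmem-%05u: %u per page, %u bytes lost (%.1f%%)\n",
		    sizes[i], n, lost, 100.0 * lost / page);
	}
	return 0;
}

it prints 15.6% for kmem-01024 and 46.9% for kmem-02048 (the 46.8% i quoted above is the same 46.875 truncated rather than rounded).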
RE: timeouts connecting to pgsql database
Thanks Rob. Is there any other recommendation? Using log lets us recover quickly after a server crash.

-----Original Message-----
From: Robert Swindells
Sent: Saturday, February 20, 2021 3:15 PM
To: Derrick Lobo
Cc: tech-kern@NetBSD.org
Subject: Re: timeouts connecting to pgsql database

Derrick Lobo wrote:
>Robert Swindells wrote:
>>Derrick Lobo wrote:
>>>Not sure if anyone else has experienced the below, postgres configs
>>>can be shared if needed, we have a few database servers running on
>>>7.1.2 and 9.
>>>
>>>We recently noticed some problems with timeouts on some postgres
>>>database servers. The machines don't appear to be heavily loaded,
>>>although they are being used steadily. What we're seeing is that the
>>>machine is working fine (no swapping, load average below 1), and then
>>>it doesn't accept database connections for about 10-20 seconds, and
>>>any queries on active connections fail to return for that same time
>>>period. I tried running these commands in a screen to get a better
>>>sense of the system state when the problem occurs:
>>
>>What filesystem options are you using for wherever the database files
>>are located? Are you using wapbl(4)?
>
>Rw,log,noatime are the options used

Thought it might be. I would not run a database on a filesystem with
'log' enabled; I think you will find that the delay is when it is
flushing the log to disk.
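One way to act on that advice while keeping fast crash recovery for the rest of the system is to give the database its own partition mounted without 'log'. A hypothetical /etc/fstab sketch (device names and mount points are made up for illustration):

/dev/wd0a  /        ffs  rw,log,noatime  1 1
/dev/wd0e  /dbdata  ffs  rw,noatime      1 2

The root filesystem keeps WAPBL for quick recovery after a crash; only the database partition drops it, since postgres already does its own write-ahead logging.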
Re: Check in cpu_switchto triggering crash with PARANOIA on
Oh I see. Thanks for the help. Looks like the trap condition should be
spl < IPL_SCHED instead of spl != IPL_SCHED, i.e. the precondition is
that the IPL is at least IPL_SCHED, not exactly IPL_SCHED. Seems it's
just that the check is bad after all.

Index: arch/mips/mips/locore.S
===================================================================
RCS file: /cvsroot/src/sys/arch/mips/mips/locore.S,v
retrieving revision 1.226
diff -u -r1.226 locore.S
--- arch/mips/mips/locore.S	26 Sep 2020 08:21:10 -0000	1.226
+++ arch/mips/mips/locore.S	22 Feb 2021 09:49:27 -0000
@@ -224,7 +224,7 @@
 	PTR_L	v0, L_CPU(MIPS_CURLWP)
 	INT_L	v1, CPU_INFO_CPL(v0)
 #if __mips >= 32
-	tnei	v1, IPL_SCHED
+	tltiu	v1, IPL_SCHED
 #else
 	li	v0, IPL_SCHED
 10:	bne	v0, v1, 10b

On Sun, Feb 21, 2021 at 11:22 PM Nick Hudson wrote:

> On 22/02/2021 04:15, Alan Fisher wrote:
> > Hello,
> >
> > I've been trying to get the evbmips port working on a new chip
> > recently, and in the process I've tried building the kernel with
> > PARANOIA enabled. This has resulted in a crash on startup, and I am
> > wondering if it is surfacing a bug. Here is what's happening:
> >
> > Some code under an #ifdef PARANOIA in cpu_switchto checks whether
> > the IPL is IPL_SCHED, and if not, throws a trap. According to the
> > manpage for cpu_switchto(9), the current IPL being IPL_SCHED is a
> > precondition for cpu_switchto(), so this check seems to make sense.
> > The call stack looks like this:
> >
> > cpu_switchto - this causes a trap when the check fails - manpage
> >     says IPL must be IPL_SCHED
> > mi_switch - manpage says IPL must be IPL_SCHED
> > yield - manpage doesn't say anything about IPL_SCHED, and IPL is
> >     not changed in this routine
>
>  276 yield(void)
>  277 {
>  278 	struct lwp *l = curlwp;
>  279
>  280 	KERNEL_UNLOCK_ALL(l, &l->l_biglocks);
>  281 	lwp_lock(l);
>
> lwp_lock will raise the IPL to IPL_SCHED. spc_{lock,mutex} are used
> by lwp_lock (maybe others).
>
> HTH,
>
> Nick
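To spell out what the patch changes, here is a user-space C sketch of the two checks (the IPL_SCHED value and the assert framing are illustrative only, not the actual locore code; tnei traps on "not equal", tltiu traps on unsigned "less than"):

/*
 * Sketch of the old vs. new PARANOIA check in cpu_switchto.
 *   old: tnei  v1, IPL_SCHED -- trap when cpl != IPL_SCHED
 *   new: tltiu v1, IPL_SCHED -- trap when cpl <  IPL_SCHED
 */
#include <assert.h>

#define IPL_SCHED 6	/* illustrative value, not the real header */

static void
old_check(unsigned int cpl)
{
	assert(cpl == IPL_SCHED);	/* also fires for callers above IPL_SCHED */
}

static void
new_check(unsigned int cpl)
{
	assert(cpl >= IPL_SCHED);	/* IPL_SCHED or higher is fine */
}

int
main(void)
{
	new_check(IPL_SCHED);		/* passes */
	new_check(IPL_SCHED + 1);	/* passes: e.g. a caller at a higher IPL */
	old_check(IPL_SCHED);		/* passes only when exactly IPL_SCHED */
	/* old_check(IPL_SCHED + 1) would abort -- the spurious trap */
	return 0;
}

Using an unsigned compare (tltiu) is safe here because IPL values are small non-negative integers.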