Re: Periodic messages on NetBSD-9 and -current: xennet0: rx no cluster
> the request count on the mclpl line is incrementing at a pretty fast rate

Maybe you're running into the same problem as me (see the "mbuf cluster leak?" thread on tech-net).

Try a kernel with MBUFTRACE. If that shows you (via netstat -mss) a large number of tx bufs on a particular vlan interface, try destroying and re-creating that interface (and reloading ipfilter in case you're using it). For me, that stops the allocations from rising (for a while). I still don't know what triggers it, though.
Re: Periodic messages on NetBSD-9 and -current: xennet0: rx no cluster
Hello.

Looking at my vmstat -m output, I see:

mclpl 211228146028146 14109 1407435 187 0 524288 35

I see no failures, and the number of nmbclusters is 524288. Yet this machine has displayed this message about 6 times since it was rebooted about 5 hours ago.

Am I missing something?

-thanks
-Brian
Re: Periodic messages on NetBSD-9 and -current: xennet0: rx no cluster
Hello.

One strange thing I notice on this particular system, which differs from the other systems I'm running, is that the request count on the mclpl line is incrementing at a pretty fast rate, whereas on other systems the request rate is more or less constant over time, with occasional bursts of requests. Even so, no failures are noted, even though the driver says it has failed to get an rx cluster a few times since the system was booted. For example, since the last message I wrote, the mclpl line now looks like:

mclpl 211229471029440 14801 1476239 187 0 524288 8

Maybe this incrementing thing isn't a big deal, but it jumps right out as being different.

-thanks
-Brian
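To compare how fast the mclpl line moves on different machines, one option is to log it at a fixed interval. The following is a minimal Python sketch, not something posted in the thread: the helper names find_pool_line and watch, the 60-second interval, and the sample count are all assumptions made for illustration. It deliberately prints the raw line instead of splitting it into columns, because the numeric columns in the output quoted above run together.

```python
import subprocess
import time


def find_pool_line(vmstat_output, pool_name):
    """Return the first line of `vmstat -m` text whose first word is pool_name."""
    for line in vmstat_output.splitlines():
        fields = line.split()
        if fields and fields[0] == pool_name:
            return line.strip()
    return None  # pool not present in this output


def watch(pool_name="mclpl", interval=60, samples=5):
    """Print a timestamped copy of the pool's line every `interval` seconds.

    Hypothetical helper: assumes a `vmstat` binary that accepts -m is on PATH.
    """
    for _ in range(samples):
        out = subprocess.run(["vmstat", "-m"], capture_output=True,
                             text=True, check=True).stdout
        print(time.strftime("%H:%M:%S"), find_pool_line(out, pool_name))
        time.sleep(interval)
```

Calling watch() on both a misbehaving and a well-behaved domU, and comparing the two logs, would show whether the request counter really climbs at a different rate.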
Re: Periodic messages on NetBSD-9 and -current: xennet0: rx no cluster
Hello.

Looking at the if_xennet_xenbus.c file, I see where the if_xennetrxbuf_cache is initialized, but I don't see where data is put into it before it's requested. Is the idea that the items in the cache are supposed to be provided by the backend, i.e. the dom0? Is it possible that dom0 isn't providing enough rx requests to satisfy the traffic it's sending us?

I think I understand what's supposed to happen once traffic begins flowing: rx requests come in, if_xennet_xenbus processes them and pushes them back into the if_xennetrxbuf_cache cache. What I don't understand is how the initial cache gets populated with free rx requests to use in order to get things started.

-thanks
-Brian
Re: Periodic messages on NetBSD-9 and -current: xennet0: rx no cluster
On Thu, Jun 23, 2022 at 01:48:55PM -0700, Brian Buhrow wrote:
> Hello.
>
> Looking at the if_xennet_xenbus.c file, I see where the
> if_xennetrxbuf_cache is initialized, but I don't see where data is put
> into it before it's requested. Is the idea that the items in the cache
> are supposed to be provided by the backend, i.e. the dom0? Is it
> possible that dom0 isn't providing enough rx requests to satisfy the
> traffic it's sending us?
>
> I think I understand what's supposed to happen once traffic begins
> flowing: rx requests come in, if_xennet_xenbus processes them and
> pushes them back into the if_xennetrxbuf_cache cache. What I don't
> understand is how the initial cache gets populated with free rx
> requests to use in order to get things started.

A pool cache has a backing pool. If there's no item in the pool cache, it gets some memory from its backing pool. The point of the cache here is to keep the physical address of items around, so it doesn't have to be computed again.

--
Manuel Bouyer
     NetBSD: 26 years of experience will always make the difference
--
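The cache-in-front-of-a-pool behaviour described above can be modelled in a few lines. This is a userland toy in Python, not the NetBSD pool_cache(9) API and not the driver's code: the class names BackingPool and PoolCache, the limit parameter, and the fake paddr field are all assumptions made for illustration. It shows why nothing needs to pre-populate the cache (the first get falls through to the backing pool) and how a returned item keeps its already-computed address.

```python
class BackingPool:
    """Toy stand-in for a memory pool that hands out numbered buffers."""

    def __init__(self, limit=None):
        self.limit = limit          # None models "no limit"
        self.allocated = 0

    def get(self):
        if self.limit is not None and self.allocated >= self.limit:
            return None             # allocation failure
        self.allocated += 1
        # Pretend that working out the physical address is the costly step.
        return {"id": self.allocated, "paddr": 0x1000 * self.allocated}


class PoolCache:
    """Toy cache of ready-to-use items in front of a BackingPool."""

    def __init__(self, pool):
        self.pool = pool
        self.free = []              # items returned by the user, kept intact
        self.hits = 0
        self.misses = 0

    def get(self):
        if self.free:
            self.hits += 1
            return self.free.pop()  # reuse: paddr is already known
        self.misses += 1
        return self.pool.get()      # cache empty: fall back to the pool

    def put(self, item):
        self.free.append(item)      # keep the item, paddr included


pool = BackingPool()
cache = PoolCache(pool)
first = cache.get()     # cache starts empty, so this comes from the pool
cache.put(first)        # the consumer is done with the buffer
second = cache.get()    # served from the cache, no new pool allocation
print(first is second, cache.hits, cache.misses, pool.allocated)
```

In this model a failed get() only happens when the backing pool itself cannot supply memory; the limit argument exists purely so the toy can exercise that path.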
Re: Periodic messages on NetBSD-9 and -current: xennet0: rx no cluster
On Thu, Jun 23, 2022 at 12:54:59PM -0700, Brian Buhrow wrote:
> Hello.
>
> Looking at my vmstat -m output, I see:
>
> mclpl 211228146028146 14109 1407435 187 0 524288 35
>
> I see no failures, and the number of nmbclusters is 524288. Yet this
> machine has displayed this message about 6 times since it was rebooted
> about 5 hours ago.
>
> Am I missing something?

OK, so this is -current; it is the if_xennetrxbuf_cache pool cache which is failing. This one has no limits.

--
Manuel Bouyer
     NetBSD: 26 years of experience will always make the difference
--
Re: Periodic messages on NetBSD-9 and -current: xennet0: rx no cluster
On Thu, Jun 23, 2022 at 12:29:11PM -0700, Brian Buhrow wrote:
> Hello.
>
> I'm running a number of NetBSD-9 and -current (as of 99.77) amd64 domU
> machines on a couple of different servers with FreeBSD as dom0. I'm
> getting the following messages from the kernel:
>
> xennet0: rx no cluster
>
> Much of the time, these messages seem harmless, but occasionally the
> network locks up on machines that display this message.
>
> Looking at the source code, I get that this is a pool allocation
> failure in if_xennet_xenbus.c, but I don't understand which memory
> resource it's running out of and whether there is a way to increase
> that resource. In general, the domUs in question seem to have plenty
> of memory, and I don't see a lot of memory pressure for other tasks on
> the systems.
>
> Has anyone else seen these messages on their domU machines, and does
> anyone have ideas on how to correct the issue?

It's running out of mbuf clusters; this is the mclpl line in vmstat -m.
You can try increasing kern.mbuf.nmbclusters or, if that fails, rebuilding a kernel with
options NMBCLUSTERS=
e.g.
options NMBCLUSTERS=65536

--
Manuel Bouyer
     NetBSD: 26 years of experience will always make the difference
--
Periodic messages on NetBSD-9 and -current: xennet0: rx no cluster
Hello.

I'm running a number of NetBSD-9 and -current (as of 99.77) amd64 domU machines on a couple of different servers with FreeBSD as dom0. I'm getting the following messages from the kernel:

xennet0: rx no cluster

Much of the time, these messages seem harmless, but occasionally the network locks up on machines that display this message.

Looking at the source code, I get that this is a pool allocation failure in if_xennet_xenbus.c, but I don't understand which memory resource it's running out of and whether there is a way to increase that resource. In general, the domUs in question seem to have plenty of memory, and I don't see a lot of memory pressure for other tasks on the systems.

Has anyone else seen these messages on their domU machines, and does anyone have ideas on how to correct the issue?

-thanks
-Brian