Re: "zfs send" freezes system (was: Re: pgdaemon high CPU consumption)

2022-07-28 Thread Chuck Silvers
On Tue, Jul 19, 2022 at 08:46:07AM +0200, Matthias Petermann wrote:
> Hello,
> 
> On 13.07.22 12:30, Matthias Petermann wrote:
> 
> > I can now confirm that reverting the patch also solved my problem. Of
> > course I first fell into the trap, because I had not considered that the
> > ZFS code is loaded as a module and had only changed the kernel. As a
> > result, it looked at first as if this would not help. Finally it did...I
> > am now glad that I can use zfs send again in this way. Previously this
> > reproducibly led to a crash, which meant I could not make backups. This is
> > critical for me and I would like to support tests regarding this.
> > 
> > In contrast to the PR, there are hardly any xcalls in my use case -
> > however, my system only has 4 CPU cores, 2 of which are physical.
> > 
> > 
> > Many greetings
> > Matthias
> > 
> 
> Around one week after removing the patch, my system with ZFS is behaving
> "normally" for the most part and the freezes have disappeared. What is the
> recommended way forward given the 10 branch? If it is not foreseeable that
> the underlying problem can be solved soon, would it also be an option to
> revert the patch in the sources to get at least stable behavior? Not just as
> an aside, I would also be interested in whether this "zfs send" problem
> occurs in general, or whether certain hardware configurations make it more
> likely.
> 
> Kind regards
> Matthias


hi, sorry for the delay in getting to this.

what is happening here is that the pagedaemon is hitting the check for
"uvm_km_va_starved_p()", which tries to keep the usage of kernel memory
below 90% of the virtual space available for kernel memory.  the checks that
I changed (effectively removed for 64-bit kernels) in that previous patch
tried to keep the ARC kernel memory usage below 75% of the kernel virtual space.
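
for reference, the check in question is essentially the following
(paraphrased from sys/uvm/uvm_km.c from memory, so treat it as a sketch
rather than a verbatim copy): the kmem arena counts as starved once less
than 10% of it is free, which is the 90% threshold mentioned above.

```
bool
uvm_km_va_starved_p(void)
{
	vmem_size_t total;
	vmem_size_t free;

	if (kmem_arena == NULL)
		return false;

	/* how much of the kmem arena is in use vs. still free */
	total = vmem_size(kmem_arena, VMEM_ALLOC|VMEM_FREE);
	free = vmem_size(kmem_arena, VMEM_FREE);

	/* starved once less than 10% of the arena remains free */
	return (free < (total / 10));
}
```
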
on other OSs that support ZFS, the kernel allocates enough virtual space for
the kernel to be able to allocate almost all of RAM for itself if it wants,
but on netbsd we have this calculation in kmeminit_nkmempages():


#if defined(KMSAN)
	npages = (physmem / 8);
#elif defined(PMAP_MAP_POOLPAGE)
	npages = (physmem / 4);
#else
	npages = (physmem / 3) * 2;
#endif /* defined(PMAP_MAP_POOLPAGE) */

#ifndef NKMEMPAGES_MAX_UNLIMITED
	if (npages > NKMEMPAGES_MAX)
		npages = NKMEMPAGES_MAX;
#endif



this limits the amount of kernel memory to 1/4 of RAM on 64-bit platforms.
PMAP_MAP_POOLPAGE is for accessing pool objects that are smaller than a page
using a direct-mapped region of virtual addresses.  all 64-bit kernels can
do this... though it looks like sparc64 doesn't do this for such pool 
allocations
even though it could?  weird.

most 64-bit kernels also define NKMEMPAGES_MAX_UNLIMITED to indicate that
no arbitrary fixed limit should be imposed on kernel memory usage.
though again not all platforms that could define this actually do.
this time it's the mips kernels that don't enable this one.
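
(a port opts in simply by defining the macro in its machine-dependent
param.h; the line below is illustrative only, not a verbatim copy of any
particular port's header.)

```
/* illustrative only - the real definition lives in <machine/param.h> */
#define	NKMEMPAGES_MAX_UNLIMITED	1	/* no fixed cap on kmem pages */
```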

for ZFS, the memory used for the ARC cache is allocated through pools
but the allocation sizes are almost all larger than a page,
so basically none of these allocations will be able to use the direct map,
and instead they will all have to allocate kernel virtual space.
I don't think it makes sense for the kernel to arbitrarily limit
the ZFS ARC cache to 1/4 of RAM just because that's how much virtual space
is made available for kernel memory mappings, so instead I think we should
increase the size of the kernel virtual space on 64-bit kernels to support
mapping all of RAM, something like the attached patch.
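
to make the numbers concrete, here is a small user-space illustration of
that calculation; the 16 GiB machine and 4 KiB page size are example
values I picked, not taken from any system in this thread:

```
/* hypothetical numbers: the current 1/4-of-RAM cap vs. the patched limit */
#include <stdio.h>

int
main(void)
{
	const long long page = 4096;			/* assumed PAGE_SIZE */
	const long long physmem = (16LL << 30) / page;	/* 16 GiB of RAM, in pages */
	const long long npages_now = physmem / 4;	/* PMAP_MAP_POOLPAGE case today */
	const long long npages_patched = physmem;	/* npages = physmem with the patch */

	printf("kmem VA limit today:   %lld GiB\n", (npages_now * page) >> 30);
	printf("kmem VA limit patched: %lld GiB\n", (npages_patched * page) >> 30);
	return 0;
}
```

so on that machine the ARC, which has to live in this virtual space, is
effectively capped around 4 GiB today, while with the patch the kernel
could map all 16 GiB if it needed to.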

however even with this change, reading a bunch of data into the ZFS ARC
still results in the system hanging, this time due to running out of
physical memory.  there are other mechanisms that ZFS also uses to try to
control its memory usage, and some part of that is apparently not
working either.  I'm continuing to look into this.

-Chuck
Index: src/sys/uvm/uvm_km.c
===
RCS file: /home/chs/netbsd/cvs/src/sys/uvm/uvm_km.c,v
retrieving revision 1.160
diff -u -p -r1.160 uvm_km.c
--- src/sys/uvm/uvm_km.c13 Mar 2021 15:29:55 -  1.160
+++ src/sys/uvm/uvm_km.c26 Jul 2022 20:24:14 -
@@ -237,6 +237,8 @@ kmeminit_nkmempages(void)
 #ifndef NKMEMPAGES_MAX_UNLIMITED
if (npages > NKMEMPAGES_MAX)
npages = NKMEMPAGES_MAX;
+#else
+   npages = physmem;
 #endif
 
if (npages < NKMEMPAGES_MIN)


Re: "zfs send" freezes system (was: Re: pgdaemon high CPU consumption)

2022-07-19 Thread Brad Spencer
Matthias Petermann  writes:

[snip]

> Around one week after removing the patch, my system with ZFS is 
> behaving "normally" for the most part and the freezes have disappeared. 
> What is the recommended way forward given the 10 branch? If it is not 
> foreseeable that the underlying problem can be solved soon, would it also 
> be an option to revert the patch in the sources to get at least stable 
> behavior? Not just as an aside, I would also be interested in whether 
> this "zfs send" problem occurs in general, or whether certain hardware 
> configurations make it more likely.
>
> Kind regards
> Matthias


My personal experience with this problem is with Xen PV/PVH guests on my
build system(s), but I think others have experienced it with physical
systems.  The only particular facts that I have observed are: a) If
more memory is given to the guest, it lasts longer.  That is, the
one with 8GB does not have as much trouble as the one with 4GB.  b)
reducing the number of vnodes helps keep the system up.  I usually
run 4096 to 16384 kern.maxvnodes.  Without this the OS build system can
do about 1.5 "build.sh release" runs before it hangs and with a
reduction in vnodes it can get away with 4 to 6 before there are
problems.  c) Seen only once and on -currentish (most systems are 9.2),
but a zfs receive into a compressed fileset hung after a while (in the
usual manner that I observe) even with the mentioned patch reverted and a
reduced maxvnodes.  Receiving into a non-compressed fileset worked as
expected.

The above a, b and c were 100% reproducible for me; it just takes a while
since a "build.sh release" run isn't quick.  The zfs receive case happened
much faster.

Both of my build systems use zfs send through an ssh pipe to a file on
another system as a backup method and both have compressed filesets, but
they are rarely received into and never sent from those.  I have had no
trouble zfs sending and receiving locally from a non-compressed fileset
into a compressed one, which makes that one case a little strange.

On a -currentish test system with zfs filesets for object artifacts and
build artifacts, if I didn't revert the mentioned patch to arc.c the
system could not make it through a build of qemu from pkgsrc and maybe
not even through unpacking the source for building (sorry, I don't exactly
remember how far it would get).




-- 
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org


"zfs send" freezes system (was: Re: pgdaemon high CPU consumption)

2022-07-19 Thread Matthias Petermann

Hello,

On 13.07.22 12:30, Matthias Petermann wrote:

I can now confirm that reverting the patch also solved my problem. Of 
course I first fell into the trap, because I had not considered that the 
ZFS code is loaded as a module and had only changed the kernel. As a 
result, it looked at first as if this would not help. Finally it did...I 
am now glad that I can use zfs send again in this way. Previously this 
reproducibly led to a crash, which meant I could not make backups. This is 
critical for me and I would like to support tests regarding this.


In contrast to the PR, there are hardly any xcalls in my use case - 
however, my system only has 4 CPU cores, 2 of which are physical.



Many greetings
Matthias



Around one week after removing the patch, my system with ZFS is 
behaving "normally" for the most part and the freezes have disappeared. 
What is the recommended way forward given the 10 branch? If it is not 
foreseeable that the underlying problem can be solved soon, would it also 
be an option to revert the patch in the sources to get at least stable 
behavior? Not just as an aside, I would also be interested in whether 
this "zfs send" problem occurs in general, or whether certain hardware 
configurations make it more likely.


Kind regards
Matthias





Re: pgdaemon high CPU consumption

2022-07-13 Thread Matthias Petermann

Hello,

On 10.07.22 19:14, Matthias Petermann wrote:
thanks for this reference... it matches my observations pretty well. I 
made many attempts to tune maxvnodes over the last few days, but the 
pgdaemon issue remained. Ultimately I suspect it is also responsible for 
the reproducible system lock-ups during ZFS send.


I am about to revert the patch from the PR above on my system and re-try.

Kind regards
Matthias



I can now confirm that reverting the patch also solved my problem. Of 
course I first fell into the trap, because I had not considered that the 
ZFS code is loaded as a module and had only changed the kernel. As a 
result, it looked at first as if this would not help. Finally it did...I 
am now glad that I can use zfs send again in this way. Previously this 
reproducibly led to a crash, which meant I could not make backups. This is 
critical for me and I would like to support tests regarding this.


In contrast to the PR, there are hardly any xcalls in my use case - 
however, my system only has 4 CPU cores, 2 of which are physical.



Many greetings
Matthias





Re: pgdaemon high CPU consumption

2022-07-10 Thread Matthias Petermann

Hello Frank,

On 01.07.22 14:07, Frank Kardel wrote:

Hi Matthias !

See PR 55707 
http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=55707 , which 
I do not consider fixed due to the pgdaemon issue. Reverting arc.c to 
1.20 will give you many xcalls, but the system stays more usable.


Frank



thanks for this reference... it matches my observations pretty well. I 
made many attempts to tune maxvnodes over the last few days, but the 
pgdaemon issue remained. Ultimately I suspect it is also responsible for 
the reproducible system lock-ups during ZFS send.


I am about to revert the patch from the PR above on my system and re-try.

Kind regards
Matthias





Re: pgdaemon high CPU consumption

2022-07-03 Thread Brad Spencer
Matthias Petermann  writes:

> Hello,
>
> On 01.07.22 12:48, Brad Spencer wrote:
>> "J. Hannken-Illjes"  writes:
>> 
 On 1. Jul 2022, at 07:55, Matthias Petermann  wrote:

 Good day,

 For some time now I have noticed that on several of my systems with NetBSD/amd64 
 9.99.97/98 after longer usage the kernel process pgdaemon completely 
 claims a CPU core for itself, i.e. constantly consumes 100%.
 The affected systems do not have a shortage of RAM and the problem does 
 not disappear even if all workloads are stopped, and thus no RAM is 
 actually used by application processes.

 I noticed this especially in connection with accesses to the ZFS set up on 
 the respective machines - for example after checkout from the local CVS 
 relic hosted on ZFS.

 Is there already a known problem or what information would have to be 
 collected to get to the bottom of this?

 I currently have such a case online, so I would be happy to pull 
 diagnostic information this evening/afternoon. At the moment all info I 
 have is from top.

 Normal view:

 ```
   PID USERNAME PRI NICE   SIZE   RES STATE   TIME   WCPUCPU COMMAND
 0 root 1260 0K   34M CPU/0 102:45   100%   100% [system]
 ```

 Thread view:


 ```
   PID   LID USERNAME PRI STATE   TIME   WCPUCPU NAME  COMMAND
 0   173 root 126 CPU/1  96:57 98.93% 98.93% pgdaemon  [system]
 ```
>>>
>>> Looks a lot like kern/55707: ZFS seems to trigger a lot of xcalls
>>>
>>> Last action proposed was to back out the patch ...
>>>
>>> --
>>> J. Hannken-Illjes - hann...@mailbox.org
>> 
>> 
>> Probably only a slightly related data point, but Ya, if you have a
>> system / VM / Xen PV that does not have a whole lot of RAM and if you
>> don't back out that patch your system will become unusable in a very
>> short order if you do much at all with ZFS (tested with a recent
>> -current building pkgsrc packages on a Xen PVHVM).  The patch does fix a
>> real bug, as NetBSD doesn't have the define that it uses, but the effect
>> of running that code will be needed if you use ZFS at all on a "low" RAM
>> system.  I personally suspect that the ZFS ARC or some pool is allowed
>> to consume nearly all available "something" (pools, RAM, etc..) without
>> limit but have no specific proof (or there is a leak somewhere).  I
>> mostly run 9.x ZFS right now (which may have other problems), and have
>> been setting maxvnodes way down for some time.  If I don't do that the
>> Xen PV will hang itself up after a couple of 'build.sh release' runs
>> when the source and build artifacts are on ZFS filesets.
>
> Thanks for describing this use case. Apart from the fact that I don't 
> currently use Xen on the affected machine, it performs a similar 
> workload. I use it as a pbulk builder with distfiles, build artifacts and 
> a CVS / Git mirror stored on ZFS. The builders themselves are located in 
> chroot sandboxes on FFS. Anyway, I can trigger the observations by doing 
> a NetBSD src checkout from ZFS-backed CVS to the FFS partition.
>
> The maxvnodes trick at first made pgdaemon behave normally again, but the 
> system froze shortly after with no further evidence.
>
> I am not sure if this thread is the right one for pointing this out, but 
> I experienced further issues with NetBSD current and ZFS when I tried to 
> perform a recursive "zfs send" of a particular snapshot of my data sets. 
> It initially works, but then the system freezes after a couple of 
> seconds with no chance to recover (I could not even enter the kernel 
> debugger). I will prepare a dedicated test VM for my cases and come 
> back.
>
> Kind regards
> Matthias


I saw something like that with a "zfs send..." and "zfs receive..."
locking up just one time.  I do that sort of thing fairly often to move
filesets between one system and another and it has worked fine for me,
except in one case...  the destination was a NetBSD-current with a ZFS
fileset set to use compression.  The source is a FreeBSD with a ZFS
fileset created in such a manner that NetBSD is happy with it and it also
is set to use compression.  No amount of messing around would let 'zfs
send  | ssh destination "zfs receive "' complete without
locking up the destination.  When I changed the destination to not use
compression I was able to perform the zfs send / receive pipeline
without any problems.  The destination is a pretty recent -current Xen
PVHVM guest and the source is a FreeBSD 12.1 (running minio to back up
my Elasticsearch cluster).



-- 
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org



Re: pgdaemon high CPU consumption

2022-07-03 Thread Matthias Petermann

Hello,

On 01.07.22 12:48, Brad Spencer wrote:

"J. Hannken-Illjes"  writes:


On 1. Jul 2022, at 07:55, Matthias Petermann  wrote:

Good day,

For some time now I have noticed that on several of my systems with NetBSD/amd64 
9.99.97/98 after longer usage the kernel process pgdaemon completely claims a 
CPU core for itself, i.e. constantly consumes 100%.
The affected systems do not have a shortage of RAM and the problem does not 
disappear even if all workloads are stopped, and thus no RAM is actually used 
by application processes.

I noticed this especially in connection with accesses to the ZFS set up on the 
respective machines - for example after checkout from the local CVS relic 
hosted on ZFS.

Is there already a known problem or what information would have to be collected 
to get to the bottom of this?

I currently have such a case online, so I would be happy to pull diagnostic 
information this evening/afternoon. At the moment all info I have is from top.

Normal view:

```
  PID USERNAME PRI NICE   SIZE   RES STATE   TIME   WCPUCPU COMMAND
0 root 1260 0K   34M CPU/0 102:45   100%   100% [system]
```

Thread view:


```
  PID   LID USERNAME PRI STATE   TIME   WCPUCPU NAME  COMMAND
0   173 root 126 CPU/1  96:57 98.93% 98.93% pgdaemon  [system]
```


Looks a lot like kern/55707: ZFS seems to trigger a lot of xcalls

Last action proposed was to back out the patch ...

--
J. Hannken-Illjes - hann...@mailbox.org



Probably only a slightly related data point, but Ya, if you have a
system / VM / Xen PV that does not have a whole lot of RAM and if you
don't back out that patch your system will become unusable in a very
short order if you do much at all with ZFS (tested with a recent
-current building pkgsrc packages on a Xen PVHVM).  The patch does fix a
real bug, as NetBSD doesn't have the define that it uses, but the effect
of running that code will be needed if you use ZFS at all on a "low" RAM
system.  I personally suspect that the ZFS ARC or some pool is allowed
to consume nearly all available "something" (pools, RAM, etc..) without
limit but have no specific proof (or there is a leak somewhere).  I
mostly run 9.x ZFS right now (which may have other problems), and have
been setting maxvnodes way down for some time.  If I don't do that the
Xen PV will hang itself up after a couple of 'build.sh release' runs
when the source and build artifacts are on ZFS filesets.


Thanks for describing this use case. Apart from the fact that I don't 
currently use Xen on the affected machine, it performs a similar 
workload. I use it as a pbulk builder with distfiles, build artifacts and 
a CVS / Git mirror stored on ZFS. The builders themselves are located in 
chroot sandboxes on FFS. Anyway, I can trigger the observations by doing 
a NetBSD src checkout from ZFS-backed CVS to the FFS partition.


The maxvnodes trick at first made pgdaemon behave normally again, but the 
system froze shortly after with no further evidence.


I am not sure if this thread is the right one for pointing this out, but 
I experienced further issues with NetBSD current and ZFS when I tried to 
perform a recursive "zfs send" of a particular snapshot of my data sets. 
It initially works, but then the system freezes after a couple of 
seconds with no chance to recover (I could not even enter the kernel 
debugger). I will prepare a dedicated test VM for my cases and come 
back.


Kind regards
Matthias





Re: pgdaemon high CPU consumption

2022-07-01 Thread Frank Kardel

Hi Matthias !

See PR 55707 
http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=55707 , which 
I do not consider fixed due to the pgdaemon issue. Reverting arc.c to 
1.20 will give you many xcalls, but the system stays more usable.


Frank


On 07/01/22 07:55, Matthias Petermann wrote:

Good day,

For some time now I have noticed that on several of my systems with 
NetBSD/amd64 9.99.97/98 after longer usage the kernel process pgdaemon 
completely claims a CPU core for itself, i.e. constantly consumes 100%.
The affected systems do not have a shortage of RAM and the problem 
does not disappear even if all workloads are stopped, and thus no RAM 
is actually used by application processes.


I noticed this especially in connection with accesses to the ZFS set 
up on the respective machines - for example after checkout from the 
local CVS relic hosted on ZFS.


Is there already a known problem or what information would have to be 
collected to get to the bottom of this?


I currently have such a case online, so I would be happy to pull 
diagnostic information this evening/afternoon. At the moment all info 
I have is from top.


Normal view:

```
  PID USERNAME PRI NICE   SIZE   RES STATE   TIME   WCPU CPU COMMAND
0 root 1260 0K   34M CPU/0 102:45   100% 100% [system]
```

Thread view:


```
  PID   LID USERNAME PRI STATE   TIME   WCPUCPU NAME COMMAND
0   173 root 126 CPU/1  96:57 98.93% 98.93% pgdaemon [system]
```

Kind regards
Matthias





Re: pgdaemon high CPU consumption

2022-07-01 Thread Brad Spencer
"J. Hannken-Illjes"  writes:

>> On 1. Jul 2022, at 07:55, Matthias Petermann  wrote:
>> 
>> Good day,
>> 
>> For some time now I have noticed that on several of my systems with NetBSD/amd64 
>> 9.99.97/98 after longer usage the kernel process pgdaemon completely claims 
>> a CPU core for itself, i.e. constantly consumes 100%.
>> The affected systems do not have a shortage of RAM and the problem does not 
>> disappear even if all workloads are stopped, and thus no RAM is actually 
>> used by application processes.
>> 
>> I noticed this especially in connection with accesses to the ZFS set up on 
>> the respective machines - for example after checkout from the local CVS 
>> relic hosted on ZFS.
>> 
>> Is there already a known problem or what information would have to be 
>> collected to get to the bottom of this?
>> 
>> I currently have such a case online, so I would be happy to pull diagnostic 
>> information this evening/afternoon. At the moment all info I have is from 
>> top.
>> 
>> Normal view:
>> 
>> ```
>>  PID USERNAME PRI NICE   SIZE   RES STATE   TIME   WCPUCPU COMMAND
>>0 root 1260 0K   34M CPU/0 102:45   100%   100% [system]
>> ```
>> 
>> Thread view:
>> 
>> 
>> ```
>>  PID   LID USERNAME PRI STATE   TIME   WCPUCPU NAME  COMMAND
>>0   173 root 126 CPU/1  96:57 98.93% 98.93% pgdaemon  [system]
>> ```
>
> Looks a lot like kern/55707: ZFS seems to trigger a lot of xcalls
>
> Last action proposed was to back out the patch ...
>
> --
> J. Hannken-Illjes - hann...@mailbox.org


Probably only a slightly related data point, but Ya, if you have a
system / VM / Xen PV that does not have a whole lot of RAM and if you
don't back out that patch your system will become unusable in a very
short order if you do much at all with ZFS (tested with a recent
-current building pkgsrc packages on a Xen PVHVM).  The patch does fix a
real bug, as NetBSD doesn't have the define that it uses, but the effect
of running that code will be needed if you use ZFS at all on a "low" RAM
system.  I personally suspect that the ZFS ARC or some pool is allowed
to consume nearly all available "something" (pools, RAM, etc..) without
limit but have no specific proof (or there is a leak somewhere).  I
mostly run 9.x ZFS right now (which may have other problems), and have
been setting maxvnodes way down for some time.  If I don't do that the
Xen PV will hang itself up after a couple of 'build.sh release' runs
when the source and build artifacts are on ZFS filesets.



-- 
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org


Re: pgdaemon high CPU consumption

2022-07-01 Thread J. Hannken-Illjes
> On 1. Jul 2022, at 07:55, Matthias Petermann  wrote:
> 
> Good day,
> 
> For some time now I have noticed that on several of my systems with NetBSD/amd64 
> 9.99.97/98 after longer usage the kernel process pgdaemon completely claims a 
> CPU core for itself, i.e. constantly consumes 100%.
> The affected systems do not have a shortage of RAM and the problem does not 
> disappear even if all workloads are stopped, and thus no RAM is actually used 
> by application processes.
> 
> I noticed this especially in connection with accesses to the ZFS set up on 
> the respective machines - for example after checkout from the local CVS relic 
> hosted on ZFS.
> 
> Is there already a known problem or what information would have to be 
> collected to get to the bottom of this?
> 
> I currently have such a case online, so I would be happy to pull diagnostic 
> information this evening/afternoon. At the moment all info I have is from top.
> 
> Normal view:
> 
> ```
>  PID USERNAME PRI NICE   SIZE   RES STATE   TIME   WCPUCPU COMMAND
>0 root 1260 0K   34M CPU/0 102:45   100%   100% [system]
> ```
> 
> Thread view:
> 
> 
> ```
>  PID   LID USERNAME PRI STATE   TIME   WCPUCPU NAME  COMMAND
>0   173 root 126 CPU/1  96:57 98.93% 98.93% pgdaemon  [system]
> ```

Looks a lot like kern/55707: ZFS seems to trigger a lot of xcalls

Last action proposed was to back out the patch ...

--
J. Hannken-Illjes - hann...@mailbox.org




Re: pgdaemon high CPU consumption

2022-07-01 Thread Michael van Elst
m...@petermann-it.de (Matthias Petermann) writes:

>For some time now I have noticed that on several of my systems with
>NetBSD/amd64 9.99.97/98 after longer usage the kernel process pgdaemon
>completely claims a CPU core for itself, i.e. constantly consumes 100%.
>The affected systems do not have a shortage of RAM and the problem does
>not disappear even if all workloads are stopped, and thus no RAM is
>actually used by application processes.

There is a shortage, either free RAM pages or kernel address space.

The page daemon gets triggered, but if it cannot resolve the situation
it will just spin until it succeeds.


>I noticed this especially in connection with accesses to the ZFS set up
>on the respective machines - for example after checkout from the local
>CVS relic hosted on ZFS.

Resource exhaustion could be caused by ZFS, but also by something else.

If you can still operate the system, a common workaround is to
reduce kern.maxvnodes with sysctl (and bump it up later).
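
As a concrete illustration, a small program along these lines could apply
that workaround via sysctlbyname(3); the target value of 16384 is just one
of the values mentioned earlier in the thread, not a recommendation:

```
/* sketch: read kern.maxvnodes and lower it (setting it requires root) */
#include <sys/param.h>
#include <sys/sysctl.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

int
main(void)
{
	int cur, want = 16384;		/* example value only */
	size_t len = sizeof(cur);

	if (sysctlbyname("kern.maxvnodes", &cur, &len, NULL, 0) == -1) {
		fprintf(stderr, "read kern.maxvnodes: %s\n", strerror(errno));
		return 1;
	}
	printf("kern.maxvnodes is currently %d\n", cur);

	/* equivalent to: sysctl -w kern.maxvnodes=16384 */
	if (sysctlbyname("kern.maxvnodes", NULL, NULL, &want, sizeof(want)) == -1) {
		fprintf(stderr, "set kern.maxvnodes: %s\n", strerror(errno));
		return 1;
	}
	printf("kern.maxvnodes lowered to %d\n", want);
	return 0;
}
```

In practice the sysctl(8) one-liner does the same thing; the point is only
that the value can be lowered while the system is still responsive and
raised again once the pressure subsides.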

If the system is not responding but you can enter DDB, setting
the kernel variable desiredvnodes does the same.