Hello all!

Thanks for the many hints about what to look for. We did some tuning and further debugging; here are the outcomes, answering all questions in a single email.


In the meantime, you could experiment with setting checkpoint_flush_after to 0
We did this:
# SHOW checkpoint_flush_after;
 checkpoint_flush_after
------------------------
 0
(1 row)

But we STILL get PANICs. I tried to understand the code but failed. My guess is that some code paths call pg_flush_data() without checking this setting, or that the check does not work.



Did this start after upgrading to 22.04? Or after a certain kernel upgrade?

It definitely only started with Ubuntu 22.04. We did not have, and still do not have, any issues on servers running Ubuntu 20.04 or 18.04.


I would believe that the kernel would raise
a bunch of printks if it hit ENOMEM in the commonly used paths, so
you would see something in dmesg or wherever you collect your kernel
log if it happened where it was expected.

There is nothing in the kernel logs (dmesg).


Do you use cgroups or such to limit memory usage of postgres?

No


Any uncommon options on the filesystem or the mount point?
No. Also no antivirus. The mount entries are:
/dev/xvda2 / ext4 noatime,nodiratime,errors=remount-ro 0 1
or
LABEL=cloudimg-rootfs / ext4 discard,errors=remount-ro 0 1


does this happen on all the hosts, or is it limited to one host or one technology?

It happens on XEN VMs, KVM VMs, and VMware VMs, on both Intel and AMD platforms.


Another interesting thing would be to know the mount and file system options
for the FS that triggers the failures. E.g.

# tune2fs -l /dev/sda1
tune2fs 1.46.5 (30-Dec-2021)
Filesystem volume name:   cloudimg-rootfs
Last mounted on:          /
Filesystem UUID:          0522e6b3-8d40-4754-a87e-5678a6921e37
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg encrypt sparse_super large_file huge_file dir_nlink extra_isize metadata_csum
Filesystem flags:         signed_directory_hash
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              12902400
Block count:              26185979
Reserved block count:     0
Overhead clusters:        35096
Free blocks:              18451033
Free inodes:              12789946
First block:              0
Block size:               4096
Fragment size:            4096
Group descriptor size:    64
Reserved GDT blocks:      243
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         16128
Inode blocks per group:   1008
Flex block group size:    16
Filesystem created:       Wed Apr 20 18:31:24 2022
Last mount time:          Thu Nov 10 09:49:34 2022
Last write time:          Thu Nov 10 09:49:34 2022
Mount count:              7
Maximum mount count:      -1
Last checked:             Wed Apr 20 18:31:24 2022
Check interval:           0 (<none>)
Lifetime writes:          252 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     32
Desired extra isize:      32
Journal inode:            8
First orphan inode:       42571
Default directory hash:   half_md4
Directory Hash Seed:      c5ef129b-fbee-4f35-8f28-ad7cc93c1c43
Journal backup:           inode blocks
Checksum type:            crc32c
Checksum:                 0xb74ebbc3


Thanks
Klaus


