Re: Banana Pi-R1 - kernel 5.6.0 and later broken - b53 DSA
Hello Florian,

On 20.06.2020 21:13, Florian Fainelli wrote:
Hi,
On 6/20/2020 10:39 AM, Gerhard Wiesinger wrote:

Can you share your network configuration again with me?

Find the network config below.

# OK: Last known good version (booting that version is also OK)
Linux bpi 5.5.18-200.fc31.armv7hl #1 SMP Fri Apr 17 17:25:00 UTC 2020 armv7l armv7l armv7l GNU/Linux

OK, I suspect what has changed is this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8fab459e69abfd04a66d76423d18ba853fced4ab
which, if I remember your configuration correctly, means that you now have proper DSA interfaces, so all of the wan and lan1-lan4 interfaces can now run a proper networking stack, unlike before, where you had to do this via the DSA master network device (eth0). This also means that you should now run your DHCP server/client on the bridge master network device.

Yes, the config should be a proper DSA config, see also below. The IP config is only on brlan and brwan.

Do the changes need a config change? E.g.:
net: dsa: b53: Ensure the default VID is untagged
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/dsa/b53?id=d965a5432d4c3e6b9c3d2bc1d4a800013bbf76f6

Thnx.

Ciao,
Gerhard

Config:

brctl show
bridge name	bridge id	STP enabled	interfaces
brlan		8000.		no		lan.101
brlansw		8000.		no		lan1
						lan2
						lan3
						lan4
brwan		8000.		no		wan.102
brwansw		8000.		no		wan

= /etc/systemd/network/30-autogen-eth0.network
[Match]
Name=eth0
[Network]
VLAN=lan.101
VLAN=wan.102

= /etc/systemd/network/40-autogen-lan.101.netdev
[NetDev]
Name=lan.101
Kind=vlan
[VLAN]
Id=101

= /etc/systemd/network/40-autogen-wan.102.netdev
[NetDev]
Name=wan.102
Kind=vlan
[VLAN]
Id=102

= /etc/systemd/network/50-autogen-brlan.netdev
[NetDev]
Name=brlan
Kind=bridge
[Bridge]
DefaultPVID=none
VLANFiltering=false
STP=false

= /etc/systemd/network/50-autogen-brlansw.netdev
[NetDev]
Name=brlansw
Kind=bridge
[Bridge]
DefaultPVID=none
VLANFiltering=true
STP=false

= /etc/systemd/network/50-autogen-brwan.netdev
[NetDev]
Name=brwan
Kind=bridge
[Bridge]
DefaultPVID=none
VLANFiltering=false
STP=false

= /etc/systemd/network/50-autogen-brwansw.netdev
[NetDev]
Name=brwansw
Kind=bridge
[Bridge]
DefaultPVID=none
VLANFiltering=true
STP=false

= /etc/systemd/network/60-autogen-brlan-lan.101.network
[Match]
Name=lan.101
[Network]
Bridge=brlan

= /etc/systemd/network/60-autogen-brlansw-lan1.network
[Match]
Name=lan1
[Network]
Bridge=brlansw
[BridgeVLAN]
VLAN=101
EgressUntagged=101
PVID=101

= /etc/systemd/network/60-autogen-brlansw-lan2.network
[Match]
Name=lan2
[Network]
Bridge=brlansw
[BridgeVLAN]
VLAN=101
EgressUntagged=101
PVID=101

= /etc/systemd/network/60-autogen-brlansw-lan3.network
[Match]
Name=lan3
[Network]
Bridge=brlansw
[BridgeVLAN]
VLAN=101
EgressUntagged=101
PVID=101
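With VLANFiltering=true on brlansw, systemd-networkd programs the per-port VLAN membership into the bridge (and, via the b53 driver, into the switch hardware). A minimal sketch for cross-checking that from the shell, assuming iproute2's `bridge` tool is installed; the port names and VID 101 are taken from the config above, and the helper name is made up:

```shell
#!/bin/sh
# port_has_vid PORT VID -> report whether PORT is a member of VID
# (hypothetical helper; `bridge vlan show` is the underlying iproute2 command)
port_has_vid() {
    if bridge vlan show dev "$1" 2>/dev/null | grep -qw "$2"; then
        echo "$1: VID $2 present"
    else
        echo "$1: VID $2 missing (or port does not exist here)"
    fi
}

# check every switch port that should carry the LAN VLAN
for port in lan1 lan2 lan3 lan4; do
    port_has_vid "$port" 101
done
```

On a correctly programmed switch every lan port should report the VID as present; a "missing" line after a kernel update would point at exactly the kind of silent hardware/config mismatch described in this thread.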
Banana Pi-R1 - kernel 5.6.0 and later broken - b53 DSA
Hello,

I'm having trouble with the Banana Pi-R1 router with newer kernels. There are no config changes; the config has worked well across a lot of kernel updates. The Banana Pi-R1 is configured via systemd-networkd and uses DSA (Distributed Switch Architecture) with the b53 switch. There is no visible difference in the interfaces, VLAN config, bridge config, etc. It looks like the actual configuration of the switch in the hardware is broken.

# OK: Last known good version (booting that version is also OK)
Linux bpi 5.5.18-200.fc31.armv7hl #1 SMP Fri Apr 17 17:25:00 UTC 2020 armv7l armv7l armv7l GNU/Linux
# NOK: no network
Linux bpi 5.6.8-200.fc31.armv7hl #1 SMP Wed Apr 29 19:05:06 UTC 2020 armv7l armv7l armv7l GNU/Linux
# NOK: no network
Linux bpi 5.6.0-300.fc32.armv7hl #1 SMP Mon Mar 30 16:37:50 UTC 2020 armv7l armv7l armv7l GNU/Linux
# NOK: no network
Linux bpi 5.6.19-200.fc31.armv7hl #1 SMP Wed Jun 17 17:10:22 UTC 2020 armv7l armv7l armv7l GNU/Linux
# NOK: no network
Linux bpi 5.7.4-200.fc32.armv7hl #1 SMP Fri Jun 19 00:52:22 UTC 2020 armv7l armv7l armv7l GNU/Linux

I saw that there were a lot of changes in the recent past in the b53 driver:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/drivers/net/dsa/b53?h=v5.8-rc1+

Any ideas? Thnx.

Ciao,
Gerhard
Re: platform/x86/pcengines-apuv2: Missing apu4
On 29.07.2019 10:35, Enrico Weigelt, metux IT consult wrote:
On 26.07.19 16:56, Gerhard Wiesinger wrote:

Hello, I saw that the apu4 board is completely missing (also on 5.3-rc1). Can you please add it? It should be very easy, see below.

Still in the pipeline - I don't have an apu4 board for testing yet.

The delta to e.g. the apu3 can be found in the repo, see below (https://github.com/pcengines/coreboot).

dmidecode | grep -iE 'engines|apu'
	Manufacturer: PC Engines
	Product Name: apu4
	Manufacturer: PC Engines
	Product Name: apu4
	Manufacturer: PC Engines

So the risk of the patch is minimal. I can test it once the patch is integrated.

Ciao,
Gerhard

--- pcengines_apu3.config	Fri Jul 26 11:33:41 2019
+++ pcengines_apu4.config	Fri Jul 26 11:33:41 2019
@@ -30,14 +30,14 @@
 #
 CONFIG_VENDOR_PCENGINES=y
 # CONFIG_BOARD_PCENGINES_APU2 is not set
-CONFIG_BOARD_PCENGINES_APU3=y
-# CONFIG_BOARD_PCENGINES_APU4 is not set
+# CONFIG_BOARD_PCENGINES_APU3 is not set
+CONFIG_BOARD_PCENGINES_APU4=y
 # CONFIG_BOARD_PCENGINES_APU5 is not set
 CONFIG_BOARD_SPECIFIC_OPTIONS=y
-CONFIG_VARIANT_DIR="apu3"
+CONFIG_VARIANT_DIR="apu4"
 CONFIG_DEVICETREE="variants/$(CONFIG_VARIANT_DIR)/devicetree.cb"
 CONFIG_MAINBOARD_DIR="pcengines/apu2"
-CONFIG_MAINBOARD_PART_NUMBER="apu3"
+CONFIG_MAINBOARD_PART_NUMBER="apu4"
 # CONFIG_SVI2_SLOW_SPEED is not set
 CONFIG_SVI_WAIT_COMP_DIS=y
 CONFIG_HW_MEM_HOLE_SIZEK=0x20
@@ -397,7 +397,7 @@
 CONFIG_MAINBOARD_SERIAL_NUMBER="123456789"
 CONFIG_MAINBOARD_VERSION="1.0"
 CONFIG_MAINBOARD_SMBIOS_MANUFACTURER="PC Engines"
-CONFIG_MAINBOARD_SMBIOS_PRODUCT_NAME="apu3"
+CONFIG_MAINBOARD_SMBIOS_PRODUCT_NAME="apu4"
 #
 # Payload
platform/x86/pcengines-apuv2: Missing apu4
Hello,

I saw that the apu4 board is completely missing (also on 5.3-rc1). Can you please add it? It should be very easy, see below.

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/platform/x86/pcengines-apuv2.c?h=v5.1.20
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/platform/x86/pcengines-apuv2.c?h=v5.3-rc1

For further reference:
https://www.pcengines.ch/apu2.htm
https://www.pcengines.ch/apu4c2.htm
https://www.pcengines.ch/apu4c4.htm

Please backport it also to 5.1.x and 5.2.x. Thnx.

Ciao,
Gerhard

/* APU4 w/ legacy bios < 4.0.8 */
{
	.ident = "apu4",
	.matches = {
		DMI_MATCH(DMI_SYS_VENDOR, "PC Engines"),
		DMI_MATCH(DMI_BOARD_NAME, "APU4")
	},
	.driver_data = (void *)_apu2,
},
/* APU4 w/ legacy bios >= 4.0.8 */
{
	.ident = "apu4",
	.matches = {
		DMI_MATCH(DMI_SYS_VENDOR, "PC Engines"),
		DMI_MATCH(DMI_BOARD_NAME, "apu4")
	},
	.driver_data = (void *)_apu2,
},
/* APU4 w/ mainline bios */
{
	.ident = "apu4",
	.matches = {
		DMI_MATCH(DMI_SYS_VENDOR, "PC Engines"),
		DMI_MATCH(DMI_BOARD_NAME, "PC Engines apu4")
	},
	.driver_data = (void *)_apu2,
},

MODULE_DESCRIPTION("PC Engines APUv2/APUv3/APUv4 board GPIO/LED/keys driver");
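The three `DMI_MATCH` variants above correspond to what the firmware exposes under /sys/class/dmi/id, which is also where the values quoted from `dmidecode` come from. A small sketch for checking those strings on a target box before extending the match table; the field names are the standard sysfs ones, while the helper name and fallback text are made up:

```shell
#!/bin/sh
# Print the DMI fields the pcengines-apuv2 driver matches against.
dmi_field() {
    p=/sys/class/dmi/id/$1
    if [ -r "$p" ]; then
        cat "$p"
    else
        echo "(unavailable)"    # e.g. non-DMI platform or sysfs not mounted
    fi
}

for f in sys_vendor board_name product_name bios_version; do
    printf '%-13s %s\n' "$f:" "$(dmi_field "$f")"
done
```

On an apu4 this would be expected to show "PC Engines" as sys_vendor and one of the three board-name spellings the patch matches, depending on the BIOS generation.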
Re: Banana Pi-R1 stability
On 07.03.2019 16:31, Maxime Ripard wrote:
On Wed, Mar 06, 2019 at 09:03:00PM +0100, Gerhard Wiesinger wrote:

while true; do echo ""; echo -n "CPU_FREQ0: "; cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq; echo -n "CPU_FREQ1: "; cat /sys/devices/system/cpu/cpu1/cpufreq/cpuinfo_cur_freq; sleep 1; done& ./stress/cpuburn-a7

Run cpufreq-ljt-stress-test

On ALL Banana Pi R1 boards I have, and on the Banana Pro: OK, see below.

Ciao,
Gerhard

./cpufreq-ljt-stress-test
The cjpeg and djpeg tools from libjpeg-turbo are not found.
Trying to download and compile them.
Downloading libjpeg-turbo-1.3.1.tar.gz ... done
Extracting libjpeg-turbo-1.3.1.tar.gz ... done
Compiling libjpeg-turbo, please be patient ... done
Creating './whitenoise-1920x1080.jpg' ... done

CPU stress test, which is doing JPEG decoding by libjpeg-turbo at different cpufreq operating points.

Testing CPU 0
 960 MHz OK
 912 MHz OK
 864 MHz OK
 720 MHz OK
 528 MHz OK
 312 MHz OK
 144 MHz OK

Testing CPU 1
 960 MHz OK
 912 MHz OK
 864 MHz OK
 720 MHz OK
 528 MHz OK
 312 MHz OK
 144 MHz OK

Overall result : PASSED
Re: Banana Pi-R1 stability
On 06.03.2019 08:36, Maxime Ripard wrote:

Yes, there might be at least 2 scenarios:
1.) Frequency switching itself is the problem

But that code is also the one being used by the BananaPro, which you reported as stable.

Yes, the BananaPro is stable (with exactly the same configuration as far as I know).

2.) Lower frequency/voltage operating points are not stable.

For both scenarios: it might be possible that the crash happens on an idle CPU, under high CPU load, or just randomly. Therefore just "waiting" might be better than 100% CPU utilization. But I will also test 100% CPU. Therefore it would be good to see where the voltages for the different frequencies of the SoC are defined (to compare).

In the device tree.

I'm currently testing 2 different settings on the 2 new Banana Pi R1 with the newest kernel (see below), so 2 static frequencies:
# Set to specific frequency 144000 (currently testing on Banana Pi R1 #1)
# Set to specific frequency 312000 (currently testing on Banana Pi R1 #2)
If that's fine I'll also test further frequencies (with different loads).

Look, you can come up with whatever program you want for this, but the reason I insist on running that cpustress program (for the 4th time now) is that it's actually good at it and has caught all the cpufreq issues we've seen so far.

As I wrote, I ran several stress tests, also with the program you mention. But each test combination requires a minimum testing time to get verifiable results. The combinations are:
- idle CPU vs. 100% CPU
- ondemand governor vs. several fixed frequencies.
So far stable testing conditions for idle CPU and 100% CPU with the command line below and the cpuburn-a7 program:
# Set to max performance (stable) => frequency 960000
# Set to specific frequency 144000 (stable)
# Set to specific frequency 312000 (stable)

TODO list to test with "idle" CPU and 100% CPU:
# Set to specific frequency 528000 (to be tested next)
# Set to specific frequency 720000 (to be tested next)
# Set to specific frequency 864000
# Set to specific frequency 912000
# Set to ondemand

My guess is (but it is just a guess which has to be verified):
- stable at all fixed frequencies under both idle-CPU and 100%-CPU conditions, as well as with ondemand and 100% CPU
- not stable with ondemand and an "idle" CPU (so real frequency switching will happen often)

Feel free to not trust me on this, but I'm not sure how the discussion can continue if you do.

You missed my point from my previous mail: "But will test also 100% CPU.". See the command line below.

Ciao,
Gerhard

Test script:
while true; do echo ""; echo -n "CPU_FREQ0: "; cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq; echo -n "CPU_FREQ1: "; cat /sys/devices/system/cpu/cpu1/cpufreq/cpuinfo_cur_freq; sleep 1; done& ./stress/cpuburn-a7
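The "set to specific frequency" steps above boil down to switching to the userspace governor and writing scaling_setspeed. A hedged sketch of how that could be scripted; the helper names are made up, writing these sysfs files needs root, and the operating-point list is the one reported by this board:

```shell
#!/bin/sh
# nearest_freq TARGET FREQ...  -> echo the supported frequency closest to TARGET
nearest_freq() {
    target=$1; shift
    best=''; bestd=''
    for f in "$@"; do
        d=$(( f > target ? f - target : target - f ))
        if [ -z "$bestd" ] || [ "$d" -lt "$bestd" ]; then
            best=$f; bestd=$d
        fi
    done
    echo "$best"
}

# set_fixed FREQ -> pin every core to FREQ via the userspace governor
set_fixed() {
    for c in /sys/devices/system/cpu/cpu[0-9]*/cpufreq; do
        if [ -w "$c/scaling_governor" ]; then
            echo userspace > "$c/scaling_governor"
            echo "$1" > "$c/scaling_setspeed"
        else
            echo "skipping $c (not writable; need root / cpufreq support)"
        fi
    done
}

# example: the operating point nearest 500 MHz on this board's OPP list
nearest_freq 500000 144000 312000 528000 720000 864000 912000 960000
# set_fixed 528000   # uncomment on the target board, as root
```

Pinning via the userspace governor keeps the voltage/frequency pair constant, which is exactly what isolates scenario 2 (an unstable operating point) from scenario 1 (the switching itself).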
Re: Banana Pi-R1 stability
On 05.03.2019 10:28, Maxime Ripard wrote:
On Sat, Mar 02, 2019 at 09:42:08AM +0100, Gerhard Wiesinger wrote:
On 01.03.2019 10:30, Maxime Ripard wrote:
On Thu, Feb 28, 2019 at 08:41:53PM +0100, Gerhard Wiesinger wrote:
On 28.02.2019 10:35, Maxime Ripard wrote:
On Wed, Feb 27, 2019 at 07:58:14PM +0100, Gerhard Wiesinger wrote:
On 27.02.2019 10:20, Maxime Ripard wrote:
On Sun, Feb 24, 2019 at 09:04:57AM +0100, Gerhard Wiesinger wrote:

Hello, I've 3 Banana Pi R1, one running with a self-compiled kernel 4.7.4-200.BPiR1.fc24.armv7hl and old Fedora 25, which is VERY STABLE; the 2 others are running with Fedora 29 latest, kernel 4.20.10-200.fc29.armv7hl. I tried a lot of kernels between around 4.11 (kernel-4.11.10-200.fc25.armv7hl) and 4.20.10, but all had crashes without any output on the serial console, or kernel panics, after a short period of time (minutes, hours, max. days).

Latest known working and stable self-compiled kernel: kernel 4.7.4-200.BPiR1.fc24.armv7hl: https://www.wiesinger.com/opensource/fedora/kernel/BananaPi-R1/

With 4.8.x the DSA b53 switch infrastructure was introduced, which didn't work (until ca8931948344c485569b04821d1f6bcebccd376b and kernel 4.18.x):
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/net/dsa/b53?h=v4.20.12
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/drivers/net/dsa/b53?h=v4.20.12
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/drivers/net/dsa/b53?h=v4.20.12=ca8931948344c485569b04821d1f6bcebccd376b

It has been fixed with kernel 4.18.x:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/drivers/net/dsa/b53?h=linux-4.18.y

So the current status is that the kernel crashes regularly, see some samples below. It is typically an "Unable to handle kernel paging request at virtual address".

Another interesting thing: a Banana Pro works well (which also has an Allwinner A20 of the same revision) running the same Fedora 29 and latest kernels (e.g.
kernel 4.20.10-200.fc29.armv7hl). Since it happens on 2 different devices and with different power supplies (all with enough power), and the same type also works well on the working old kernel, a hardware issue is very unlikely. I guess it has something to do with virtual memory. Any ideas?

[47322.960193] Unable to handle kernel paging request at virtual address 085675d0

That line is a bit suspicious. Anyway, cpufreq is known to cause those kinds of errors when the voltage / frequency association is not correct. Given the stack trace, and that the BananaPro doesn't have cpufreq enabled, my first guess would be that that's what's happening. Could you try using the performance governor and see if it's more stable? If it is, then using this: https://github.com/ssvb/cpuburn-arm/blob/master/cpufreq-ljt-stress-test will help you find the offending voltage-frequency couple.

For me it looks like they all have the same config regarding the cpu governor (Banana Pro, the old-kernel stable one, the new-kernel unstable ones).

The Banana Pro doesn't have a regulator set up, so it will only change the frequency, not the voltage.

They all have the ondemand governor set: I set on the 2 unstable "new kernel Banana Pi R1":
# Set to max performance
echo "performance" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor

What are the results?

Stable since more than around 1.5 days. Normally they would have crashed before such a long uptime. So it looks like the performance governor fixes it. I guess the crashes occur because of changing CPU voltage and clock changes and invalid data (e.g. invalid RAM contents might also be read, register problems, etc). Any ideas how to fix it for ondemand mode, too?

Run https://github.com/ssvb/cpuburn-arm/blob/master/cpufreq-ljt-stress-test

But it doesn't explain why it works with kernel 4.7.4 without any problems.
My best guess would be that cpufreq wasn't enabled at that time, or ran without voltage scaling.

Where can I see the voltage scaling parameters? In the DTS I don't see any difference between kernel 4.7.4 and 4.20.10 regarding voltage:
dtc -I dtb -O dts -o /boot/dtb-4.20.10-200.fc29.armv7hl/sun7i-a20-lamobo-r1.dts /boot/dtb-4.20.10-200.fc29.armv7hl/sun7i-a20-lamobo-r1.dtb

This can also be due to the configuration being changed, driver support, etc. Where will the voltages for scaling then be set in detail (drivers, etc.)?

There is another strange thing (tested with kernel-5.0.0-0.rc8.git1.1.fc31.armv7hl, kernel-4.19.8-300.fc29.armv7hl, kernel-4.20.13-200.fc29.armv7hl, kernel-4.20.10-200.fc29.armv7hl): there is ALWAYS high CPU usage of around 10% in a kworker:

PID   USER  PR  NI  VIRT  RES  SHR  S  %CPU  %MEM  TIME+    COMMAND
18722 root  20  0   0     0    0    I  9.5   0.0   0:47.52  [kworker/1:3-events_freezable_power_]
PID   USER  PR  NI  VIRT  RES  SHR  S  %CPU  %MEM
Re: Banana Pi-R1 stability
On 01.03.2019 10:30, Maxime Ripard wrote:
On Thu, Feb 28, 2019 at 08:41:53PM +0100, Gerhard Wiesinger wrote:
On 28.02.2019 10:35, Maxime Ripard wrote:
On Wed, Feb 27, 2019 at 07:58:14PM +0100, Gerhard Wiesinger wrote:
On 27.02.2019 10:20, Maxime Ripard wrote:
On Sun, Feb 24, 2019 at 09:04:57AM +0100, Gerhard Wiesinger wrote:

Hello, I've 3 Banana Pi R1, one running with a self-compiled kernel 4.7.4-200.BPiR1.fc24.armv7hl and old Fedora 25, which is VERY STABLE; the 2 others are running with Fedora 29 latest, kernel 4.20.10-200.fc29.armv7hl. I tried a lot of kernels between around 4.11 (kernel-4.11.10-200.fc25.armv7hl) and 4.20.10, but all had crashes without any output on the serial console, or kernel panics, after a short period of time (minutes, hours, max. days).

Latest known working and stable self-compiled kernel: kernel 4.7.4-200.BPiR1.fc24.armv7hl: https://www.wiesinger.com/opensource/fedora/kernel/BananaPi-R1/

With 4.8.x the DSA b53 switch infrastructure was introduced, which didn't work (until ca8931948344c485569b04821d1f6bcebccd376b and kernel 4.18.x):
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/net/dsa/b53?h=v4.20.12
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/drivers/net/dsa/b53?h=v4.20.12
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/drivers/net/dsa/b53?h=v4.20.12=ca8931948344c485569b04821d1f6bcebccd376b

It has been fixed with kernel 4.18.x:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/drivers/net/dsa/b53?h=linux-4.18.y

So the current status is that the kernel crashes regularly, see some samples below. It is typically an "Unable to handle kernel paging request at virtual address".

Another interesting thing: a Banana Pro works well (which also has an Allwinner A20 of the same revision) running the same Fedora 29 and latest kernels (e.g. kernel 4.20.10-200.fc29.armv7hl).
Since it happens on 2 different devices and with different power supplies (all with enough power), and the same type also works well on the working old kernel, a hardware issue is very unlikely. I guess it has something to do with virtual memory. Any ideas?

[47322.960193] Unable to handle kernel paging request at virtual address 085675d0

That line is a bit suspicious. Anyway, cpufreq is known to cause those kinds of errors when the voltage / frequency association is not correct. Given the stack trace, and that the BananaPro doesn't have cpufreq enabled, my first guess would be that that's what's happening. Could you try using the performance governor and see if it's more stable? If it is, then using this: https://github.com/ssvb/cpuburn-arm/blob/master/cpufreq-ljt-stress-test will help you find the offending voltage-frequency couple.

For me it looks like they all have the same config regarding the cpu governor (Banana Pro, the old-kernel stable one, the new-kernel unstable ones).

The Banana Pro doesn't have a regulator set up, so it will only change the frequency, not the voltage.

They all have the ondemand governor set: I set on the 2 unstable "new kernel Banana Pi R1":
# Set to max performance
echo "performance" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor

What are the results?

Stable since more than around 1.5 days. Normally they would have crashed before such a long uptime. So it looks like the performance governor fixes it. I guess the crashes occur because of changing CPU voltage and clock changes and invalid data (e.g. invalid RAM contents might also be read, register problems, etc). Any ideas how to fix it for ondemand mode, too?

Run https://github.com/ssvb/cpuburn-arm/blob/master/cpufreq-ljt-stress-test

But it doesn't explain why it works with kernel 4.7.4 without any problems.

My best guess would be that cpufreq wasn't enabled at that time, or ran without voltage scaling.
Where can I see the voltage scaling parameters? In the DTS I don't see any difference between kernel 4.7.4 and 4.20.10 regarding voltage:
dtc -I dtb -O dts -o /boot/dtb-4.20.10-200.fc29.armv7hl/sun7i-a20-lamobo-r1.dts /boot/dtb-4.20.10-200.fc29.armv7hl/sun7i-a20-lamobo-r1.dtb

There is another strange thing (tested with kernel-5.0.0-0.rc8.git1.1.fc31.armv7hl, kernel-4.19.8-300.fc29.armv7hl, kernel-4.20.13-200.fc29.armv7hl, kernel-4.20.10-200.fc29.armv7hl): there is ALWAYS high CPU usage of around 10% in a kworker:

PID   USER  PR  NI  VIRT  RES  SHR  S  %CPU  %MEM  TIME+    COMMAND
18722 root  20  0   0     0    0    I  9.5   0.0   0:47.52  [kworker/1:3-events_freezable_power_]
PID   USER  PR  NI  VIRT  RES  SHR  S  %CPU  %MEM  TIME+    COMMAND
776   root  20  0   0     0    0    I  8.6   0.0   0:02.77  [kworker/0:4-events]

Therefore the CPU doesn't switch to low frequencies (see below). Any ideas?

BTW: Still stable at about 2.5 days on both devices. So solution
Re: Banana Pi-R1 stability
On 28.02.2019 10:35, Maxime Ripard wrote:
On Wed, Feb 27, 2019 at 07:58:14PM +0100, Gerhard Wiesinger wrote:
On 27.02.2019 10:20, Maxime Ripard wrote:
On Sun, Feb 24, 2019 at 09:04:57AM +0100, Gerhard Wiesinger wrote:

Hello, I've 3 Banana Pi R1, one running with a self-compiled kernel 4.7.4-200.BPiR1.fc24.armv7hl and old Fedora 25, which is VERY STABLE; the 2 others are running with Fedora 29 latest, kernel 4.20.10-200.fc29.armv7hl. I tried a lot of kernels between around 4.11 (kernel-4.11.10-200.fc25.armv7hl) and 4.20.10, but all had crashes without any output on the serial console, or kernel panics, after a short period of time (minutes, hours, max. days).

Latest known working and stable self-compiled kernel: kernel 4.7.4-200.BPiR1.fc24.armv7hl: https://www.wiesinger.com/opensource/fedora/kernel/BananaPi-R1/

With 4.8.x the DSA b53 switch infrastructure was introduced, which didn't work (until ca8931948344c485569b04821d1f6bcebccd376b and kernel 4.18.x):
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/net/dsa/b53?h=v4.20.12
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/drivers/net/dsa/b53?h=v4.20.12
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/drivers/net/dsa/b53?h=v4.20.12=ca8931948344c485569b04821d1f6bcebccd376b

It has been fixed with kernel 4.18.x:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/drivers/net/dsa/b53?h=linux-4.18.y

So the current status is that the kernel crashes regularly, see some samples below. It is typically an "Unable to handle kernel paging request at virtual address".

Another interesting thing: a Banana Pro works well (which also has an Allwinner A20 of the same revision) running the same Fedora 29 and latest kernels (e.g. kernel 4.20.10-200.fc29.armv7hl).

Since it happens on 2 different devices and with different power supplies (all with enough power), and the same type also works well on the working old kernel, a hardware issue is very unlikely.
I guess it has something to do with virtual memory. Any ideas?

[47322.960193] Unable to handle kernel paging request at virtual address 085675d0

That line is a bit suspicious. Anyway, cpufreq is known to cause those kinds of errors when the voltage / frequency association is not correct. Given the stack trace, and that the BananaPro doesn't have cpufreq enabled, my first guess would be that that's what's happening. Could you try using the performance governor and see if it's more stable? If it is, then using this: https://github.com/ssvb/cpuburn-arm/blob/master/cpufreq-ljt-stress-test will help you find the offending voltage-frequency couple.

For me it looks like they all have the same config regarding the cpu governor (Banana Pro, the old-kernel stable one, the new-kernel unstable ones).

The Banana Pro doesn't have a regulator set up, so it will only change the frequency, not the voltage.

They all have the ondemand governor set: I set on the 2 unstable "new kernel Banana Pi R1":
# Set to max performance
echo "performance" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor

What are the results?

Stable since more than around 1.5 days. Normally they would have crashed before such a long uptime. So it looks like the performance governor fixes it. I guess the crashes occur because of changing CPU voltage and clock changes and invalid data (e.g. invalid RAM contents might also be read, register problems, etc). Any ideas how to fix it for ondemand mode, too?

But it doesn't explain why it works with kernel 4.7.4 without any problems.

Running some stress tests was OK (I did that already in the past, but without setting the maximum performance governor).

Which stress tests have you been running?
Now:
while true; do echo ""; echo -n "TEMP : "; cat /sys/devices/virtual/thermal/thermal_zone0/temp; echo -n "VOLTAGE : "; cat /sys/devices/platform/soc@1c0/1c2ac00.i2c/i2c-0/0-0034/axp20x-ac-power-supply/power_supply/axp20x-ac/voltage_now; echo -n "CURRENT : "; cat /sys/devices/platform/soc@1c0/1c2ac00.i2c/i2c-0/0-0034/axp20x-ac-power-supply/power_supply/axp20x-ac/current_now; echo -n "CPU_FREQ0: "; cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq; echo -n "CPU_FREQ1: "; cat /sys/devices/system/cpu/cpu1/cpufreq/cpuinfo_cur_freq; sleep 1; done& stress -c 4 -t 900s

In the past also:
while true; do echo ""; echo -n "TEMP : "; cat /sys/devices/virtual/thermal/thermal_zone0/temp; echo -n "VOLTAGE : "; cat /sys/devices/platform/soc@1c0/1c2ac00.i2c/i2c-0/0-0034/axp20x-ac-power-supply/power_supply/axp20x-ac/voltage_now; echo -n "CURRENT : "; cat /sys/devices/platform/soc@1c0
Re: Banana Pi-R1 stability
On 27.02.2019 10:20, Maxime Ripard wrote:
On Sun, Feb 24, 2019 at 09:04:57AM +0100, Gerhard Wiesinger wrote:

Hello, I've 3 Banana Pi R1, one running with a self-compiled kernel 4.7.4-200.BPiR1.fc24.armv7hl and old Fedora 25, which is VERY STABLE; the 2 others are running with Fedora 29 latest, kernel 4.20.10-200.fc29.armv7hl. I tried a lot of kernels between around 4.11 (kernel-4.11.10-200.fc25.armv7hl) and 4.20.10, but all had crashes without any output on the serial console, or kernel panics, after a short period of time (minutes, hours, max. days).

Latest known working and stable self-compiled kernel: kernel 4.7.4-200.BPiR1.fc24.armv7hl: https://www.wiesinger.com/opensource/fedora/kernel/BananaPi-R1/

With 4.8.x the DSA b53 switch infrastructure was introduced, which didn't work (until ca8931948344c485569b04821d1f6bcebccd376b and kernel 4.18.x):
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/net/dsa/b53?h=v4.20.12
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/drivers/net/dsa/b53?h=v4.20.12
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/drivers/net/dsa/b53?h=v4.20.12=ca8931948344c485569b04821d1f6bcebccd376b

It has been fixed with kernel 4.18.x:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/drivers/net/dsa/b53?h=linux-4.18.y

So the current status is that the kernel crashes regularly, see some samples below. It is typically an "Unable to handle kernel paging request at virtual address".

Another interesting thing: a Banana Pro works well (which also has an Allwinner A20 of the same revision) running the same Fedora 29 and latest kernels (e.g. kernel 4.20.10-200.fc29.armv7hl).

Since it happens on 2 different devices and with different power supplies (all with enough power), and the same type also works well on the working old kernel, a hardware issue is very unlikely. I guess it has something to do with virtual memory. Any ideas?
[47322.960193] Unable to handle kernel paging request at virtual address 085675d0

That line is a bit suspicious. Anyway, cpufreq is known to cause those kinds of errors when the voltage / frequency association is not correct. Given the stack trace, and that the BananaPro doesn't have cpufreq enabled, my first guess would be that that's what's happening. Could you try using the performance governor and see if it's more stable? If it is, then using this: https://github.com/ssvb/cpuburn-arm/blob/master/cpufreq-ljt-stress-test will help you find the offending voltage-frequency couple.

Maxime

For me it looks like they all have the same config regarding the cpu governor (Banana Pro, the old-kernel stable one, the new-kernel unstable ones).

They all have the ondemand governor set: I set on the 2 unstable "new kernel Banana Pi R1":
# Set to max performance
echo "performance" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor

Running some stress tests was OK (I did that already in the past, but without setting the maximum performance governor). Let's see if it helps. Thnx.
Ciao,
Gerhard

# Banana Pro: Stable
./cpu_freq.sh
/sys/devices/system/cpu/cpu0/cpufreq/affected_cpus: 0 1
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq: 960000
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq: 960000
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq: 144000
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_transition_latency: 244144
/sys/devices/system/cpu/cpu0/cpufreq/related_cpus: 0 1
/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies: 144000 312000 528000 720000 864000 912000 960000
/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors: conservative userspace powersave ondemand performance schedutil
/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq: 960000
/sys/devices/system/cpu/cpu0/cpufreq/scaling_driver: cpufreq-dt
/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor: ondemand
/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq: 960000
/sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq: 144000
/sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed:
/sys/devices/system/cpu/cpu1/cpufreq/affected_cpus: 0 1
/sys/devices/system/cpu/cpu1/cpufreq/cpuinfo_cur_freq: 912000
/sys/devices/system/cpu/cpu1/cpufreq/cpuinfo_max_freq: 960000
/sys/devices/system/cpu/cpu1/cpufreq/cpuinfo_min_freq: 144000
/sys/devices/system/cpu/cpu1/cpufreq/cpuinfo_transition_latency: 244144
/sys/devices/system/cpu/cpu1/cpufreq/related_cpus: 0 1
/sys/devices/system/cpu/cpu1/cpufreq/scaling_available_frequencies: 144000 312000 528000 720000 864000 912000 960000
/sys/devices/system/cpu/cpu1/cpufreq/scaling_available_governors: conservative userspace powersave ondemand performance schedutil
/sys/devices/system/cpu/cpu1/cpufreq/scaling_cur_freq: 912000
/sys/devices/system/cpu/cpu1/cpufreq/scaling_driver: cpufreq-dt
/sys/devices/system/cpu/cpu1/cpufreq/
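The `cpu_freq.sh` helper invoked above isn't shown anywhere in the thread; a minimal stand-in that produces the same "path: value" listing could look like this (the function name and the optional base-directory argument are made up for illustration):

```shell
#!/bin/sh
# dump_cpufreq [BASEDIR] -> print every cpufreq attribute as "path: value"
# (read-only; BASEDIR defaults to the real sysfs location)
dump_cpufreq() {
    base=${1:-/sys/devices/system/cpu}
    for f in "$base"/cpu[0-9]*/cpufreq/*; do
        [ -r "$f" ] || continue
        # flatten multi-line values (e.g. governor lists) onto one line
        printf '%s: %s\n' "$f" "$(tr '\n' ' ' < "$f")"
    done
}

dump_cpufreq
```

Running it on two boards and diffing the output is a quick way to spot the governor or frequency-table differences discussed in this thread.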
Banana Pi-R1 stability
Hello,

I've 3 Banana Pi R1, one running with a self-compiled kernel 4.7.4-200.BPiR1.fc24.armv7hl and old Fedora 25, which is VERY STABLE; the 2 others are running with Fedora 29 latest, kernel 4.20.10-200.fc29.armv7hl. I tried a lot of kernels between around 4.11 (kernel-4.11.10-200.fc25.armv7hl) and 4.20.10, but all had crashes without any output on the serial console, or kernel panics, after a short period of time (minutes, hours, max. days).

Latest known working and stable self-compiled kernel: kernel 4.7.4-200.BPiR1.fc24.armv7hl: https://www.wiesinger.com/opensource/fedora/kernel/BananaPi-R1/

With 4.8.x the DSA b53 switch infrastructure was introduced, which didn't work (until ca8931948344c485569b04821d1f6bcebccd376b and kernel 4.18.x):
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/net/dsa/b53?h=v4.20.12
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/drivers/net/dsa/b53?h=v4.20.12
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/drivers/net/dsa/b53?h=v4.20.12=ca8931948344c485569b04821d1f6bcebccd376b

It has been fixed with kernel 4.18.x:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/drivers/net/dsa/b53?h=linux-4.18.y

So the current status is that the kernel crashes regularly, see some samples below. It is typically an "Unable to handle kernel paging request at virtual address".

Another interesting thing: a Banana Pro works well (which also has an Allwinner A20 of the same revision) running the same Fedora 29 and latest kernels (e.g. kernel 4.20.10-200.fc29.armv7hl).

Since it happens on 2 different devices and with different power supplies (all with enough power), and the same type also works well on the working old kernel, a hardware issue is very unlikely. I guess it has something to do with virtual memory. Any ideas? Thanx.
Ciao,
Gerhard

[47322.960193] Unable to handle kernel paging request at virtual address 085675d0
[47322.967832] pgd = c4567fe6
[47322.970913] [085675d0] *pgd=
[47322.974795] Internal error: Oops: 5 [#1] SMP ARM
[47322.979522] Modules linked in: xt_recent xt_comment ip_set_hash_net ip_set xt_addrtype iptable_nat nf_nat_ipv4 xt_mark iptable_mangle xt_CT iptable_raw xt_multiport xt_conntrack nfnetlink_log xt_NFLOG nf_log_ipv4 nf_log_common xt_LOG nf_conntrack_sane nf_conntrack_netlink nfnetlink nf_nat_tftp nf_nat_snmp_basic nf_conntrack_snmp nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_nat nf_conntrack_tftp nf_conntrack_sip nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp ts_kmp nf_conntrack_amanda nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c 8021q garp mrp rtl8xxxu arc4 rtl8192cu rtl_usb rtl8192c_common rtlwifi mac80211 cfg80211 huawei_cdc_ncm cdc_wdm cdc_ncm option usbnet mii usb_wwan rfkill b53_mdio b53_common dsa_core sun4i_codec bridge snd_soc_core stp llc axp20x_pek ac97_bus phylink snd_pcm_dmaengine axp20x_adc snd_pcm devlink sun4i_backend snd_timer
[47322.980312] sun4i_gpadc_iio snd sunxi_cir sun4i_ts nvmem_sunxi_sid rc_core soundcore sun4i_drm sunxi_wdt sun4i_ss sun4i_frontend sun4i_tcon des_generic sun4i_drm_hdmi sun8i_tcon_top drm_kms_helper spi_sun4i drm fb_sys_fops syscopyarea sysfillrect sysimgblt leds_gpio cpufreq_dt axp20x_usb_power axp20x_battery axp20x_ac_power industrialio axp20x_regulator pinctrl_axp209 mmc_block dwmac_sunxi stmmac_platform sunxi phy_generic stmmac musb_hdrc i2c_mv64xxx sun4i_gpadc ahci_sunxi udc_core phy_sun4i_usb libahci_platform ohci_platform ehci_platform sun4i_dma sunxi_mmc rtc_ds1307 i2c_dev
[47323.120402] CPU: 1 PID: 31989 Comm: kworker/1:4 Not tainted 4.20.10-200.fc29.armv7hl #1
[47323.128536] Hardware name: Allwinner sun7i (A20) Family
[47323.133910] Workqueue: events dbs_work_handler
[47323.138500] PC is at regulator_set_voltage_unlocked+0x14/0x304
[47323.144456] LR is at regulator_set_voltage+0x34/0x48
[47323.149524] pc : [] lr : [] psr: 60070013
[47323.155898] sp : eb23ddf8 ip : fp : c9567580
[47323.161222] r10: 365c0400 r9 : 000f4240 r8 : 000f4240
[47323.166552] r7 : ef692050 r6 : 08567580 r5 : 000f4240 r4 : 08567580
[47323.173190] r3 : r2 : 000f4240 r1 : 000f4240 r0 : 08567580
[47323.179832] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[47323.187085] Control: 10c5387d Table: 6d53c06a DAC: 0051
[47323.192950] Process kworker/1:4 (pid: 31989, stack limit = 0x40c1176f)
[47323.199582] Stack: (0xeb23ddf8 to 0xeb23e000)
[47323.204045] dde0: ef034e40 016e3600
[47323.212397] de00: 365c0400 365c0400 c9567580 c07404fc 02dc6c00 08567580 000f4240 000f4240
[47323.220748] de20: ef692050 000f4240 000f4240 365c0400 c9567580 c078bb38 c9657b2c
[47323.229100] de40: c9567580 c0957fe8 08954400 c12bcd08 c975683c 08954400 ee739a40 00690050
[47323.237450] de60: c9756800
net: dsa: b53: Keep CPU port as tagged in all VLANs - merge request
Hello David,

The DSA b53 net driver has been broken since the 4.15 kernels. This patch has not been merged into the latest 4.18 release yet (it is already in net.git). Can you please integrate it into 4.18.15?

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/drivers/net/dsa/b53/b53_common.c?id=ca8931948344c485569b04821d1f6bcebccd376b

References:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/drivers/net/dsa/b53/b53_common.c?h=v4.18.14

Thank you.

Ciao, Gerhard
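Whether the requested fix has already reached a given stable release can be checked mechanically before asking for a backport. A sketch, to be run inside a clone of the stable tree; commit_in_ref is a hypothetical helper name, and the hash is the b53 fix referenced above:

```shell
# commit_in_ref: succeed (exit 0) if commit $1 is an ancestor of ref $2,
# i.e. the fix is already contained in that release.
commit_in_ref() {
    git merge-base --is-ancestor "$1" "$2"
}

# Usage inside linux-stable.git (illustrative):
#   commit_in_ref ca8931948344c485569b04821d1f6bcebccd376b v4.18.15 \
#       && echo "fix already included"
```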
Re: B53 DSA switch problem on Banana Pi-R1 on Fedora 26 - systemd-networkd problem
On 27.05.2018 22:31, Florian Fainelli wrote:
Le 05/27/18 à 12:01, Gerhard Wiesinger a écrit :
On 24.05.2018 08:22, Gerhard Wiesinger wrote:
On 24.05.2018 07:29, Gerhard Wiesinger wrote:

After some analysis with Florian (thnx) we found out that the current implementation is broken:
https://patchwork.ozlabs.org/patch/836538/
https://github.com/torvalds/linux/commit/c499696e7901bda18385ac723b7bd27c3a4af624#diff-a2b6f8d89e18de600e873ac3ac43fa1d

Florian's comment: c499696e7901bda18385ac723b7bd27c3a4af624 ("net: dsa: b53: Stop using dev->cpu_port incorrectly") since it would result in no longer setting the CPU port as tagged for a specific VLAN. Easiest way for you right now is to just revert it, but this needs some more thought for a proper upstream change. I will think about it some more.

Can confirm 4.14.18-200.fc26.armv7hl works; 4.15.x should be broken.

# Kernel 4.14.x ok
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/log/drivers/net/dsa/b53?h=v4.14.43
# Kernel 4.15.x should be NOT ok
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/log/drivers/net/dsa/b53?h=v4.15.18

Kernel 4.14.18-300.fc27.armv7hl works well so far, even with the FC28 update. Florian sent me a patch to try for 4.16.x.

So does my patch make 4.16 work correctly for you now? If so, can I just submit it and copy you?

I got the commands below to work as manual script commands. Afterwards I wrote a systemd-networkd config where I have a strange problem: when another machine sends an IPv6 multicast packet to the bridge, it is sent back out via the network interface, but with the source MAC of the other machine's bridge.
dmesg from the other machine:
[117768.330444] br0: received packet on lan0 with own address as source address (addr:a0:36:9f:ab:cd:ef, vlan:0)
[117768.334887] br0: received packet on lan0 with own address as source address (addr:a0:36:9f:ab:cd:ef, vlan:0)
[117768.339281] br0: received packet on lan0 with own address as source address (addr:a0:36:9f:ab:cd:ef, vlan:0)

And: if I just enter this command after e.g. a systemd-networkd restart, everything is fine forever:

# Not OK (the dmesg message above is triggered on a remote computer, the whole switching network gets unstable, ssh terminals close, packet loss, etc.)
systemctl restart systemd-networkd
# OK again when this command is entered
bridge vlan add dev wan vid 102 pvid untagged

The brctl show, ip link, bridge vlan, bridge link commands, etc. all look the same, as do the /sys/class/net/br0/bridge and /sys/class/net/br1/bridge settings. Is the systemd config correct? Any ideas?

You should not have eth0.101 and eth0.102 enslaved in a bridge at all; this is what is causing the bridge to be confused. Remember what I wrote to you before: with the current b53 driver, which does not have any tagging enabled, the lanX interfaces and brX interfaces are only used for control and should not be used for passing any data. The only network device that will be passing data is eth0, which is why we need to set up VLAN interfaces to pop/push the VLAN id accordingly. I have no idea why manual vs. systemd does not work, but you can most certainly troubleshoot that by comparing the bridge/ip outputs.

So is this then the correct structure?

br1
- lan1 (with VID 101)
- lan2 (with VID 101)
- lan3 (with VID 101)
- lan4 (with VID 101)
brlan
- eth0.101
- wlan0 (currently not active, could be optimized without bridge but for future comfort)
br2
- wan (with VID 102) (could be optimized without bridge but for future comfort)
- future1
brwan
- eth0.102 (could be optimized without bridge but for future comfort)
- future2

Regarding systemd vs. manual config: as I said, I didn't find any difference in the bridge/ip outputs. As they are broken (see other message), maybe something else is broken, too.

Thnx.

Ciao, Gerhard
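The "last good 4.14 / first bad 4.15" comparison above can be narrowed to concrete commits with git. A sketch, to be run inside a clone of linux-stable.git; commits_touching is a hypothetical helper, and the tags and path in the usage line come from the thread:

```shell
# commits_touching: list the commits between two refs that touched a path,
# to narrow a regression down to candidate changes.
commits_touching() {
    git log --oneline "$1..$2" -- "$3"
}

# Usage inside linux-stable.git (illustrative):
#   commits_touching v4.14 v4.15 drivers/net/dsa/b53/
```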
Re: B53 DSA switch problem on Banana Pi-R1 on Fedora 26 - systemd-networkd problem
On 27.05.2018 22:35, Florian Fainelli wrote:
Le 05/27/18 à 12:18, Gerhard Wiesinger a écrit :
On 27.05.2018 21:01, Gerhard Wiesinger wrote:
On 24.05.2018 08:22, Gerhard Wiesinger wrote:
On 24.05.2018 07:29, Gerhard Wiesinger wrote:

After some analysis with Florian (thnx) we found out that the current implementation is broken:
https://patchwork.ozlabs.org/patch/836538/
https://github.com/torvalds/linux/commit/c499696e7901bda18385ac723b7bd27c3a4af624#diff-a2b6f8d89e18de600e873ac3ac43fa1d

Florian's comment: c499696e7901bda18385ac723b7bd27c3a4af624 ("net: dsa: b53: Stop using dev->cpu_port incorrectly") since it would result in no longer setting the CPU port as tagged for a specific VLAN. Easiest way for you right now is to just revert it, but this needs some more thought for a proper upstream change. I will think about it some more.

Can confirm 4.14.18-200.fc26.armv7hl works; 4.15.x should be broken.

# Kernel 4.14.x ok
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/log/drivers/net/dsa/b53?h=v4.14.43
# Kernel 4.15.x should be NOT ok
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/log/drivers/net/dsa/b53?h=v4.15.18

Forgot to mention: what's also strange is that the VLAN ID is very high:

# 4.14.18-300.fc27.armv7hl, iproute-4.15.0-1.fc28.armv7hl
ip -d link show eth0.101 | grep "vlan protocol"
vlan protocol 802.1Q id 3069279796
ip -d link show eth0.102 | grep "vlan protocol"
vlan protocol 802.1Q id 3068673588

On older kernels this looks ok (4.12.8-200.fc25.armv7hl, iproute-4.11.0-1.fc25.armv7hl):
ip -d link show eth0.101 | grep "vlan protocol"
vlan protocol 802.1Q id 101
ip -d link show eth0.102 | grep "vlan protocol"
vlan protocol 802.1Q id 102

Ideas?
That is quite likely a kernel/iproute2 issue. If you configured the switch through bridge vlan to have the ports in VLAN 101 and VLAN 102, and you do indeed see frames entering eth0 with these VLAN IDs, then clearly the bridge -> switchdev -> dsa -> b53 part is working just fine, and what you are seeing is some form of kernel header/netlink incompatibility.

Yes, sniffing on eth0 shows the correct VLAN IDs, e.g. 101. Yes, my guess is that the tools are wrong and report random values that differ between two calls (e.g. also promiscuity), see below. Who can fix it?

BTW: on FC27 it is the same issue with the same kernel version, but I guess with an older iproute version.

Ciao, Gerhard

ip -d link show eth0.101
13: eth0.101@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UP mode DEFAULT group default qlen 1000
    link/ether 02:18:09:ab:cd:ef brd ff:ff:ff:ff:ff:ff promiscuity 3068661300
    vlan protocol 802.1Q id 3068661300
    bridge_slave state forwarding priority 32 cost 4 hairpin off guard off root_block off fastleave off learning on flood on port_id 0x8005 port_no 0x5 designated_port 3068661300 designated_cost 3068661300 designated_bridge 8000.66:5d:a2:ab:cd:ef designated_root 8000.66:5d:a2:ab:cd:ef hold_timer 0.00 message_age_timer 0.00 forward_delay_timer 0.00 topology_change_ack 3068661300 config_pending 3068661300 proxy_arp off proxy_arp_wifi off mcast_router 3068661300 mcast_fast_leave off mcast_flood on vlan_tunnel off addrgenmode eui64 numtxqueues 3068661300 numrxqueues 3068661300 gso_max_size 3068661300 gso_max_segs 3068661300

ip -d link show eth0.101
13: eth0.101@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UP mode DEFAULT group default qlen 1000
    link/ether 02:18:09:ab:cd:ef brd ff:ff:ff:ff:ff:ff promiscuity 3068735028
    vlan protocol 802.1Q id 3068735028
    bridge_slave state forwarding priority 32 cost 4 hairpin off guard off root_block off fastleave off learning on flood on port_id 0x8005 port_no 0x5 designated_port 3068735028 designated_cost 3068735028 designated_bridge 8000.66:5d:ab:cd:ef designated_root 8000.66:5d:a2:ab:cd:ef hold_timer 0.00 message_age_timer 0.00 forward_delay_timer 0.00 topology_change_ack 3068735028 config_pending 3068735028 proxy_arp off proxy_arp_wifi off mcast_router 3068735028 mcast_fast_leave off mcast_flood on vlan_tunnel off addrgenmode eui64 numtxqueues 3068735028 numrxqueues 3068735028 gso_max_size 3068735028 gso_max_segs 3068735028
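One quick way to see that the values above are tool artifacts rather than real configuration: an 802.1Q VID must be in the range 1-4094, so anything larger printed by ip -d link cannot have come from the switch setup. A sketch; check_vid is a hypothetical helper:

```shell
# check_vid: classify a reported 802.1Q VLAN ID. Valid VIDs are 1-4094;
# anything outside that range indicates a kernel/iproute2 netlink
# mismatch, not a real switch setting.
check_vid() {
    if [ "$1" -ge 1 ] && [ "$1" -le 4094 ]; then
        echo "plausible"
    else
        echo "bogus"
    fi
}

check_vid 101          # the VID actually configured on the bridge
check_vid 3069279796   # the value reported by the broken tool/kernel pair
```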
Re: B53 DSA switch problem on Banana Pi-R1 on Fedora 26 - systemd-networkd problem
On 27.05.2018 21:01, Gerhard Wiesinger wrote:
On 24.05.2018 08:22, Gerhard Wiesinger wrote:
On 24.05.2018 07:29, Gerhard Wiesinger wrote:

After some analysis with Florian (thnx) we found out that the current implementation is broken:
https://patchwork.ozlabs.org/patch/836538/
https://github.com/torvalds/linux/commit/c499696e7901bda18385ac723b7bd27c3a4af624#diff-a2b6f8d89e18de600e873ac3ac43fa1d

Florian's comment: c499696e7901bda18385ac723b7bd27c3a4af624 ("net: dsa: b53: Stop using dev->cpu_port incorrectly") since it would result in no longer setting the CPU port as tagged for a specific VLAN. Easiest way for you right now is to just revert it, but this needs some more thought for a proper upstream change. I will think about it some more.

Can confirm 4.14.18-200.fc26.armv7hl works; 4.15.x should be broken.

# Kernel 4.14.x ok
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/log/drivers/net/dsa/b53?h=v4.14.43
# Kernel 4.15.x should be NOT ok
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/log/drivers/net/dsa/b53?h=v4.15.18

Forgot to mention: what's also strange is that the VLAN ID is very high:

# 4.14.18-300.fc27.armv7hl, iproute-4.15.0-1.fc28.armv7hl
ip -d link show eth0.101 | grep "vlan protocol"
vlan protocol 802.1Q id 3069279796
ip -d link show eth0.102 | grep "vlan protocol"
vlan protocol 802.1Q id 3068673588

On older kernels this looks ok (4.12.8-200.fc25.armv7hl, iproute-4.11.0-1.fc25.armv7hl):
ip -d link show eth0.101 | grep "vlan protocol"
vlan protocol 802.1Q id 101
ip -d link show eth0.102 | grep "vlan protocol"
vlan protocol 802.1Q id 102

Ideas?

Thank you.

Ciao, Gerhard
Re: B53 DSA switch problem on Banana Pi-R1 on Fedora 26 - systemd-networkd problem
On 24.05.2018 08:22, Gerhard Wiesinger wrote:
On 24.05.2018 07:29, Gerhard Wiesinger wrote:

After some analysis with Florian (thnx) we found out that the current implementation is broken:
https://patchwork.ozlabs.org/patch/836538/
https://github.com/torvalds/linux/commit/c499696e7901bda18385ac723b7bd27c3a4af624#diff-a2b6f8d89e18de600e873ac3ac43fa1d

Florian's comment: c499696e7901bda18385ac723b7bd27c3a4af624 ("net: dsa: b53: Stop using dev->cpu_port incorrectly") since it would result in no longer setting the CPU port as tagged for a specific VLAN. Easiest way for you right now is to just revert it, but this needs some more thought for a proper upstream change. I will think about it some more.

Can confirm 4.14.18-200.fc26.armv7hl works; 4.15.x should be broken.

# Kernel 4.14.x ok
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/log/drivers/net/dsa/b53?h=v4.14.43
# Kernel 4.15.x should be NOT ok
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/log/drivers/net/dsa/b53?h=v4.15.18

Kernel 4.14.18-300.fc27.armv7hl works well so far, even with the FC28 update. Florian sent me a patch to try for 4.16.x.

I got the commands below to work as manual script commands. Afterwards I wrote a systemd-networkd config where I have a strange problem: when another machine sends an IPv6 multicast packet to the bridge, it is sent back out via the network interface, but with the source MAC of the other machine's bridge.

dmesg from the other machine:
[117768.330444] br0: received packet on lan0 with own address as source address (addr:a0:36:9f:ab:cd:ef, vlan:0)
[117768.334887] br0: received packet on lan0 with own address as source address (addr:a0:36:9f:ab:cd:ef, vlan:0)
[117768.339281] br0: received packet on lan0 with own address as source address (addr:a0:36:9f:ab:cd:ef, vlan:0)

And: if I just enter this command after e.g.
a systemd-networkd restart, everything is fine forever:

# Not OK (the dmesg message above is triggered on a remote computer, the whole switching network gets unstable, ssh terminals close, packet loss, etc.)
systemctl restart systemd-networkd
# OK again when this command is entered
bridge vlan add dev wan vid 102 pvid untagged

The brctl show, ip link, bridge vlan, bridge link commands, etc. all look the same, as do the /sys/class/net/br0/bridge and /sys/class/net/br1/bridge settings. Is the systemd config correct? Any ideas?

Thank you.

Ciao, Gerhard

brctl show
bridge name    bridge id            STP enabled    interfaces
br0            8000.665da2abcdef    no             eth0.101 lan1 lan2 lan3 lan4
br1            8000.9a4557abcdef    no             eth0.102 wan

bridge vlan show
port        vlan ids
lan2        101 PVID Egress Untagged
lan3        101 PVID Egress Untagged
lan4        101 PVID Egress Untagged
wan         102 PVID Egress Untagged
lan1        101 PVID Egress Untagged
br1         None
br0         None
eth0.102    None
eth0.101    None

OK: manual scripts

ip link add link eth0 name eth0.101 type vlan id 101
ip link set eth0.101 up
ip link add link eth0 name eth0.102 type vlan id 102
ip link set eth0.102 up
ip link add br0 type bridge
ip link set dev br0 type bridge stp_state 0
ip link set lan1 master br0
bridge vlan add dev lan1 vid 101 pvid untagged
ip link set lan1 up
ip link set lan2 master br0
bridge vlan add dev lan2 vid 101 pvid untagged
ip link set lan2 up
ip link set lan3 master br0
bridge vlan add dev lan3 vid 101 pvid untagged
ip link set lan3 up
ip link set lan4 master br0
bridge vlan add dev lan4 vid 101 pvid untagged
ip link set lan4 up
ip link set eth0.101 master br0
ip link set eth0.101 up
ip link set br0 up
ip link add br1 type bridge
ip link set dev br1 type bridge stp_state 0
ip link set wan master br1
bridge vlan add dev wan vid 102 pvid untagged
ip link set wan up
ip link set eth0.102 master br1
ip link set eth0.102 up
ip link set br1 up
ip addr flush dev br0
ip addr add 192.168.0.250/24 dev br0
ip route del default via 192.168.0.1 dev br0
ip route add default via 192.168.0.1 dev br0
ip addr flush dev br1
ip addr add 192.168.1.1/24 dev br1

NOK: after a multicast packet
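For comparison, a single manual step such as "bridge vlan add dev lan1 vid 101 pvid untagged" corresponds roughly to the following systemd-networkd fragment (a sketch; the file name is illustrative, and the [BridgeVLAN] section needs a reasonably recent systemd — if networkd enslaves the port without applying it, that could match the symptom that one manual bridge vlan add repairs the setup):

```ini
# /etc/systemd/network/60-br0-lan1.network  (illustrative file name)
[Match]
Name=lan1

[Network]
Bridge=br0

[BridgeVLAN]
VLAN=101
PVID=101
EgressUntagged=101
```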
Re: B53 DSA switch problem on Banana Pi-R1 on Fedora 26
On 24.05.2018 07:29, Gerhard Wiesinger wrote:

After some analysis with Florian (thnx) we found out that the current implementation is broken:
https://patchwork.ozlabs.org/patch/836538/
https://github.com/torvalds/linux/commit/c499696e7901bda18385ac723b7bd27c3a4af624#diff-a2b6f8d89e18de600e873ac3ac43fa1d

Florian's comment: c499696e7901bda18385ac723b7bd27c3a4af624 ("net: dsa: b53: Stop using dev->cpu_port incorrectly") since it would result in no longer setting the CPU port as tagged for a specific VLAN. Easiest way for you right now is to just revert it, but this needs some more thought for a proper upstream change. I will think about it some more.

Can confirm 4.14.18-200.fc26.armv7hl works; 4.15.x should be broken.

# Kernel 4.14.x ok
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/log/drivers/net/dsa/b53?h=v4.14.43
# Kernel 4.15.x should be NOT ok
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/log/drivers/net/dsa/b53?h=v4.15.18

Ciao, Gerhard
Re: B53 DSA switch problem on Banana Pi-R1 on Fedora 26
After some analysis with Florian (thnx) we found out that the current implementation is broken:

https://patchwork.ozlabs.org/patch/836538/
https://github.com/torvalds/linux/commit/c499696e7901bda18385ac723b7bd27c3a4af624#diff-a2b6f8d89e18de600e873ac3ac43fa1d

Florian's comment: commit c499696e7901bda18385ac723b7bd27c3a4af624 ("net: dsa: b53: Stop using dev->cpu_port incorrectly") is the culprit, since it results in the CPU port no longer being set as tagged for a specific VLAN. "Easiest way for you right now is to just revert it, but this needs some more thoughts for a proper upstream change. I will think about it some more."

Ciao,
Gerhard
Re: B53 DSA switch problem on Banana Pi-R1 on Fedora 26
On 23.05.2018 19:55, Florian Fainelli wrote:
On 05/23/2018 10:35 AM, Gerhard Wiesinger wrote:
On 23.05.2018 17:28, Florian Fainelli wrote:

And in the future (time plan)?

If you don't care about multicast then you can use those patches: https://github.com/ffainelli/linux/commit/de055bf5f34e9806463ab2793e0852f5dfc380df and you have to change the part of drivers/net/dsa/b53/b53_common.c that returns DSA_TAG_PROTO_NONE for 53125:

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 9f561fe505cb..3c64f026a8ce 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1557,7 +1557,7 @@ enum dsa_tag_protocol b53_get_tag_protocol(struct dsa_switch *ds, int port)
	 * mode to be turned on which means we need to specifically manage ARL
	 * misses on multicast addresses (TBD).
	 */
-	if (is5325(dev) || is5365(dev) || is539x(dev) || is531x5(dev) ||
+	if (is5325(dev) || is5365(dev) || is539x(dev) ||
	    !b53_can_enable_brcm_tags(ds, port))
		return DSA_TAG_PROTO_NONE;

That would bring Broadcom tags to the 53125 switch and you would be able to use the configuration lines from Andrew in that case.

What's the plan here regarding these 2 config option modes (how do you call them?)?

Broadcom tags are the underlying feature that provides per-port information about the packets going in and out. Turning on Broadcom tags requires turning on managed mode, which means that the host now has to manage how MAC addresses are programmed into the switch. It's not rocket science, but I don't have a good test framework to automate the testing of those changes yet. If you are willing to help in the testing, I can certainly give you patches to try.

Yes, patches are welcome.

I mean, will this be a breaking change in the future where config has to be done in a different way then?

When Broadcom tags are enabled the switch becomes usable the way Andrew expressed it. The only difference that makes to your configuration, if you want e.g. VLAN 101 for ports 1-4 and VLAN 102 for port 5, is that you no longer create eth0.101 and eth0.102, but br0.101 and br0.102 instead.

I think the documentation (dsa.txt) should provide more examples.

Or will it be configurable via module parameters or /proc or /sys filesystem options?

We might be able to expose a sysfs attribute which shows the type of tagging enabled by a particular switch; that way scripts can detect which variant is required: configuring the host controller or the bridge. Would that be acceptable?

Yes, acceptable for me. But what's the long-term concept for DSA (and also other implementations)?
- "old" mode variant, mode can only be read
- "new" mode variant, mode can only be read
- mode settable/configurable by the user, mode can be read

In general: OK, thank you for your explanations. I think DSA (at least with b53) has had several problem areas (implementation bugs, missing documentation, lack of distribution support, e.g. systemd) which were not understood by the users. So everything which clarifies these topics for DSA in the future is welcome.

BTW: systemd-networkd support for DSA #7478
https://github.com/systemd/systemd/issues/7478

Ciao,
Gerhard
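For illustration, the br0.101/br0.102 scheme Florian describes could look roughly like the following sketch. This is an assumption about the eventual setup (interface names br0, lan1-lan4, wan are taken from this thread; the exact iproute2 flags may differ by version), not a confirmed recipe:

```shell
# Make the bridge VLAN-aware and let it carry both VLANs on the CPU side.
ip link set dev br0 type bridge vlan_filtering 1
bridge vlan add dev br0 vid 101 self
bridge vlan add dev br0 vid 102 self

# With Broadcom tags, the VLAN upper devices sit on the bridge,
# not on the DSA master eth0 as before:
ip link add link br0 name br0.101 type vlan id 101
ip link add link br0 name br0.102 type vlan id 102
ip link set dev br0.101 up
ip link set dev br0.102 up
```

IP addresses (or the DHCP client/server) would then live on br0.101/br0.102 instead of eth0.101/eth0.102.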
Re: B53 DSA switch problem on Banana Pi-R1 on Fedora 26
On 23.05.2018 19:47, Florian Fainelli wrote:
On 05/23/2018 10:29 AM, Gerhard Wiesinger wrote:
On 23.05.2018 17:50, Florian Fainelli wrote:
On 05/23/2018 08:28 AM, Florian Fainelli wrote:
On 05/22/2018 09:49 PM, Gerhard Wiesinger wrote:
On 22.05.2018 22:42, Florian Fainelli wrote:
On 05/22/2018 01:16 PM, Andrew Lunn wrote:

Planned network structure will be as with 4.7.x kernels:

br0 <=> eth0.101 <=> eth0 (vlan 101 tagged) <=> lan1-lan4 (vlan 101 untagged pvid)
br1 <=> eth0.102 <=> eth0 (vlan 102 tagged) <=> wan (vlan 102 untagged pvid)

Do you even need these vlans?

Yes, remember, b53 does not currently turn on Broadcom tags, so the only way to segregate traffic is to have VLANs for that.

Are you doing this for port separation? To keep lan1-4 traffic separate from wan? DSA does that by default, no vlan needed. So you can just do

ip link add name br0 type bridge
ip link set dev br0 up
ip link set dev lan1 master br0
ip link set dev lan2 master br0
ip link set dev lan3 master br0
ip link set dev lan4 master br0

and use interface wan directly, no bridge needed.

That would work once Broadcom tags are turned on which requires turning on managed mode, which requires work that I have not been able to get done :)

Setup with swconfig:

#!/usr/bin/bash
INTERFACE=eth0

# Delete all IP addresses and get link up
ip addr flush dev ${INTERFACE}
ip link set ${INTERFACE} up

# Lamobo R1 aka BPi R1 Routerboard
#
# Speaker | LAN1 | LAN2 | LAN3 | LAN4 || LAN5 | HDMI
# SW-Port |  P2  |  P1  |  P0  |  P4  ||  P3  |
# VLAN    |  11  |  12  |  13  |  14  ||ALL(t)|
#
# Switch-Port P8 - ALL(t) boards internal CPU Port

# Setup switch
swconfig dev ${INTERFACE} set reset 1
swconfig dev ${INTERFACE} set enable_vlan 1
swconfig dev ${INTERFACE} vlan 101 set ports '3 8t'
swconfig dev ${INTERFACE} vlan 102 set ports '4 0 1 2 8t'
swconfig dev ${INTERFACE} set apply 1

How to achieve this setup CURRENTLY with DSA?

Your first email had the right programming sequence, but you did not answer whether you have CONFIG_BRIDGE_VLAN_FILTERING enabled or not, which is likely your problem. Here are some reference configurations that should work: https://github.com/armbian/build/issues/511#issuecomment-320473246

I know, some comments are from me but none of them worked, therefore on LKML :-)

I see, maybe you could have started there, that would have saved me a trip to github to find out the thread.

/boot/config-4.16.7-100.fc26.armv7hl:CONFIG_BRIDGE_VLAN_FILTERING=y

so this can't be the issue, any further ideas?

Yes, remove the "self" from your bridge vlan commands, I don't see that being necessary.

Same:

[root@bpi ~]# bridge vlan add dev lan1 vid 101 pvid untagged self
RTNETLINK answers: Operation not supported
[root@bpi ~]# bridge vlan add dev lan1 vid 101 pvid untagged
RTNETLINK answers: Operation not supported
[root@bpi ~]# bridge vlan add dev lan1 vid 101
RTNETLINK answers: Operation not supported

Any ideas how to debug further?

On my 2nd Banana Pi-R1, still on Fedora 25 with kernel 4.12.8-200.fc25.armv7hl, the commands still work well, but I wanted to test the upgrade on another one.

/boot/config-4.12.8-200.fc25.armv7hl:CONFIG_BRIDGE_VLAN_FILTERING=y

Is using an upstream or self-compiled kernel an option at all? I have no clue what is in a distribution kernel.

Typically the Fedora kernels work fine (long-term experience since Fedora Core 1 from 2004 :-) ). I had some custom patches in there in the past for the external RTC and b53_switch.kernel_4.5+.patch, but otherwise no issues. Therefore with upstream DSA support that should be fine then. Infos can be found here:

https://koji.fedoraproject.org/koji/packageinfo?packageID=8
https://koji.fedoraproject.org/koji/buildinfo?buildID=1078638

Ciao,
Gerhard
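The CONFIG_BRIDGE_VLAN_FILTERING check discussed above can be scripted. The sketch below classifies a kernel-config line; it uses a hard-coded sample string so it runs anywhere, but on a real box you would feed it the output of `grep CONFIG_BRIDGE_VLAN_FILTERING "/boot/config-$(uname -r)"` (or `zcat /proc/config.gz | grep ...` on kernels with IKCONFIG):

```shell
# Stand-in for the grep result from the running kernel's config file.
sample='CONFIG_BRIDGE_VLAN_FILTERING=y'
case "$sample" in
  *=y) status=built-in ;;      # compiled into the kernel
  *=m) status=module ;;        # generic pattern; this option is a bool
  *)   status=disabled ;;      # absent or "is not set"
esac
echo "CONFIG_BRIDGE_VLAN_FILTERING: $status"
```

As the thread shows, the option being enabled is necessary but not sufficient; the switchdev side must also accept the VLAN operations.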
Re: B53 DSA switch problem on Banana Pi-R1 on Fedora 26
On 23.05.2018 17:28, Florian Fainelli wrote:

And in the future (time plan)?

If you don't care about multicast then you can use those patches: https://github.com/ffainelli/linux/commit/de055bf5f34e9806463ab2793e0852f5dfc380df and you have to change the part of drivers/net/dsa/b53/b53_common.c that returns DSA_TAG_PROTO_NONE for 53125:

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 9f561fe505cb..3c64f026a8ce 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1557,7 +1557,7 @@ enum dsa_tag_protocol b53_get_tag_protocol(struct dsa_switch *ds, int port)
	 * mode to be turned on which means we need to specifically manage ARL
	 * misses on multicast addresses (TBD).
	 */
-	if (is5325(dev) || is5365(dev) || is539x(dev) || is531x5(dev) ||
+	if (is5325(dev) || is5365(dev) || is539x(dev) ||
	    !b53_can_enable_brcm_tags(ds, port))
		return DSA_TAG_PROTO_NONE;

That would bring Broadcom tags to the 53125 switch and you would be able to use the configuration lines from Andrew in that case.

What's the plan here regarding these 2 config option modes (how do you call them?)? I mean, will this be a breaking change in the future where config has to be done in a different way then? Or will it be configurable via module parameters or /proc or /sys filesystem options?

Thank you.

Ciao,
Gerhard
Re: B53 DSA switch problem on Banana Pi-R1 on Fedora 26
On 23.05.2018 17:50, Florian Fainelli wrote:
On 05/23/2018 08:28 AM, Florian Fainelli wrote:
On 05/22/2018 09:49 PM, Gerhard Wiesinger wrote:
On 22.05.2018 22:42, Florian Fainelli wrote:
On 05/22/2018 01:16 PM, Andrew Lunn wrote:

Planned network structure will be as with 4.7.x kernels:

br0 <=> eth0.101 <=> eth0 (vlan 101 tagged) <=> lan1-lan4 (vlan 101 untagged pvid)
br1 <=> eth0.102 <=> eth0 (vlan 102 tagged) <=> wan (vlan 102 untagged pvid)

Do you even need these vlans?

Yes, remember, b53 does not currently turn on Broadcom tags, so the only way to segregate traffic is to have VLANs for that.

Are you doing this for port separation? To keep lan1-4 traffic separate from wan? DSA does that by default, no vlan needed. So you can just do

ip link add name br0 type bridge
ip link set dev br0 up
ip link set dev lan1 master br0
ip link set dev lan2 master br0
ip link set dev lan3 master br0
ip link set dev lan4 master br0

and use interface wan directly, no bridge needed.

That would work once Broadcom tags are turned on which requires turning on managed mode, which requires work that I have not been able to get done :)

Setup with swconfig:

#!/usr/bin/bash
INTERFACE=eth0

# Delete all IP addresses and get link up
ip addr flush dev ${INTERFACE}
ip link set ${INTERFACE} up

# Lamobo R1 aka BPi R1 Routerboard
#
# Speaker | LAN1 | LAN2 | LAN3 | LAN4 || LAN5 | HDMI
# SW-Port |  P2  |  P1  |  P0  |  P4  ||  P3  |
# VLAN    |  11  |  12  |  13  |  14  ||ALL(t)|
#
# Switch-Port P8 - ALL(t) boards internal CPU Port

# Setup switch
swconfig dev ${INTERFACE} set reset 1
swconfig dev ${INTERFACE} set enable_vlan 1
swconfig dev ${INTERFACE} vlan 101 set ports '3 8t'
swconfig dev ${INTERFACE} vlan 102 set ports '4 0 1 2 8t'
swconfig dev ${INTERFACE} set apply 1

How to achieve this setup CURRENTLY with DSA?

Your first email had the right programming sequence, but you did not answer whether you have CONFIG_BRIDGE_VLAN_FILTERING enabled or not, which is likely your problem.

Here are some reference configurations that should work: https://github.com/armbian/build/issues/511#issuecomment-320473246

I know, some comments are from me but none of them worked, therefore on LKML :-)

/boot/config-4.16.7-100.fc26.armv7hl:CONFIG_BRIDGE_VLAN_FILTERING=y

so this can't be the issue, any further ideas?

On my 2nd Banana Pi-R1, still on Fedora 25 with kernel 4.12.8-200.fc25.armv7hl, the commands still work well, but I wanted to test the upgrade on another one.

/boot/config-4.12.8-200.fc25.armv7hl:CONFIG_BRIDGE_VLAN_FILTERING=y

Thnx.

Ciao,
Gerhard
Re: B53 DSA switch problem on Banana Pi-R1 on Fedora 26
On 22.05.2018 22:42, Florian Fainelli wrote:
On 05/22/2018 01:16 PM, Andrew Lunn wrote:

Planned network structure will be as with 4.7.x kernels:

br0 <=> eth0.101 <=> eth0 (vlan 101 tagged) <=> lan1-lan4 (vlan 101 untagged pvid)
br1 <=> eth0.102 <=> eth0 (vlan 102 tagged) <=> wan (vlan 102 untagged pvid)

Do you even need these vlans?

Yes, remember, b53 does not currently turn on Broadcom tags, so the only way to segregate traffic is to have VLANs for that.

Are you doing this for port separation? To keep lan1-4 traffic separate from wan? DSA does that by default, no vlan needed. So you can just do

ip link add name br0 type bridge
ip link set dev br0 up
ip link set dev lan1 master br0
ip link set dev lan2 master br0
ip link set dev lan3 master br0
ip link set dev lan4 master br0

and use interface wan directly, no bridge needed.

That would work once Broadcom tags are turned on which requires turning on managed mode, which requires work that I have not been able to get done :)

Setup with swconfig:

#!/usr/bin/bash
INTERFACE=eth0

# Delete all IP addresses and get link up
ip addr flush dev ${INTERFACE}
ip link set ${INTERFACE} up

# Lamobo R1 aka BPi R1 Routerboard
#
# Speaker | LAN1 | LAN2 | LAN3 | LAN4 || LAN5 | HDMI
# SW-Port |  P2  |  P1  |  P0  |  P4  ||  P3  |
# VLAN    |  11  |  12  |  13  |  14  ||ALL(t)|
#
# Switch-Port P8 - ALL(t) boards internal CPU Port

# Setup switch
swconfig dev ${INTERFACE} set reset 1
swconfig dev ${INTERFACE} set enable_vlan 1
swconfig dev ${INTERFACE} vlan 101 set ports '3 8t'
swconfig dev ${INTERFACE} vlan 102 set ports '4 0 1 2 8t'
swconfig dev ${INTERFACE} set apply 1

How to achieve this setup CURRENTLY with DSA?

And in the future (time plan)?

Thank you.

Ciao,
Gerhard
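For reference, the closest DSA-side equivalent of the planned 101-for-LAN / 102-for-WAN layout would be per-port bridge VLANs. This is only a sketch under assumptions (ports named lan1-lan4/wan as the b53 driver exposes them, and a kernel whose switchdev VLAN operations actually work, which is exactly what is at issue in this thread):

```shell
# Create a VLAN-aware bridge and enslave all switch ports to it.
ip link add name br0 type bridge vlan_filtering 1
ip link set dev br0 up
for p in lan1 lan2 lan3 lan4; do
    ip link set dev "$p" master br0
    ip link set dev "$p" up
    bridge vlan add dev "$p" vid 101 pvid untagged
done

# wan carries VLAN 102, analogous to the swconfig vlan 102 line.
ip link set dev wan master br0
ip link set dev wan up
bridge vlan add dev wan vid 102 pvid untagged

# Carry both VLANs tagged towards the host on the bridge itself.
bridge vlan add dev br0 vid 101 self
bridge vlan add dev br0 vid 102 self
```

Whether the port-VLAN programming is offloaded to the switch (rather than rejected with "Operation not supported") depends on the driver state being debated here.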
B53 DSA switch problem on Banana Pi-R1 on Fedora 26
Hello,

I'm trying to get the B53 DSA switch on the Banana Pi-R1 running on Fedora 26 (I will upgrade to Fedora 27 and Fedora 28 when networking works again). Previously the switch was configured with swconfig without any problems.

Kernel: 4.16.7-100.fc26.armv7hl
b53_common: found switch: BCM53125, rev 4

I see all interfaces: lan1 to lan4 and wan. I get the following error messages:

# master and self, same results
bridge vlan add dev lan1 vid 101 pvid untagged self
RTNETLINK answers: Operation not supported
bridge vlan add dev lan2 vid 101 pvid untagged self
RTNETLINK answers: Operation not supported
bridge vlan add dev lan3 vid 101 pvid untagged self
RTNETLINK answers: Operation not supported
bridge vlan add dev lan4 vid 101 pvid untagged self
RTNETLINK answers: Operation not supported

# Not quite sure here regarding CPU interface and VLAN, because this changed with some patches, also from dsa.txt
bridge vlan add dev eth0 vid 101 self
RTNETLINK answers: Operation not supported

Planned network structure will be as with 4.7.x kernels:

br0 <=> eth0.101 <=> eth0 (vlan 101 tagged) <=> lan1-lan4 (vlan 101 untagged pvid)
br1 <=> eth0.102 <=> eth0 (vlan 102 tagged) <=> wan (vlan 102 untagged pvid)

I think the rest of the config is clear after some research now, but I can provide details once that part works. If necessary I can provide full commands & logs and further details.

Any ideas? Thank you.

Ciao,
Gerhard
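One guess at the "Operation not supported" errors above (an assumption on my side, not a confirmed diagnosis): per-port `bridge vlan add` calls generally only succeed once the port is enslaved to a bridge that has VLAN filtering switched on. A minimal ordering sketch (names br0/lan1 assumed):

```shell
ip link add name br0 type bridge
# Turn on VLAN filtering before adding port VLANs (newer iproute2 syntax);
# alternatively: echo 1 > /sys/class/net/br0/bridge/vlan_filtering
ip link set dev br0 type bridge vlan_filtering 1
ip link set dev lan1 master br0
bridge vlan add dev lan1 vid 101 pvid untagged
```

If the same sequence still returns -EOPNOTSUPP, the rejection is coming from the switch driver rather than from the bridge layer, which matches the regression discussed later in this thread.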
Re: [Qemu-devel] kvm_intel fails to load on Conroe CPUs running Linux 4.12
On 15.09.2017 19:07, Paolo Bonzini wrote:
On 15/09/2017 16:43, Gerhard Wiesinger wrote:
On 27.08.2017 20:55, Paolo Bonzini wrote:
On 27 Aug 2017 4:48 PM, "Gerhard Wiesinger" <li...@wiesinger.com> wrote:
On 27.08.2017 14:03, Paolo Bonzini wrote:

We will revert the patch, but 4.13.0 will not have the fix. Expect it in later stable kernels (because vacations).

Thnx. Why will 4.13.0 NOT have the fix?

Because maintainers are on vacation! :-)

Hello Paolo,

Any update on this for 4.12 and 4.13 kernels? A late fix is better than a wrong fix.

Hope to get to it next week!

Hello Paolo,

Any update?

Thnx.

Ciao,
Gerhard
Re: [Qemu-devel] kvm_intel fails to load on Conroe CPUs running Linux 4.12
On 27.08.2017 14:03, Paolo Bonzini wrote:
On 27 Aug 2017 9:49 AM, "Gerhard Wiesinger" <li...@wiesinger.com> wrote:
On 17.08.2017 23:14, Gerhard Wiesinger wrote:
On 17.08.2017 22:58, Gerhard Wiesinger wrote:
On 07.08.2017 19:50, Paolo Bonzini wrote:

Not much to say, unfortunately. It's pretty much the same capabilities as a Prescott/Cedar Mill processor, except that it has MSR bitmaps. It also lacks FlexPriority compared to the Conroe I had checked. It's not great that even the revert patch doesn't apply cleanly: this is *not* necessarily a boring area of the hypervisor... Given the rarity of your machine I'm currently leaning towards _not_ reverting the change. I'll check another non-Xeon Core 2 tomorrow that is from December 2008 (IIRC). If that one also lacks vNMI, or if I get other reports, I suppose I will have to reconsider that.

Hello Paolo,

Can you please revert the patch. The CPU is a Core 2 Extreme QX6700: SL9UL (B3), running VERY stable with ECC RAM for years now.

https://ark.intel.com/products/28028/Intel-Core2-Extreme-Processor-QX6700-8M-Cache-2_66-GHz-1066-MHz-FSB?q=Core%202%20Extreme%20QX6700
https://en.wikipedia.org/wiki/List_of_Intel_Core_2_microprocessors

CPU details below. Thank you.

Hello Paolo,

Any update on this major issue?

We will revert the patch, but 4.13.0 will not have the fix. Expect it in later stable kernels (because vacations).

Thnx. Why will 4.13.0 NOT have the fix?

Thnx.

Ciao,
Gerhard
Re: [Qemu-devel] kvm_intel fails to load on Conroe CPUs running Linux 4.12
On 17.08.2017 23:14, Gerhard Wiesinger wrote:
> On 17.08.2017 22:58, Gerhard Wiesinger wrote:
> > On 07.08.2017 19:50, Paolo Bonzini wrote:
> > > Not much to say, unfortunately. It's pretty much the same capabilities as a Prescott/Cedar Mill processor, except that it has MSR bitmaps. It also lacks FlexPriority compared to the Conroe I had checked.
> > >
> > > It's not great that even the revert patch doesn't apply cleanly---this is *not* necessarily a boring area of the hypervisor...
> > >
> > > Given the rarity of your machine I'm currently leaning towards _not_ reverting the change. I'll check another non-Xeon Core 2 tomorrow that is from December 2008 (IIRC). If that one also lacks vNMI, or if I get other reports, I suppose I will have to reconsider that.
> Hello Paolo, Can you please revert the patch. CPU is a Core 2 Extreme QX6700: SL9UL (B3) running VERY stable with ECC RAM for years now.
> https://ark.intel.com/products/28028/Intel-Core2-Extreme-Processor-QX6700-8M-Cache-2_66-GHz-1066-MHz-FSB?q=Core%202%20Extreme%20QX6700
> https://en.wikipedia.org/wiki/List_of_Intel_Core_2_microprocessors
> CPU details below. Thank you. Ciao, Gerhard

Hello Paolo, Any update on this major issue? Thnx.

Ciao, Gerhard
Re: kvm_intel fails to load on Conroe CPUs running Linux 4.12
On 17.08.2017 22:58, Gerhard Wiesinger wrote:
> On 07.08.2017 19:50, Paolo Bonzini wrote:
> > Not much to say, unfortunately. It's pretty much the same capabilities as a Prescott/Cedar Mill processor, except that it has MSR bitmaps. It also lacks FlexPriority compared to the Conroe I had checked.
> >
> > It's not great that even the revert patch doesn't apply cleanly---this is *not* necessarily a boring area of the hypervisor...
> >
> > Given the rarity of your machine I'm currently leaning towards _not_ reverting the change. I'll check another non-Xeon Core 2 tomorrow that is from December 2008 (IIRC). If that one also lacks vNMI, or if I get other reports, I suppose I will have to reconsider that.

Hello Paolo,

Can you please revert the patch. CPU is a Core 2 Extreme QX6700: SL9UL (B3) running VERY stable with ECC RAM for years now.
https://ark.intel.com/products/28028/Intel-Core2-Extreme-Processor-QX6700-8M-Cache-2_66-GHz-1066-MHz-FSB?q=Core%202%20Extreme%20QX6700
https://en.wikipedia.org/wiki/List_of_Intel_Core_2_microprocessors
CPU details below. Thank you.
Ciao, Gerhard

cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 Quad CPU @ 2.66GHz
stepping        : 7
microcode       : 0x6a
cpu MHz         : 1596.000
cache size      : 4096 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm tpr_shadow dtherm
bugs            :
bogomips        : 5333.45
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

Script output:

Basic VMX Information
  Hex: 0x1a0407
  Revision 7
  VMCS size 1024
  VMCS restricted to 32 bit addresses no
  Dual-monitor support yes
  VMCS memory type 6
  INS/OUTS instruction information no
  IA32_VMX_TRUE_*_CTLS support no
pin-based controls
  External interrupt exiting yes
  NMI exiting yes
  Virtual NMIs no
  Activate VMX-preemption timer no
  Process posted interrupts no
primary processor-based controls
  Interrupt window exiting yes
  Use TSC offsetting yes
  HLT exiting yes
  INVLPG exiting yes
  MWAIT exiting yes
  RDPMC exiting yes
  RDTSC exiting yes
  CR3-load exiting forced
  CR3-store exiting forced
  CR8-load exiting yes
  CR8-store exiting yes
  Use TPR shadow yes
  NMI-window exiting no
  MOV-DR exiting yes
  Unconditional I/O exiting yes
  Use I/O bitmaps yes
  Monitor trap flag no
  Use MSR bitmaps yes
  MONITOR exiting yes
  PAUSE exiting yes
  Activate secondary control no
secondary processor-based controls
  Virtualize APIC accesses no
  Enable EPT no
  Descriptor-table exiting no
  Enable RDTSCP no
  Virtualize x2APIC mode no
  Enable VPID no
  WBINVD exiting no
  Unrestricted guest no
  APIC register emulation no
  Virtual interrupt delivery no
  PAUSE-loop exiting no
  RDRAND exiting no
  Enable INVPCID no
  Enable VM functions no
  VMCS shadowing no
  Enable ENCLS exiting no
  RDSEED exiting no
  Enable PML no
  EPT-violation #VE no
  Conceal non-root operation from PT no
  Enable XSAVES/XRSTORS no
  Mode-based execute control (XS/XU) no
  TSC scaling no
VM-Exit controls
  Save debug controls forced
  Host address-space size
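The decisive line in the capability dump above is "Virtual NMIs no", which comes from the pin-based VM-execution controls capability MSR. As a rough sketch of how such a report is derived (per the Intel SDM: IA32_VMX_PINBASED_CTLS is MSR 0x481, the low 32 bits are the allowed-0 settings and the high 32 bits the allowed-1 settings, so a control can only be enabled if its allowed-1 bit is set; the MSR value used below is hypothetical, constructed to match this CPU's report):

```python
def decode_pinbased_ctls(msr_value: int) -> dict:
    """Decode IA32_VMX_PINBASED_CTLS (MSR 0x481).

    Per the Intel SDM, the high 32 bits are the allowed-1 settings:
    a pin-based control can be set to 1 only if its allowed-1 bit is 1.
    """
    allowed_1 = msr_value >> 32
    bits = {
        0: "External interrupt exiting",
        3: "NMI exiting",
        5: "Virtual NMIs",
        6: "Activate VMX-preemption timer",
        7: "Process posted interrupts",
    }
    return {name: bool(allowed_1 & (1 << bit)) for bit, name in bits.items()}

# Hypothetical MSR value matching the report above: external-interrupt
# exiting and NMI exiting supported (bits 0 and 3 of allowed-1 set),
# but Virtual NMIs not available (bit 5 clear).
caps = decode_pinbased_ctls((0b0000_1001 << 32) | 0x16)
print(caps["Virtual NMIs"])  # False on this Conroe-era CPU
```

A missing allowed-1 bit here is exactly what a kvm_intel built to require vNMI trips over at module load.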
Re: Still OOM problems with 4.9er/4.10er kernels
On 23.03.2017 09:38, Mike Galbraith wrote:
> On Thu, 2017-03-23 at 08:16 +0100, Gerhard Wiesinger wrote:
> > On 21.03.2017 08:13, Mike Galbraith wrote:
> > > On Tue, 2017-03-21 at 06:59 +0100, Gerhard Wiesinger wrote:
> > > > Is this the correct information?
> > > Incomplete, but enough to reiterate cgroup_disable=memory suggestion.
> > How to collect complete information?
> If Michal wants specifics, I suspect he'll ask. I posted only to pass along a speck of information, and offer a test suggestion... twice.

Still OOM with cgroup_disable=memory, kernel 4.11.0-0.rc3.git0.2.fc26.x86_64. I set vm.min_free_kbytes = 10240 in these tests.

# Full config
grep "vm\." /etc/sysctl.d/*
/etc/sysctl.d/00-dirty_background_ratio.conf:vm.dirty_background_ratio = 3
/etc/sysctl.d/00-dirty_ratio.conf:vm.dirty_ratio = 15
/etc/sysctl.d/00-kernel-vm-min-free-kbyzes.conf:vm.min_free_kbytes = 10240
/etc/sysctl.d/00-overcommit_memory.conf:vm.overcommit_memory = 2
/etc/sysctl.d/00-overcommit_ratio.conf:vm.overcommit_ratio = 80
/etc/sysctl.d/00-swappiness.conf:vm.swappiness=10

[31880.623557] sa1: page allocation stalls for 10942ms, order:0, mode:0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null)
[31880.623623] sa1 cpuset=/ mems_allowed=0
[31880.623630] CPU: 1 PID: 17112 Comm: sa1 Not tainted 4.11.0-0.rc3.git0.2.fc26.x86_64 #1
[31880.623724] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3 04/01/2014
[31880.623819] Call Trace:
[31880.623893]  dump_stack+0x63/0x84
[31880.623971]  warn_alloc+0x10c/0x1b0
[31880.624046]  __alloc_pages_slowpath+0x93d/0xe60
[31880.624142]  ? get_page_from_freelist+0x122/0xbf0
[31880.624225]  ? unmap_region+0xf7/0x130
[31880.624312]  __alloc_pages_nodemask+0x290/0x2b0
[31880.624388]  alloc_pages_vma+0xa0/0x2b0
[31880.624463]  __handle_mm_fault+0x4d0/0x1160
[31880.624550]  handle_mm_fault+0xb3/0x250
[31880.624628]  __do_page_fault+0x23f/0x4c0
[31880.624701]  trace_do_page_fault+0x41/0x120
[31880.624781]  do_async_page_fault+0x51/0xa0
[31880.624866]  async_page_fault+0x28/0x30
[31880.624941] RIP: 0033:0x7f9218d4914f
[31880.625032] RSP: 002b:7ffe0d1376a8 EFLAGS: 00010206
[31880.625153] RAX: 7f9218d2a314 RBX: 7f9218f4e658 RCX: 7f9218d2a354
[31880.625235] RDX: 05ec RSI: RDI: 7f9218d2a314
[31880.625324] RBP: 7ffe0d137950 R08: 7f9218d2a900 R09: 00027000
[31880.625423] R10: 7ffe0d1376e0 R11: 7f9218d2a900 R12: 0003
[31880.625505] R13: 7ffe0d137a38 R14: fd01 R15: 0002
[31880.625688] Mem-Info:
[31880.625762] active_anon:36671 inactive_anon:36711 isolated_anon:88
 active_file:1399 inactive_file:1410 isolated_file:0
 unevictable:0 dirty:5 writeback:15 unstable:0
 slab_reclaimable:3099 slab_unreclaimable:3558
 mapped:2037 shmem:3 pagetables:3340 bounce:0
 free:2972 free_pcp:102 free_cma:0
[31880.627334] Node 0 active_anon:146684kB inactive_anon:146816kB active_file:5596kB inactive_file:5572kB unevictable:0kB isolated(anon):368kB isolated(file):0kB mapped:8044kB dirty:20kB writeback:136kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 12kB writeback_tmp:0kB unstable:0kB pages_scanned:82 all_unreclaimable? no
[31880.627606] Node 0 DMA free:1816kB min:440kB low:548kB high:656kB active_anon:5636kB inactive_anon:6844kB active_file:132kB inactive_file:148kB unevictable:0kB writepending:4kB present:15992kB managed:15908kB mlocked:0kB slab_reclaimable:284kB slab_unreclaimable:532kB kernel_stack:0kB pagetables:188kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[31880.627883] lowmem_reserve[]: 0 327 327 327 327
[31880.627959] Node 0 DMA32 free:10072kB min:9796kB low:12244kB high:14692kB active_anon:141048kB inactive_anon:14kB active_file:5432kB inactive_file:5444kB unevictable:0kB writepending:152kB present:376688kB managed:353760kB mlocked:0kB slab_reclaimable:12112kB slab_unreclaimable:13700kB kernel_stack:2464kB pagetables:13172kB bounce:0kB free_pcp:504kB local_pcp:272kB free_cma:0kB
[31880.628334] lowmem_reserve[]: 0 0 0 0 0
[31880.629882] Node 0 DMA: 33*4kB (UME) 24*8kB (UM) 26*16kB (UME) 4*32kB (UME) 5*64kB (UME) 1*128kB (E) 2*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1828kB
[31880.632255] Node 0 DMA32: 174*4kB (UMEH) 107*8kB (UMEH) 96*16kB (UMEH) 59*32kB (UME) 30*64kB (UMEH) 8*128kB (UEH) 8*256kB (UMEH) 1*512kB (E) 0*1024kB 0*2048kB 0*4096kB = 10480kB
[31880.634344] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[31880.634346] 7276 total pagecache pages
[31880.635277] 4367 pages in swap cache
[31880.636206] Swap cache stats: add 563, delete 5635551, find 6573228/8496821
[31880.637145] Free swap = 973736kB
[31880.638038] Total swap = 2064380kB
[31880.638988] 98170 pages RAM
[31880.640309] 0 pages HighMem/MovableOnly
[31880.641791] 5753 pages reserved
[31880.642908] 0 pages cma reserved
[31880.643978] 0 pages hwpoisoned
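The sysctl config above runs with vm.overcommit_memory = 2 (strict accounting), so commit-limit exhaustion, not just free pages, decides whether allocations may proceed. A rough sketch of the limit arithmetic, ignoring the hugepage and admin-reserve terms, using the numbers from the log above:

```python
def commit_limit_kb(ram_kb: int, swap_kb: int, overcommit_ratio: int) -> int:
    """Approximate CommitLimit (see /proc/meminfo) under
    vm.overcommit_memory=2: swap + ram * overcommit_ratio / 100.
    Hugepage and admin reserves are ignored in this sketch."""
    return swap_kb + ram_kb * overcommit_ratio // 100

# From the log: 98170 pages of RAM (4 KiB pages), 2064380 kB total swap,
# and vm.overcommit_ratio = 80 as configured above.
print(commit_limit_kb(98170 * 4, 2064380, 80))  # 2378524
```

With only ~390 MB of RAM, the limit here is dominated by the 2 GB of swap, which is consistent with the heavy swap traffic in the report.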
Re: Still OOM problems with 4.9er/4.10er kernels
On 21.03.2017 08:13, Mike Galbraith wrote:
> On Tue, 2017-03-21 at 06:59 +0100, Gerhard Wiesinger wrote:
> > Is this the correct information?
> Incomplete, but enough to reiterate cgroup_disable=memory suggestion.

How to collect complete information? Thnx.

Ciao, Gerhard
Re: Still OOM problems with 4.9er/4.10er kernels
On 20.03.2017 04:05, Mike Galbraith wrote:
> On Sun, 2017-03-19 at 17:02 +0100, Gerhard Wiesinger wrote:
> > mount | grep cgroup
> Just because controllers are mounted doesn't mean they're populated. To check that, you want to look for directories under the mount points with a non-empty 'tasks'. You will find some, but memory cgroup assignments would likely be most interesting for this thread. You can eliminate any diddling there by booting with cgroup_disable=memory.

Is this the correct information?

mount | grep "type cgroup" | cut -f 3 -d ' ' | while read LINE; do echo "";echo ${LINE};ls -l ${LINE}; done

/sys/fs/cgroup/systemd
total 0
-rw-r--r--  1 root root 0 Mar 20 14:31 cgroup.clone_children
-rw-r--r--  1 root root 0 Mar 20 14:31 cgroup.procs
-r--r--r--  1 root root 0 Mar 20 14:31 cgroup.sane_behavior
drwxr-xr-x  2 root root 0 Mar 20 14:31 init.scope
-rw-r--r--  1 root root 0 Mar 20 14:31 notify_on_release
-rw-r--r--  1 root root 0 Mar 20 14:31 release_agent
drwxr-xr-x 60 root root 0 Mar 21 06:50 system.slice
-rw-r--r--  1 root root 0 Mar 20 14:31 tasks
drwxr-xr-x  4 root root 0 Mar 21 06:55 user.slice

/sys/fs/cgroup/net_cls,net_prio
total 0
-rw-r--r--  1 root root 0 Mar 20 14:31 cgroup.clone_children
-rw-r--r--  1 root root 0 Mar 20 14:31 cgroup.procs
-r--r--r--  1 root root 0 Mar 20 14:31 cgroup.sane_behavior
-rw-r--r--  1 root root 0 Mar 20 14:31 net_cls.classid
-rw-r--r--  1 root root 0 Mar 20 14:31 net_prio.ifpriomap
-r--r--r--  1 root root 0 Mar 20 14:31 net_prio.prioidx
-rw-r--r--  1 root root 0 Mar 20 14:31 notify_on_release
-rw-r--r--  1 root root 0 Mar 20 14:31 release_agent
-rw-r--r--  1 root root 0 Mar 20 14:31 tasks

/sys/fs/cgroup/cpu,cpuacct
total 0
-rw-r--r--  1 root root 0 Mar 20 14:31 cgroup.clone_children
-rw-r--r--  1 root root 0 Mar 20 14:31 cgroup.procs
-r--r--r--  1 root root 0 Mar 20 14:31 cgroup.sane_behavior
-r--r--r--  1 root root 0 Mar 20 14:31 cpuacct.stat
-rw-r--r--  1 root root 0 Mar 20 14:31 cpuacct.usage
-r--r--r--  1 root root 0 Mar 20 14:31 cpuacct.usage_all
-r--r--r--  1 root root 0 Mar 20 14:31 cpuacct.usage_percpu
-r--r--r--  1 root root 0 Mar 20 14:31 cpuacct.usage_percpu_sys
-r--r--r--  1 root root 0 Mar 20 14:31 cpuacct.usage_percpu_user
-r--r--r--  1 root root 0 Mar 20 14:31 cpuacct.usage_sys
-r--r--r--  1 root root 0 Mar 20 14:31 cpuacct.usage_user
-rw-r--r--  1 root root 0 Mar 20 14:31 cpu.cfs_period_us
-rw-r--r--  1 root root 0 Mar 20 14:31 cpu.cfs_quota_us
-rw-r--r--  1 root root 0 Mar 20 14:31 cpu.shares
-r--r--r--  1 root root 0 Mar 20 14:31 cpu.stat
drwxr-xr-x  2 root root 0 Mar 20 14:31 init.scope
-rw-r--r--  1 root root 0 Mar 20 14:31 notify_on_release
-rw-r--r--  1 root root 0 Mar 20 14:31 release_agent
drwxr-xr-x  2 root root 0 Mar 20 14:31 system.slice
-rw-r--r--  1 root root 0 Mar 20 14:31 tasks
drwxr-xr-x  4 root root 0 Mar 21 06:55 user.slice

/sys/fs/cgroup/devices
total 0
-rw-r--r--  1 root root 0 Mar 20 14:31 cgroup.clone_children
-rw-r--r--  1 root root 0 Mar 20 14:31 cgroup.procs
-r--r--r--  1 root root 0 Mar 20 14:31 cgroup.sane_behavior
--w-------  1 root root 0 Mar 20 14:31 devices.allow
--w-------  1 root root 0 Mar 20 14:31 devices.deny
-r--r--r--  1 root root 0 Mar 20 14:31 devices.list
drwxr-xr-x  2 root root 0 Mar 20 14:31 init.scope
-rw-r--r--  1 root root 0 Mar 20 14:31 notify_on_release
-rw-r--r--  1 root root 0 Mar 20 14:31 release_agent
drwxr-xr-x 60 root root 0 Mar 21 06:50 system.slice
-rw-r--r--  1 root root 0 Mar 20 14:31 tasks
drwxr-xr-x  4 root root 0 Mar 21 06:55 user.slice

/sys/fs/cgroup/freezer
total 0
-rw-r--r--  1 root root 0 Mar 20 14:31 cgroup.clone_children
-rw-r--r--  1 root root 0 Mar 20 14:31 cgroup.procs
-r--r--r--  1 root root 0 Mar 20 14:31 cgroup.sane_behavior
-rw-r--r--  1 root root 0 Mar 20 14:31 notify_on_release
-rw-r--r--  1 root root 0 Mar 20 14:31 release_agent
-rw-r--r--  1 root root 0 Mar 20 14:31 tasks
===
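Mike's "non-empty 'tasks'" check can be scripted directly. A minimal sketch (`populated_cgroups` is a name made up here; note the `ls -l` output above shows why `stat` size cannot be used — cgroupfs control files always report size 0, so each tasks file has to actually be read):

```python
import os

def populated_cgroups(root: str = "/sys/fs/cgroup") -> list:
    """Return every directory under `root` whose 'tasks' file lists at
    least one PID, i.e. every cgroup that is actually populated."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        if "tasks" in files:
            try:
                # cgroupfs files stat as size 0 even when populated,
                # so read the file instead of checking its size.
                with open(os.path.join(dirpath, "tasks")) as f:
                    if f.read().strip():
                        hits.append(dirpath)
            except OSError:
                pass  # e.g. permission denied on some controllers
    return hits

if __name__ == "__main__":
    for d in populated_cgroups():
        print(d)
```

On this system the interesting question would be whether anything under the memory controller's mount point shows up in the output.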
Re: Still OOM problems with 4.9er/4.10er kernels
On 19.03.2017 16:18, Michal Hocko wrote: On Fri 17-03-17 21:08:31, Gerhard Wiesinger wrote: On 17.03.2017 18:13, Michal Hocko wrote: On Fri 17-03-17 17:37:48, Gerhard Wiesinger wrote: [...] Why does the kernel prefer to swapin/out and not use a.) the free memory? It will use all the free memory up to min watermark which is set up based on min_free_kbytes. Makes sense, how is /proc/sys/vm/min_free_kbytes default value calculated? See init_per_zone_wmark_min b.) the buffer/cache? the memory reclaim is strongly biased towards page cache and we try to avoid swapout as much as possible (see get_scan_count). If I understand it correctly, swapping is preferred over dropping the cache, right. Can this behaviour be changed to prefer dropping the cache to some minimum amount? Is this also configurable in a way? No, we enforce swapping if the amount of free + file pages are below the cumulative high watermark. (As far as I remember e.g. kernel 2.4 dropped the caches well). There is ~100M memory available but kernel swaps all the time ... Any ideas? Kernel: 4.9.14-200.fc25.x86_64 top - 17:33:43 up 28 min, 3 users, load average: 3.58, 1.67, 0.89 Tasks: 145 total, 4 running, 141 sleeping, 0 stopped, 0 zombie %Cpu(s): 19.1 us, 56.2 sy, 0.0 ni, 4.3 id, 13.4 wa, 2.0 hi, 0.3 si, 4.7 st KiB Mem : 230076 total,61508 free, 123472 used,45096 buff/cache procs ---memory-- ---swap-- -io -system-- --cpu- r b swpd free buff cache si sobibo in cs us sy id wa st 3 5 303916 60372328 43864 27828 200 41420 236 6984 11138 11 47 6 23 14 I am really surprised to see any reclaim at all. 26% of free memory doesn't sound as if we should do a reclaim at all. Do you have an unusual configuration of /proc/sys/vm/min_free_kbytes ? Or is there anything running inside a memory cgroup with a small limit? nothing special set regarding /proc/sys/vm/min_free_kbytes (default values), detailed config below. Regarding cgroups, none of I know. 
How to check (I guess nothing is set because cg* commands are not available)? be careful because systemd started to use some controllers. You can easily check cgroup mount points. See below. /proc/sys/vm/min_free_kbytes 45056 So at least 45M will be kept reserved for the system. Your data indicated you had more memory. How does /proc/zoneinfo look like? Btw. you seem to be using fc kernel, are there any patches applied on top of Linus tree? Could you try to retest vanilla kernel? System looks normally now, FYI (e.g. now permanent swapping) free totalusedfree shared buff/cache available Mem: 349076 154112 41560 184 153404 148716 Swap: 2064380 831844 1232536 cat /proc/zoneinfo Node 0, zone DMA per-node stats nr_inactive_anon 9543 nr_active_anon 22105 nr_inactive_file 9877 nr_active_file 13416 nr_unevictable 0 nr_isolated_anon 0 nr_isolated_file 0 nr_pages_scanned 0 workingset_refault 1926013 workingset_activate 707166 workingset_nodereclaim 187276 nr_anon_pages 11429 nr_mapped6852 nr_file_pages 46772 nr_dirty 1 nr_writeback 0 nr_writeback_temp 0 nr_shmem 46 nr_shmem_hugepages 0 nr_shmem_pmdmapped 0 nr_anon_transparent_hugepages 0 nr_unstable 0 nr_vmscan_write 3319047 nr_vmscan_immediate_reclaim 32363 nr_dirtied 222115 nr_written 3537529 pages free 3110 min 27 low 33 high 39 node_scanned 0 spanned 4095 present 3998 managed 3977 nr_free_pages 3110 nr_zone_inactive_anon 18 nr_zone_active_anon 3 nr_zone_inactive_file 51 nr_zone_active_file 75 nr_zone_unevictable 0 nr_zone_write_pending 0 nr_mlock 0 nr_slab_reclaimable 214 nr_slab_unreclaimable 289 nr_page_table_pages 185 nr_kernel_stack 16 nr_bounce0 nr_zspages 0 numa_hit 1214071 numa_miss0 numa_foreign 0 numa_interleave 0 numa_local 1214071 numa_other 0 nr_free_cma 0 protection: (0, 306, 306, 306, 306) pagesets cpu: 0 count: 0 high: 0 batch: 1 vm stats threshold: 4 cpu: 1 count: 0 high: 0 batch: 1 vm stats threshold: 4 node_unreclaimable: 0 start_pfn: 1 node_inactive_ratio: 0 Node 0, zoneDMA32 pages free 7921 min 546 
low 682 high 818 node_scanned 0 spanned 94172 present 94172 managed 83292 nr_free_pages 7921 nr_zone_inactive_anon 9525 nr_zone_active_anon 22102 nr_zone_inactive_file 9826 nr_zone_active_file 13341
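For reference, the per-zone watermarks quoted above follow directly from min_free_kbytes. In 4.6+ kernels, init_per_zone_wmark_min() derives the default as sqrt(lowmem_kbytes * 16) clamped to [128, 65536], distributes min across zones in proportion to managed pages, and computes low/high from watermark_scale_factor. A rough sketch of that arithmetic (illustrative, not the kernel source; zone numbers taken from the zoneinfo above):

```shell
# low = min + tmp, high = min + 2*tmp,
# where tmp = max(min/4, managed_pages * watermark_scale_factor / 10000).
# DMA32 zone above: min=546 pages, managed=83292 pages, wsf=10 (default).
awk -v min=546 -v managed=83292 -v wsf=10 'BEGIN {
    tmp = int(min / 4)                 # 136
    alt = int(managed * wsf / 10000)   # 83
    if (alt > tmp) tmp = alt
    printf "min=%d low=%d high=%d\n", min, min + tmp, min + 2 * tmp
}'
# Prints: min=546 low=682 high=818 -- matching the zoneinfo
# (the DMA zone works out the same way: min=27 -> low=33, high=39).

# Default min_free_kbytes: sqrt(lowmem_kbytes * 16), clamped to [128, 65536].
awk -v lowmem_kb=349076 'BEGIN {
    v = int(sqrt(lowmem_kb * 16))
    if (v < 128) v = 128; if (v > 65536) v = 65536
    print v " kB"
}'
```

Note that for ~349 MB of lowmem the formula yields only about 2.3 MB, far below the 45056 kB reported in the thread, which is presumably why a 45 MB reserve drew attention.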
Re: Still OOM problems with 4.9er/4.10er kernels
On 17.03.2017 21:08, Gerhard Wiesinger wrote: On 17.03.2017 18:13, Michal Hocko wrote: On Fri 17-03-17 17:37:48, Gerhard Wiesinger wrote: [...] 4.11.0-0.rc2.git4.1.fc27.x86_64 There are also lockups after some runtime, hours to 1 day:
Message from syslogd@myserver at Mar 19 08:22:33 ... kernel:BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 18717s!
Message from syslogd@myserver at Mar 19 08:22:33 ... kernel:BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 18078s!
repeated a lot of times. Ciao, Gerhard
Re: Still OOM problems with 4.9er/4.10er kernels
On 17.03.2017 18:13, Michal Hocko wrote: On Fri 17-03-17 17:37:48, Gerhard Wiesinger wrote: [...] Why does the kernel prefer to swap in/out and not use a.) the free memory? It will use all the free memory up to the min watermark, which is set up based on min_free_kbytes. Makes sense; how is the /proc/sys/vm/min_free_kbytes default value calculated? b.) the buffer/cache? The memory reclaim is strongly biased towards page cache and we try to avoid swapout as much as possible (see get_scan_count). If I understand it correctly, swapping is preferred over dropping the cache, right? Can this behaviour be changed to prefer dropping the cache down to some minimum amount? Is this also configurable in a way? (As far as I remember, e.g. kernel 2.4 dropped the caches well.) There is ~100M memory available but the kernel swaps all the time ... Any ideas? Kernel: 4.9.14-200.fc25.x86_64
top - 17:33:43 up 28 min, 3 users, load average: 3.58, 1.67, 0.89
Tasks: 145 total, 4 running, 141 sleeping, 0 stopped, 0 zombie
%Cpu(s): 19.1 us, 56.2 sy, 0.0 ni, 4.3 id, 13.4 wa, 2.0 hi, 0.3 si, 4.7 st
KiB Mem : 230076 total, 61508 free, 123472 used, 45096 buff/cache
procs -----------memory---------- ---swap-- -----io---- --system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa st
 3  5 303916  60372    328  43864 27828  200 41420   236 6984 11138 11 47  6 23 14
I am really surprised to see any reclaim at all. 26% of free memory doesn't sound as if we should do a reclaim at all. Do you have an unusual configuration of /proc/sys/vm/min_free_kbytes? Or is there anything running inside a memory cgroup with a small limit? Nothing special set regarding /proc/sys/vm/min_free_kbytes (default values), detailed config below. Regarding cgroups, none that I know of. How to check (I guess nothing is set, because the cg* commands are not available)?
cat /etc/sysctl.d/* | grep "^vm" vm.dirty_background_ratio = 3 vm.dirty_ratio = 15 vm.overcommit_memory = 2 vm.overcommit_ratio = 80 vm.swappiness=10 find /proc/sys/vm -type f -exec echo {} \; -exec cat {} \; /proc/sys/vm/admin_reserve_kbytes 8192 /proc/sys/vm/block_dump 0 /proc/sys/vm/compact_memory cat: /proc/sys/vm/compact_memory: Permission denied /proc/sys/vm/compact_unevictable_allowed 1 /proc/sys/vm/dirty_background_bytes 0 /proc/sys/vm/dirty_background_ratio 3 /proc/sys/vm/dirty_bytes 0 /proc/sys/vm/dirty_expire_centisecs 3000 /proc/sys/vm/dirty_ratio 15 /proc/sys/vm/dirty_writeback_centisecs 500 /proc/sys/vm/dirtytime_expire_seconds 43200 /proc/sys/vm/drop_caches 0 /proc/sys/vm/extfrag_threshold 500 /proc/sys/vm/hugepages_treat_as_movable 0 /proc/sys/vm/hugetlb_shm_group 0 /proc/sys/vm/laptop_mode 0 /proc/sys/vm/legacy_va_layout 0 /proc/sys/vm/lowmem_reserve_ratio 256 256 32 1 /proc/sys/vm/max_map_count 65530 /proc/sys/vm/memory_failure_early_kill 0 /proc/sys/vm/memory_failure_recovery 1 /proc/sys/vm/min_free_kbytes 45056 /proc/sys/vm/min_slab_ratio 5 /proc/sys/vm/min_unmapped_ratio 1 /proc/sys/vm/mmap_min_addr 65536 /proc/sys/vm/mmap_rnd_bits 28 /proc/sys/vm/mmap_rnd_compat_bits 8 /proc/sys/vm/nr_hugepages 0 /proc/sys/vm/nr_hugepages_mempolicy 0 /proc/sys/vm/nr_overcommit_hugepages 0 /proc/sys/vm/nr_pdflush_threads 0 /proc/sys/vm/numa_zonelist_order default /proc/sys/vm/oom_dump_tasks 1 /proc/sys/vm/oom_kill_allocating_task 0 /proc/sys/vm/overcommit_kbytes 0 /proc/sys/vm/overcommit_memory 2 /proc/sys/vm/overcommit_ratio 80 /proc/sys/vm/page-cluster 3 /proc/sys/vm/panic_on_oom 0 /proc/sys/vm/percpu_pagelist_fraction 0 /proc/sys/vm/stat_interval 1 /proc/sys/vm/stat_refresh /proc/sys/vm/swappiness 10 /proc/sys/vm/user_reserve_kbytes 31036 /proc/sys/vm/vfs_cache_pressure 100 /proc/sys/vm/watermark_scale_factor 10 /proc/sys/vm/zone_reclaim_mode 0 Thnx. Ciao, Gerhard
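To answer the "how to check without cg* tools" question above: the cgroup hierarchy is always visible through /proc and sysfs, so no extra packages are needed. A minimal check might look like the following (the /sys/fs/cgroup/memory path assumes a v1 hierarchy as mounted by systemd on Fedora of that era):

```shell
# Which cgroup controllers are mounted, and where?
grep cgroup /proc/mounts

# Which cgroups is a given process (here: this shell) a member of?
cat /proc/self/cgroup

# Any memory cgroup with a limit actually set? (v1 path assumed; the
# huge number is the "unlimited" default on 64-bit kernels)
find /sys/fs/cgroup/memory -name memory.limit_in_bytes 2>/dev/null \
    | xargs grep -H . 2>/dev/null | grep -v 9223372036854771712
```

If the last command prints nothing, no memory cgroup carries a small limit, which would rule out Michal's cgroup theory.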
Re: Still OOM problems with 4.9er/4.10er kernels
On 16.03.2017 10:39, Michal Hocko wrote: On Thu 16-03-17 02:23:18, l...@pengaru.com wrote: On Thu, Mar 16, 2017 at 10:08:44AM +0100, Michal Hocko wrote: On Thu 16-03-17 01:47:33, l...@pengaru.com wrote: [...] While on the topic of understanding allocation stalls, Philip Freeman recently mailed linux-kernel with a similar report, and in his case there are plenty of page cache pages. It was also a GFP_HIGHUSER_MOVABLE 0-order allocation. care to point me to the report? http://lkml.iu.edu/hypermail/linux/kernel/1703.1/06360.html Thanks. It is gone from my lkml mailbox. Could you CC me (and linux-mm) please? I'm no MM expert, but it appears a bit broken for such a low-order allocation to stall on the order of 10 seconds when there's plenty of reclaimable pages, in addition to mostly unused and abundant swap space on SSD. yes this might indeed signal a problem. Well maybe I missed something obvious that a better informed eye will catch. Nothing really obvious. There is indeed a lot of anonymous memory to swap out. Almost no pages on file LRU lists (active_file:759 inactive_file:749) but 158783 total pagecache pages so we have to have a lot of pages in the swap cache. I would probably have to see more data to make a full picture. Why does the kernel prefer to swapin/out and not use a.) the free memory? b.) the buffer/cache? There is ~100M memory available but kernel swaps all the time ... Any ideas? 
Kernel: 4.9.14-200.fc25.x86_64
top - 17:33:43 up 28 min, 3 users, load average: 3.58, 1.67, 0.89
Tasks: 145 total, 4 running, 141 sleeping, 0 stopped, 0 zombie
%Cpu(s): 19.1 us, 56.2 sy, 0.0 ni, 4.3 id, 13.4 wa, 2.0 hi, 0.3 si, 4.7 st
KiB Mem : 230076 total, 61508 free, 123472 used, 45096 buff/cache
procs -----------memory---------- ---swap-- -----io---- --system-- ------cpu-----
 r  b   swpd   free   buff  cache    si    so    bi    bo    in    cs us sy id wa st
 3  5 303916  60372    328  43864 27828   200 41420   236  6984 11138 11 47  6 23 14
 5  4 292852  52904    756  58584 19600   448 48780   540  8088 10528 18 61  1  7 13
 3  3 288792  49052   1152  65924  4856   576  9824  1100  4324  5720  7 18  2 64  8
 2  2 283676  54160    716  67604  6332   344 31740   964  3879  5055 12 34 10 37  7
 3  3 286852  66712    216  53136 28064  4832 56532  4920  9175 12625 10 55 12 14 10
 2  0 299680  62428    196  53316 36312 13164 54728 13212 16820 25283  7 56 18 12  7
 1  1 300756  63220    624  58160 17944  1260 24528  1304  5804  9302  3 22 38 34  3
Thnx. Ciao, Gerhard
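One factor worth keeping in mind for these machines: the sysctl settings quoted in this thread (vm.overcommit_memory = 2, vm.overcommit_ratio = 80) mean allocations are capped by CommitLimit rather than by physical memory alone. Roughly, and ignoring the hugepage pool:

```shell
# CommitLimit (kB) = MemTotal * overcommit_ratio / 100 + SwapTotal
# Numbers here are the ~361 MB RAM / 2 GB swap VM quoted later in
# the thread; substitute your own from /proc/meminfo.
awk -v mem_kb=369700 -v swap_kb=2064380 -v ratio=80 'BEGIN {
    printf "CommitLimit = %d kB\n", int(mem_kb * ratio / 100) + swap_kb
}'
# Prints: CommitLimit = 2360140 kB
```

Comparing this against Committed_AS in /proc/meminfo shows how close such a VM runs to its commit cap, independently of the reclaim behaviour being discussed.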
Re: Still OOM problems with 4.9er/4.10er kernels
On 02.03.2017 08:17, Minchan Kim wrote: Hi Michal, On Tue, Feb 28, 2017 at 09:12:24AM +0100, Michal Hocko wrote: On Tue 28-02-17 14:17:23, Minchan Kim wrote: On Mon, Feb 27, 2017 at 10:44:49AM +0100, Michal Hocko wrote: On Mon 27-02-17 18:02:36, Minchan Kim wrote: [...]
From 9779a1c5d32e2edb64da5cdfcd6f9737b94a247a Mon Sep 17 00:00:00 2001
From: Minchan Kim
Date: Mon, 27 Feb 2017 17:39:06 +0900
Subject: [PATCH] mm: use up highatomic before OOM kill
Not-Yet-Signed-off-by: Minchan Kim
---
 mm/page_alloc.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 614cd0397ce3..e073cca4969e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3549,16 +3549,6 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
 		*no_progress_loops = 0;
 	else
 		(*no_progress_loops)++;
-
-	/*
-	 * Make sure we converge to OOM if we cannot make any progress
-	 * several times in the row.
-	 */
-	if (*no_progress_loops > MAX_RECLAIM_RETRIES) {
-		/* Before OOM, exhaust highatomic_reserve */
-		return unreserve_highatomic_pageblock(ac, true);
-	}
-
 	/*
 	 * Keep reclaiming pages while there is a chance this will lead
 	 * somewhere. If none of the target zones can satisfy our allocation
@@ -3821,6 +3811,10 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (read_mems_allowed_retry(cpuset_mems_cookie))
 		goto retry_cpuset;
 
+	/* Before OOM, exhaust highatomic_reserve */
+	if (unreserve_highatomic_pageblock(ac, true))
+		goto retry;
+
OK, this can help for higher-order requests when we do not exhaust all the retries and fail on compaction, but I fail to see how this can help for order-0 requests, which was what happened in this case. I am not saying this is wrong, though. should_reclaim_retry can return false although no_progress_loop is less than MAX_RECLAIM_RETRIES, unless the eligible zones have enough reclaimable pages by the progress_loop. Yes, sorry, I should have been more clear.
I was talking about this particular case where we had a lot of reclaimable pages (a lot of anonymous with the swap available). This report shows two problems: why we see OOM with 1) enough *free* pages and 2) enough *freeable* pages. I just pointed out 1) and sent the patch to solve it. About 2), one of my imaginary scenarios is that the inactive anon list is full of pinned pages, so the VM can unmap them successfully in shrink_page_list but fails to free them due to an increased page refcount. In that case, the page will be added to the inactive anonymous LRU list again without activating, so inactive_list_is_low on the anonymous LRU is always false. IOW, there is no deactivation from the active list. It's just my picture, without any clue. ;-) With the latest kernels (4.11.0-0.rc2.git0.2.fc26.x86_64) I'm having the issue that swapping is active all the time after some runtime (~1 day).
top - 07:30:17 up 1 day, 19:42, 1 user, load average: 13.71, 16.98, 15.36
Tasks: 130 total, 2 running, 128 sleeping, 0 stopped, 0 zombie
%Cpu(s): 15.8 us, 33.5 sy, 0.0 ni, 3.9 id, 34.5 wa, 4.9 hi, 1.0 si, 6.4 st
KiB Mem : 369700 total, 5484 free, 311556 used, 52660 buff/cache
KiB Swap: 2064380 total, 1187684 free, 876696 used.
20340 avail Mem
[root@smtp ~]# vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 3  1 876280   7132  16536  64840  238  226  1027   258   80   97  2  3 83 11  1
 0  4 876140   3812  10520  64552 3676  168 11840  1100 2255 2582  7 13  8 70  3
 0  3 875372   3628   4024  56160 5424   64 10004   476 2157 2580  2 14  0 83  2
 0  4 875560  24056   2208  56296 9032 2180 39928  2388 4111 4549 10 32  0 55  3
 2  2 875660   7540   5256  58220 5536 1604 48756  1864 4505 4196 12 23  5 58  3
 0  3 875264   3664   2120  57596 2304  116 17904   560 2223 1825 15 15  0 67  3
 0  2 875564   3800    588  57856 1340 1068 14780  1184 1390 1364 12 10  0 77  3
 1  2 875724   3740    372  53988 3104  928 16884  1068 1560 1527  3 12  0 83  3
 0  3 881096   3708    532  52220 4604 5872 21004  6104 2752 2259  7 18  5 67  2
The following commit is included in that version:
commit 710531320af876192d76b2c1f68190a1df941b02
Author: Michal Hocko
Date: Wed Feb 22 15:45:58 2017 -0800
    mm, vmscan: cleanup lru size claculations
    commit fd538803731e50367b7c59ce4ad3454426a3d671 upstream.
But still OOMs:
[157048.030760] clamscan: page allocation stalls for 19405ms, order:0, mode:0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null)
[157048.031985] clamscan cpuset=/ mems_allowed=0
[157048.031993] CPU: 1 PID: 9597 Comm: clamscan Not tainted 4.11.0-0.rc2.git0.2.fc26.x86_64 #1
[157048.033197] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Re: Still OOM problems with 4.9er/4.10er kernels
On 27.02.2017 09:27, Michal Hocko wrote: On Sun 26-02-17 09:40:42, Gerhard Wiesinger wrote: On 04.01.2017 10:11, Michal Hocko wrote: The VM stops working (e.g. not pingable) after around 8h (it will be restarted automatically); this happened several times. I also had further OOMs which I sent to Minchan. Could you post them to the mailing list as well, please? Still OOMs on the dnf update procedure with kernel 4.10: 4.10.0-1.fc26.x86_64, as well as on 4.9.9-200.fc25.x86_64. On 4.10er kernels: [...] kernel: Node 0 DMA32 free:5012kB min:2264kB low:2828kB high:3392kB active_anon:143580kB inactive_anon:143300kB active_file:2576kB inactive_file:2560kB unevictable:0kB writepending:0kB present:376688kB managed:353968kB mlocked:0kB slab_reclaimable:13708kB slab_unreclaimable:18064kB kernel_stack:2352kB pagetables:12888kB bounce:0kB free_pcp:412kB local_pcp:88kB free_cma:0kB [...] On 4.9er kernels: [...] kernel: Node 0 DMA32 free:3356kB min:2668kB low:3332kB high:3996kB active_anon:122148kB inactive_anon:112068kB active_file:81324kB inactive_file:101972kB unevictable:0kB writepending:4648kB present:507760kB managed:484384kB mlocked:0kB slab_reclaimable:17660kB slab_unreclaimable:21404kB kernel_stack:2432kB pagetables:10124kB bounce:0kB free_pcp:120kB local_pcp:0kB free_cma:0kB In both cases the amount of free memory is above the min watermark, so we shouldn't be hitting the OOM. We might have somebody freeing memory after the last attempt, though... [...] Should be very easy to reproduce with a low-mem VM (e.g. 192MB) under KVM with ext4 and Fedora 25, some memory load, and updating the VM. Any further progress? The linux-next (resp. mmotm tree) has new tracepoints which should help to tell us more about what is going on here. Could you try to enable oom/reclaim_retry_zone and vmscan/mm_vmscan_direct_reclaim_{begin,end}? Is this available in this version? https://koji.fedoraproject.org/koji/buildinfo?buildID=862775 kernel-4.11.0-0.rc0.git5.1.fc26 How to enable? Thnx. Ciao, Gerhard
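As for "How to enable?": tracepoints like these are toggled at runtime through tracefs; no rebuild is needed as long as the events exist in the running kernel. A sketch (needs root; the path assumes debugfs-mounted tracing as on Fedora of that era, and the oom/reclaim_retry_zone event only exists on kernels carrying the mmotm patches Michal mentions):

```shell
# tracefs is usually at /sys/kernel/debug/tracing
# (newer kernels also mount it at /sys/kernel/tracing)
cd /sys/kernel/debug/tracing

# Check whether the events exist in this kernel before enabling them
ls events/oom events/vmscan 2>/dev/null | grep -E 'reclaim_retry_zone|direct_reclaim'

echo 1 > events/vmscan/mm_vmscan_direct_reclaim_begin/enable
echo 1 > events/vmscan/mm_vmscan_direct_reclaim_end/enable
echo 1 > events/oom/reclaim_retry_zone/enable   # only on patched/4.11+ kernels

cat trace_pipe   # stream the events as they fire
```

The same events can be enabled persistently at boot via the trace_event= kernel command-line parameter.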
Re: Still OOM problems with 4.9er/4.10er kernels
On 04.01.2017 10:11, Michal Hocko wrote: The VM stops working (e.g. not pingable) after around 8h (will be restarted automatically), happened serveral times. Had also further OOMs which I sent to Mincham. Could you post them to the mailing list as well, please? Still OOMs on dnf update procedure with kernel 4.10: 4.10.0-1.fc26.x86_64 as well on 4.9.9-200.fc25.x86_64 On 4.10er kernels: Free swap = 1137532kB cat /etc/sysctl.d/* | grep ^vm vm.dirty_background_ratio = 3 vm.dirty_ratio = 15 vm.overcommit_memory = 2 vm.overcommit_ratio = 80 vm.swappiness=10 kernel: python invoked oom-killer: gfp_mask=0x14201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), nodemask=0, order=0, oom_score_adj=0 kernel: python cpuset=/ mems_allowed=0 kernel: CPU: 1 PID: 813 Comm: python Not tainted 4.10.0-1.fc26.x86_64 #1 kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3 04/01/2014 kernel: Call Trace: kernel: dump_stack+0x63/0x84 kernel: dump_header+0x7b/0x1f6 kernel: ? do_try_to_free_pages+0x2c5/0x340 kernel: oom_kill_process+0x202/0x3d0 kernel: out_of_memory+0x2b7/0x4e0 kernel: __alloc_pages_slowpath+0x915/0xb80 kernel: __alloc_pages_nodemask+0x218/0x2d0 kernel: alloc_pages_current+0x93/0x150 kernel: __page_cache_alloc+0xcf/0x100 kernel: filemap_fault+0x39d/0x800 kernel: ? page_add_file_rmap+0xe5/0x200 kernel: ? filemap_map_pages+0x2e1/0x4e0 kernel: ext4_filemap_fault+0x36/0x50 kernel: __do_fault+0x21/0x110 kernel: handle_mm_fault+0xdd1/0x1410 kernel: ? 
swake_up+0x42/0x50 kernel: __do_page_fault+0x23f/0x4c0 kernel: trace_do_page_fault+0x41/0x120 kernel: do_async_page_fault+0x51/0xa0 kernel: async_page_fault+0x28/0x30 kernel: RIP: 0033:0x7f0681ad6350 kernel: RSP: 002b:7ffcbdd238d8 EFLAGS: 00010246 kernel: RAX: 7f0681b0f960 RBX: RCX: 7fff kernel: RDX: RSI: 3ff0 RDI: 3ff0 kernel: RBP: 7f067461ab40 R08: R09: 3ff0 kernel: R10: 556f1c6d8a80 R11: 0001 R12: 7f0676d1a8d0 kernel: R13: R14: 7f06746168bc R15: 7f0674385910 kernel: Mem-Info: kernel: active_anon:37423 inactive_anon:37512 isolated_anon:0 active_file:462 inactive_file:603 isolated_file:0 unevictable:0 dirty:0 writeback:0 unstable:0 slab_reclaimable:3538 slab_unreclaimable:4818 mapped:859 shmem:9 pagetables:3370 bounce:0 free:1650 free_pcp:103 free_cma:0 kernel: Node 0 active_anon:149380kB inactive_anon:149704kB active_file:1848kB inactive_file:3660kB unevictable:0kB isolated(anon):128kB isolated(file):0kB mapped:4580kB dirty:0kB writeback:380kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 36kB writeback_tmp:0kB unstable:0kB pages_scanned:352 all_unreclaimable? 
no kernel: Node 0 DMA free:1484kB min:104kB low:128kB high:152kB active_anon:5660kB inactive_anon:6156kB active_file:56kB inactive_file:64kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB slab_reclaimable:444kB slab_unreclaimable:1208kB kernel_stack:32kB pagetables:592kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB kernel: lowmem_reserve[]: 0 327 327 327 327 kernel: Node 0 DMA32 free:5012kB min:2264kB low:2828kB high:3392kB active_anon:143580kB inactive_anon:143300kB active_file:2576kB inactive_file:2560kB unevictable:0kB writepending:0kB present:376688kB managed:353968kB mlocked:0kB slab_reclaimable:13708kB slab_unreclaimable:18064kB kernel_stack:2352kB pagetables:12888kB bounce:0kB free_pcp:412kB local_pcp:88kB free_cma:0kB kernel: lowmem_reserve[]: 0 0 0 0 0 kernel: Node 0 DMA: 70*4kB (UMEH) 20*8kB (UMEH) 13*16kB (MH) 5*32kB (H) 4*64kB (H) 2*128kB (H) 1*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1576kB kernel: Node 0 DMA32: 1134*4kB (UMEH) 25*8kB (UMEH) 13*16kB (MH) 7*32kB (H) 3*64kB (H) 0*128kB 1*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 5616kB kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB kernel: 6561 total pagecache pages kernel: 5240 pages in swap cache kernel: Swap cache stats: add 100078658, delete 100073419, find 199458343/238460223 kernel: Free swap = 1137532kB kernel: Total swap = 2064380kB kernel: 98170 pages RAM kernel: 0 pages HighMem/MovableOnly kernel: 5701 pages reserved kernel: 0 pages cma reserved kernel: 0 pages hwpoisoned kernel: Out of memory: Kill process 11968 (clamscan) score 170 or sacrifice child kernel: Killed process 11968 (clamscan) total-vm:538120kB, anon-rss:182220kB, file-rss:464kB, shmem-rss:0kB On 4.9er kernels: Free swap = 1826688kB cat /etc/sysctl.d/* | grep ^vm vm.dirty_background_ratio=3 vm.dirty_ratio=15 vm.overcommit_memory=2 vm.overcommit_ratio=80 vm.swappiness=10 kernel: dnf invoked oom-killer: 
gfp_mask=0x24280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=0, order=0, oom_score_adj=0 kernel: dnf cpuset=/ mems_allowed=0 kernel: CPU: 0 PID: 20049 Comm: dnf Not tainted 4.9.9-200.fc25.x86_64 #1 kernel:
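As an aside on the sysctl settings shown above: with vm.overcommit_memory=2 (strict overcommit), the kernel refuses to commit address space beyond roughly SwapTotal plus overcommit_ratio percent of RAM. A minimal sketch of that arithmetic, plugging in figures from the 4.10 report above (the numbers are approximate; the real kernel calculation also excludes hugetlb pages and honours vm.admin_reserve_kbytes):

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio):
    """Strict-overcommit limit, ignoring hugetlb and admin reserves."""
    return swap_kb + ram_kb * overcommit_ratio // 100

ram_kb = 98170 * 4   # "98170 pages RAM" at 4 kB/page -> ~392680 kB
swap_kb = 2064380    # "Total swap = 2064380kB"
limit = commit_limit_kb(ram_kb, swap_kb, overcommit_ratio=80)
print(limit)         # -> 2378524 (kB of committable address space)
```

Processes requesting virtual memory past that limit get allocation failures rather than triggering reclaim, which is why the overcommit settings matter when reading these reports.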
Re: Still OOM problems with 4.9er kernels
On 23.12.2016 03:55, Minchan Kim wrote: On Fri, Dec 09, 2016 at 04:52:07PM +0100, Gerhard Wiesinger wrote: On 09.12.2016 14:40, Michal Hocko wrote: On Fri 09-12-16 08:06:25, Gerhard Wiesinger wrote: Hello, same with latest kernel rc, dnf still killed with OOM (but sometimes better). ./update.sh: line 40: 1591 Killed ${EXE} update ${PARAMS} (does dnf clean all;dnf update) Linux database.intern 4.9.0-0.rc8.git2.1.fc26.x86_64 #1 SMP Wed Dec 7 17:53:29 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux Updated bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1314697 Could you post your oom report please? E.g. a new one with more than one included, first one after boot ... Just set up a low-mem VM under KVM and it is easily triggerable. Still enough virtual memory available ... 4.9.0-0.rc8.git2.1.fc26.x86_64 [ 624.862777] ksoftirqd/0: page allocation failure: order:0, mode:0x2080020(GFP_ATOMIC) [ 624.863319] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.9.0-0.rc8.git2.1.fc26.x86_64 #1 [ 624.863410] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3 [ 624.863510] aa62c007f958 904774e3 90c7dd98 [ 624.863923] aa62c007f9e0 9020e6ea 020800200246 90c7dd98 [ 624.864019] aa62c007f980 96b90010 aa62c007f9f0 aa62c007f9a0 [ 624.864998] Call Trace: [ 624.865149] [] dump_stack+0x86/0xc3 [ 624.865347] [] warn_alloc+0x13a/0x170 [ 624.865432] [] __alloc_pages_slowpath+0x252/0xbb0 [ 624.865563] [] __alloc_pages_nodemask+0x40d/0x4b0 [ 624.865675] [] __alloc_page_frag+0x193/0x200 [ 624.866024] [] __napi_alloc_skb+0x8e/0xf0 [ 624.866113] [] page_to_skb.isra.28+0x5d/0x310 [virtio_net] [ 624.866201] [] virtnet_receive+0x2db/0x9a0 [virtio_net] [ 624.867378] [] virtnet_poll+0x1d/0x80 [virtio_net] [ 624.867494] [] net_rx_action+0x23e/0x470 [ 624.867612] [] __do_softirq+0xcd/0x4b9 [ 624.867704] [] ? smpboot_thread_fn+0x34/0x1f0 [ 624.867833] [] ? smpboot_thread_fn+0x12d/0x1f0 [ 624.867924] [] run_ksoftirqd+0x25/0x80 [ 624.868109] [] smpboot_thread_fn+0x128/0x1f0 [ 624.868197] [] ?
sort_range+0x30/0x30 [ 624.868596] [] kthread+0x102/0x120 [ 624.868679] [] ? wait_for_completion+0x110/0x140 [ 624.868768] [] ? kthread_park+0x60/0x60 [ 624.868850] [] ret_from_fork+0x2a/0x40 [ 843.528656] httpd (2490) used greatest stack depth: 10304 bytes left [ 878.077750] httpd (2976) used greatest stack depth: 10096 bytes left [93918.861109] netstat (14579) used greatest stack depth: 9488 bytes left [94050.874669] kworker/dying (6253) used greatest stack depth: 9008 bytes left [95895.765570] kworker/1:1H: page allocation failure: order:0, mode:0x2280020(GFP_ATOMIC|__GFP_NOTRACK) [95895.765819] CPU: 1 PID: 440 Comm: kworker/1:1H Not tainted 4.9.0-0.rc8.git2.1.fc26.x86_64 #1 [95895.765911] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3 [95895.766060] Workqueue: kblockd blk_mq_run_work_fn [95895.766143] aa62c0257628 904774e3 90c7dd98 [95895.766235] aa62c02576b0 9020e6ea 022800200046 90c7dd98 [95895.766325] aa62c0257650 96b90010 aa62c02576c0 aa62c0257670 [95895.766417] Call Trace: [95895.766502] [] dump_stack+0x86/0xc3 [95895.766596] [] warn_alloc+0x13a/0x170 [95895.766681] [] __alloc_pages_slowpath+0x252/0xbb0 [95895.766767] [] __alloc_pages_nodemask+0x40d/0x4b0 [95895.766866] [] alloc_pages_current+0xa1/0x1f0 [95895.766971] [] ? _raw_spin_unlock+0x27/0x40 [95895.767073] [] new_slab+0x316/0x7c0 [95895.767160] [] ___slab_alloc+0x3fb/0x5c0 [95895.772611] [] ? cpuacct_charge+0xf2/0x1f0 [95895.773406] [] ? alloc_indirect.isra.11+0x1d/0x50 [virtio_ring] [95895.774327] [] ? rcu_read_lock_sched_held+0x45/0x80 [95895.775212] [] ? alloc_indirect.isra.11+0x1d/0x50 [virtio_ring] [95895.776155] [] __slab_alloc+0x51/0x90 [95895.777090] [] __kmalloc+0x251/0x320 [95895.781502] [] ? alloc_indirect.isra.11+0x1d/0x50 [virtio_ring] [95895.782309] [] alloc_indirect.isra.11+0x1d/0x50 [virtio_ring] [95895.783334] [] virtqueue_add_sgs+0x1c3/0x4a0 [virtio_ring] [95895.784059] [] ? 
kvm_sched_clock_read+0x25/0x40 [95895.784742] [] __virtblk_add_req+0xbc/0x220 [virtio_blk] [95895.785419] [] ? debug_lockdep_rcu_enabled+0x1d/0x20 [95895.786086] [] ? virtio_queue_rq+0x105/0x290 [virtio_blk] [95895.786750] [] virtio_queue_rq+0x12d/0x290 [virtio_blk] [95895.787427] [] __blk_mq_run_hw_queue+0x26d/0x3b0 [95895.788106] [] blk_mq_run_work_fn+0x12/0x20 [95895.789065] [] process_one_work+0x23e/0x6f0 [95895.789741] [] ? process_one_work+0x1ba/0x6f0 [95895.790444] [] worker_thread+0x4e/0x490 [95895.791178] [] ? process_one_work+0x6f0/0x6f0 [95895.791911] [] ? process_one_work+0x6f0/0x6f0 [95895.792653] [] ? do_syscall_64+0x6c/0x1f0 [95895.793397] [] kthread+0x102/0x120 [95895.794212] [] ? trace_hardirqs_on_caller+0xf5/0x1b0 [95895.794942] [] ? kthread_park+0x60/0x60 [95895.795689
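The gfp_mask values in these allocation-failure reports can be decoded bit by bit. The bit assignments are kernel-version specific; the partial table below matches the 4.9-era include/linux/gfp.h (in that layout GFP_ATOMIC is __GFP_HIGH | __GFP_ATOMIC | __GFP_KSWAPD_RECLAIM) and will not decode masks from other kernel versions:

```python
# Partial gfp flag table as laid out in 4.9-era include/linux/gfp.h.
GFP_BITS_4_9 = {
    0x20:      "__GFP_HIGH",
    0x80000:   "__GFP_ATOMIC",
    0x200000:  "__GFP_NOTRACK",
    0x400000:  "__GFP_DIRECT_RECLAIM",
    0x2000000: "__GFP_KSWAPD_RECLAIM",
}

def decode_gfp(mask):
    """Return the known flag names set in mask, lowest bit first."""
    return [name for bit, name in sorted(GFP_BITS_4_9.items()) if mask & bit]

# 0x2080020 is what the ksoftirqd report above prints as GFP_ATOMIC:
print(decode_gfp(0x2080020))  # -> ['__GFP_HIGH', '__GFP_ATOMIC', '__GFP_KSWAPD_RECLAIM']
# 0x2280020 (the kworker report) additionally carries __GFP_NOTRACK:
print(decode_gfp(0x2280020))
```

Note the absence of __GFP_DIRECT_RECLAIM in both masks: GFP_ATOMIC callers may only wake kswapd, which is why these softirq-context allocations fail instead of reclaiming.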
Re: Still OOM problems with 4.9er kernels
On 09.12.2016 22:42, Vlastimil Babka wrote: On 12/09/2016 07:01 PM, Gerhard Wiesinger wrote: On 09.12.2016 18:30, Michal Hocko wrote: On Fri 09-12-16 17:58:14, Gerhard Wiesinger wrote: On 09.12.2016 17:09, Michal Hocko wrote: [...] [97883.882611] Mem-Info: [97883.883747] active_anon:2915 inactive_anon:3376 isolated_anon:0 active_file:3902 inactive_file:3639 isolated_file:0 unevictable:0 dirty:205 writeback:0 unstable:0 slab_reclaimable:9856 slab_unreclaimable:9682 mapped:3722 shmem:59 pagetables:2080 bounce:0 free:748 free_pcp:15 free_cma:0

There is still some page cache which seems to be neither dirty nor under writeback. So it should theoretically be reclaimable, but for some reason we cannot seem to reclaim that memory. There is still some anonymous memory and free swap so we could reclaim it as well, but it is all pretty low and the memory pressure is really large. Yes, it might be large in the update situation, but that should be handled by the kernel's virtual memory system, right? Well, this is what we try, and we call it memory reclaim. But if we are not able to reclaim anything then we eventually have to give up and trigger the OOM killer. I'm not familiar with the Linux implementation of the VM system in detail. But can't you reserve enough (non-pageable) memory for the kernel that you can always swap everything out, without killing a process, at least as long as there is enough swap available (which should be the case in all of my reports)? We don't have such bulletproof reserves. In this case the amount of anonymous memory that can be swapped out is relatively low, and either something is pinning it in memory, or it's being swapped back in quickly. Now the information that 4.4 made a difference is interesting. I do not really see any major differences in the reclaim between 4.3 and 4.4 kernels. The reason might be somewhere else as well, e.g. some subsystem consumes much more memory than before. Just curious, what kind of filesystem are you using?
I'm using ext4 only, with virt-* drivers (storage, network). But it is definitely a virtual memory allocation/swap usage issue. Could you try some additional debugging? Enabling reclaim-related tracepoints might tell us more. The following should tell us more:

mount -t tracefs none /trace
echo 1 > /trace/events/vmscan/enable
echo 1 > /trace/events/writeback/writeback_congestion_wait/enable
cat /trace/trace_pipe > trace.log

Collecting /proc/vmstat over time might be helpful as well:

mkdir logs
while true
do
cp /proc/vmstat vmstat.$(date +%s)
sleep 1s
done

Activated it. But I think it should be very easy to trigger on your side as well. A very small VM with a program doing RAM allocations/writes (I guess you have some testing programs already) should be sufficient to trigger it. You can also use the attached program which I used to trigger such situations some years ago. If it doesn't help, try to reduce the available CPU for the VM and also I/O (e.g. use all CPU/IO on the host or other VMs). Well, it's not really a surprise that if the VM is small enough and the workload large enough, the OOM killer will kick in. The exact threshold might have changed between kernel versions for a number of possible reasons. IMHO: the OOM killer should NOT kick in even on the highest workloads if there is swap available. https://www.spinics.net/lists/linux-mm/msg113665.html Yeah, but I do think that "oom when you have 156MB free and 7GB reclaimable, and haven't even tried swapping" counts as obviously wrong. So Linus also thinks that trying swapping is a must-have. And there always was enough swap available in my cases. Then it should swap out/swap in all the time (which worked well in kernel 2.4/2.6 times). Another topic: why does the kernel prefer to swap in/swap out instead of using cache pages/buffers (see vmstat 1 output below)?
BTW: Don't know if you have also seen my original message on the kernel mailing list only: Linus also had OOM problems with 1kB RAM requests and a lot of free RAM (use a translation service for the German page): https://lkml.org/lkml/2016/11/30/64 https://marius.bloggt-in-braunschweig.de/2016/11/17/linuxkernel-4-74-8-und-der-oom-killer/ https://www.spinics.net/lists/linux-mm/msg113661.html Yeah, we were involved in the last one. The regressions were about high-order allocations though (the 1kB premise turned out to be a misinterpretation) and there were regressions for those in 4.7/4.8. But yours are order-0. With kernel 4.7/4.8 it was really reproducible at every dnf update. With 4.9rc8 it has been much, much better. So something must have changed, too. As far as I understand it, an order-n request means 2^n contiguous pages (i.e. 2^n * 4 kB with the usual page size). I don't think the order of the allocation request makes a difference when swap is not used. BTW: What was the commit that introduced
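To pin down the order arithmetic discussed above: an order-n allocation asks for 2^n physically contiguous pages, not 2^n kB, so with the usual 4 KiB x86 page size order 0 is a single 4 KiB page and order 3 is 32 KiB. A one-line sketch (PAGE_SIZE is an assumption; some architectures use larger pages):

```python
PAGE_SIZE = 4096  # bytes; assumed 4 KiB, as on x86

def alloc_bytes(order):
    """Size in bytes of an order-n buddy allocation: 2**n contiguous pages."""
    return (1 << order) * PAGE_SIZE

print([alloc_bytes(n) for n in range(4)])  # -> [4096, 8192, 16384, 32768]
```

This is why the distinction matters in the thread: the 4.7/4.8 regressions hit high-order (multi-page contiguous) requests, while the reports here are all order-0, i.e. single pages.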
Re: Still OOM problems with 4.9er kernels
On 09.12.2016 22:42, Vlastimil Babka wrote: On 12/09/2016 07:01 PM, Gerhard Wiesinger wrote: On 09.12.2016 18:30, Michal Hocko wrote: On Fri 09-12-16 17:58:14, Gerhard Wiesinger wrote: On 09.12.2016 17:09, Michal Hocko wrote: [...] [97883.882611] Mem-Info: [97883.883747] active_anon:2915 inactive_anon:3376 isolated_anon:0 active_file:3902 inactive_file:3639 isolated_file:0 unevictable:0 dirty:205 writeback:0 unstable:0 slab_reclaimable:9856 slab_unreclaimable:9682 mapped:3722 shmem:59 pagetables:2080 bounce:0 free:748 free_pcp:15 free_cma:0 there is still some page cache which doesn't seem to be neither dirty nor under writeback. So it should be theoretically reclaimable but for some reason we cannot seem to reclaim that memory. There is still some anonymous memory and free swap so we could reclaim it as well but it all seems pretty down and the memory pressure is really large Yes, it might be large on the update situation, but that should be handled by a virtual memory system by the kernel, right? Well this is what we try and call it memory reclaim. But if we are not able to reclaim anything then we eventually have to give up and trigger the OOM killer. I'm not familiar with the Linux implementation of the VM system in detail. But can't you reserve as much memory for the kernel (non pageable) at least that you can swap everything out (even without killing a process at least as long there is enough swap available, which should be in all of my cases)? We don't have such bulletproof reserves. In this case the amount of anonymous memory that can be swapped out is relatively low, and either something is pinning it in memory, or it's being swapped back in quickly. Now the information that 4.4 made a difference is interesting. I do not really see any major differences in the reclaim between 4.3 and 4.4 kernels. The reason might be somewhere else as well. E.g. some of the subsystem consumes much more memory than before. Just curious, what kind of filesystem are you using? 
I'm using ext4 only with virt-* drivers (storage, network). But it is definitely a virtual memory allocation/swap usage issue. Could you try some additional debugging? Enabling reclaim-related tracepoints might tell us more. The following should tell us more:

mount -t tracefs none /trace
echo 1 > /trace/events/vmscan/enable
echo 1 > /trace/events/writeback/writeback_congestion_wait/enable
cat /trace/trace_pipe > trace.log

Collecting /proc/vmstat over time might be helpful as well:

mkdir logs
while true
do
    cp /proc/vmstat vmstat.$(date +%s)
    sleep 1s
done

Activated it. But I think it should be very easy to trigger on your side as well. A very small configured VM with a program doing RAM allocations/writes (I guess you have some testing programs already) should be sufficient to trigger it. You can also use the attached program, which I used to trigger such situations some years ago. If that doesn't help, try to reduce the available CPU for the VM and also I/O (e.g. use all CPU/IO on the host or other VMs). Well, it's not really a surprise that if the VM is small enough and the workload large enough, the OOM killer will kick in. The exact threshold might have changed between kernel versions for a number of possible reasons. IMHO: The OOM killer should NOT kick in even on the highest workloads if there is swap available. https://www.spinics.net/lists/linux-mm/msg113665.html Yeah, but I do think that "oom when you have 156MB free and 7GB reclaimable, and haven't even tried swapping" counts as obviously wrong. So Linus also thinks that trying swapping is a must-have. And there was always enough swap available in my cases. Then it should swap out/swap in all the time (which worked well in kernel 2.4/2.6 times). Another topic: Why does the kernel prefer to swap in/out instead of using cache pages/buffers (see vmstat 1 output below)?
BTW: Don't know if you have seen my original message, which went to the kernel mailing list only: Linus also had OOM problems with 1kB RAM requests and a lot of free RAM (use a translation service for the German page): https://lkml.org/lkml/2016/11/30/64 https://marius.bloggt-in-braunschweig.de/2016/11/17/linuxkernel-4-74-8-und-der-oom-killer/ https://www.spinics.net/lists/linux-mm/msg113661.html Yeah, we were involved in the last one. The regressions were about high-order allocations, though (the 1kB premise turned out to be a misinterpretation), and there were regressions for those in 4.7/4.8. But yours are order-0. With kernel 4.7/4.8 it was really reproducible at every dnf update. With 4.9rc8 it has been much, much better. So something must have changed, too. As far as I understood it, the order is 2^order kB pagesize. I don't think it makes a difference which order the memory allocation request is when swap is not used. BTW: What was the commit that introduced
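A note on the "2^order" terminology discussed above: an order-n request asks for 2^n physically contiguous pages, not 2^n kB. A minimal sketch, assuming the common 4 kB page size (verify with `getconf PAGESIZE`, which prints bytes):

```shell
# An order-n allocation covers 2^n contiguous pages.
# page_kb=4 assumes a 4 kB page size (typical on x86-64).
page_kb=4
for order in 0 1 2 3 9; do
  echo "order $order = $((page_kb << order)) kB"
done
```

So an order-0 failure, as in the reports in this thread, is a single-page request; the 4.7/4.8 regressions concerned higher orders.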
Re: Still OOM problems with 4.9er kernels
On 09.12.2016 18:30, Michal Hocko wrote: On Fri 09-12-16 17:58:14, Gerhard Wiesinger wrote: On 09.12.2016 17:09, Michal Hocko wrote: [...] [97883.882611] Mem-Info: [97883.883747] active_anon:2915 inactive_anon:3376 isolated_anon:0 active_file:3902 inactive_file:3639 isolated_file:0 unevictable:0 dirty:205 writeback:0 unstable:0 slab_reclaimable:9856 slab_unreclaimable:9682 mapped:3722 shmem:59 pagetables:2080 bounce:0 free:748 free_pcp:15 free_cma:0 there is still some page cache which doesn't seem to be neither dirty nor under writeback. So it should be theoretically reclaimable but for some reason we cannot seem to reclaim that memory. There is still some anonymous memory and free swap so we could reclaim it as well but it all seems pretty down and the memory pressure is really large Yes, it might be large on the update situation, but that should be handled by a virtual memory system by the kernel, right? Well this is what we try and call it memory reclaim. But if we are not able to reclaim anything then we eventually have to give up and trigger the OOM killer. I'm not familiar with the Linux implementation of the VM system in detail. But can't you reserve as much memory for the kernel (non pageable) at least that you can swap everything out (even without killing a process at least as long there is enough swap available, which should be in all of my cases)? Now the information that 4.4 made a difference is interesting. I do not really see any major differences in the reclaim between 4.3 and 4.4 kernels. The reason might be somewhere else as well. E.g. some of the subsystem consumes much more memory than before. Just curious, what kind of filesystem are you using? I'm using ext4 only with virt-* drivers (storage, network). But it is definitely a virtual memory allocation/swap usage issue. Could you try some additional debugging? Enabling reclaim related tracepoints might tell us more.
The following should tell us more:

mount -t tracefs none /trace
echo 1 > /trace/events/vmscan/enable
echo 1 > /trace/events/writeback/writeback_congestion_wait/enable
cat /trace/trace_pipe > trace.log

Collecting /proc/vmstat over time might be helpful as well:

mkdir logs
while true
do
    cp /proc/vmstat vmstat.$(date +%s)
    sleep 1s
done

Activated it. But I think it should be very easy to trigger on your side as well. A very small configured VM with a program doing RAM allocations/writes (I guess you have some testing programs already) should be sufficient to trigger it. You can also use the attached program, which I used to trigger such situations some years ago. If that doesn't help, try to reduce the available CPU for the VM and also I/O (e.g. use all CPU/IO on the host or other VMs). BTW: Don't know if you have seen my original message, which went to the kernel mailing list only: Linus also had OOM problems with 1kB RAM requests and a lot of free RAM (use a translation service for the German page): https://lkml.org/lkml/2016/11/30/64 https://marius.bloggt-in-braunschweig.de/2016/11/17/linuxkernel-4-74-8-und-der-oom-killer/ https://www.spinics.net/lists/linux-mm/msg113661.html Thnx.
Ciao, Gerhard

// mallocsleep.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

typedef unsigned int BOOL;
typedef char* PCHAR;
typedef unsigned int DWORD;
typedef unsigned long DDWORD;

#define FALSE 0
#define TRUE 1

BOOL getlong(PCHAR s, DDWORD* retvalue)
{
    char *eptr;
    long value;
    value = strtoll(s, &eptr, 0);
    if ((eptr == s) || (*eptr != '\0')) return FALSE;
    if (value < 0) return FALSE;
    *retvalue = value;
    return TRUE;
}

int main(int argc, char* argv[])
{
    unsigned long* p;
    unsigned long size = 16*1024*1024;
    unsigned long size_of = sizeof(*p);
    unsigned long i;
    unsigned long sleep_allocated = 3600;
    unsigned long sleep_freed = 3600;

    if (argc > 1) {
        if (!getlong(argv[1], &size)) {
            printf("Wrong memsize!\n");
            exit(1);
        }
    }
    if (argc > 2) {
        if (!getlong(argv[2], &sleep_allocated)) {
            printf("Wrong sleep_allocated time!\n");
            exit(1);
        }
    }
    if (argc > 3) {
        if (!getlong(argv[3], &sleep_freed)) {
            printf("Wrong sleep_freed time!\n");
            exit(1);
        }
    }

    printf("size=%lu, size_of=%lu\n", size, size_of);
    fflush(stdout);

    p = malloc(size);
    if (!p) {
        printf("Could not allocate memory!\n");
        exit(2);
    }
    printf("malloc done, writing to memory, p=%p ...\n", (void*)p);
    fflush(stdout);

    for (i = 0; i < (size/size_of); i++) p[i] = i;

    printf("writing to memory done, sleeping for %lu seconds ...\n", sleep_allocated);
    fflush(stdout);
    sleep(sleep_allocated);

    printf("sleeping done, freeing ...\n");
    fflush(stdout);
    free(p);

    printf("freeing done, sleeping for %lu seconds ...\n", sleep_freed);
    fflush(stdout);
    sleep(sleep_freed);

    printf("sleeping done, exiting ...\n");
    fflush(stdout);
    exit(0);
}
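The /proc/vmstat sampling loop suggested earlier in the thread runs forever and drops its snapshots into the current directory rather than the logs directory it creates. A bounded variant, sketched here (the sample count and the logs/ destination are my additions to make it terminate and keep the output tidy):

```shell
# Take N snapshots of /proc/vmstat, one per second, into logs/
# (bounded so it terminates; the original loop ran until interrupted).
collect_vmstat() {
  mkdir -p logs
  for i in $(seq "${1:-60}"); do
    cp /proc/vmstat "logs/vmstat.$(date +%s).$i"
    sleep 1
  done
}
collect_vmstat 2   # e.g. two samples
```

Diffing successive snapshots then shows reclaim and swap activity leading up to an OOM event.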
Re: Still OOM problems with 4.9er kernels
On 09.12.2016 17:09, Michal Hocko wrote: On Fri 09-12-16 16:52:07, Gerhard Wiesinger wrote: On 09.12.2016 14:40, Michal Hocko wrote: On Fri 09-12-16 08:06:25, Gerhard Wiesinger wrote: Hello, same with latest kernel rc, dnf still killed with OOM (but sometimes better). ./update.sh: line 40: 1591 Killed ${EXE} update ${PARAMS} (does dnf clean all;dnf update) Linux database.intern 4.9.0-0.rc8.git2.1.fc26.x86_64 #1 SMP Wed Dec 7 17:53:29 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux Updated bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1314697 Could you post your oom report please? E.g. a new one with more than one included, first one after boot ... Just set up a low-mem VM under KVM and it is easily triggerable. What is the workload? Just run dnf clean all;dnf update (and the other tasks running on those machines. The normal load on most of these machines is VERY LOW, e.g. just an apache httpd doing nothing, or a samba domain controller doing nothing.) So my setups are low-mem VMs, so that the KVM host has most of the caching effects shared. I have been running this setup since Fedora 17 under kernel-3.3.4-5.fc17.x86_64 and had NO problems. Problems started with 4.4.3-300.fc23.x86_64 and got worse with each major kernel version (for upgrades I even had to give the VMs temporarily more memory for the upgrade situation). (From my bug report at https://bugzilla.redhat.com/show_bug.cgi?id=1314697: Previous kernel version on guest/host was rock stable. Reverting to kernel-4.3.5-300.fc23.x86_64 also solved it.) For completeness, the actual kernel parameters on all hosts and VMs:

vm.dirty_background_ratio=3
vm.dirty_ratio=15
vm.overcommit_memory=2
vm.overcommit_ratio=80
vm.swappiness=10

With kernel 4.9.0rc7 or rc8 it was getting better, but still not where it should be (and already was). Still enough virtual memory available ... Well, you will always have a lot of virtual memory... And why is it not used, e.g. swapped out, before it gets into an OOM situation?
4.9.0-0.rc8.git2.1.fc26.x86_64 [ 624.862777] ksoftirqd/0: page allocation failure: order:0, mode:0x2080020(GFP_ATOMIC) [...] [95895.765570] kworker/1:1H: page allocation failure: order:0, mode:0x2280020(GFP_ATOMIC|__GFP_NOTRACK) These are atomic allocation failures and should be recoverable. [...] [97883.838418] httpd invoked oom-killer: gfp_mask=0x24201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), nodemask=0, order=0, oom_score_adj=0 But this is a real OOM killer invocation because a single page allocation cannot proceed. [...] [97883.882611] Mem-Info: [97883.883747] active_anon:2915 inactive_anon:3376 isolated_anon:0 active_file:3902 inactive_file:3639 isolated_file:0 unevictable:0 dirty:205 writeback:0 unstable:0 slab_reclaimable:9856 slab_unreclaimable:9682 mapped:3722 shmem:59 pagetables:2080 bounce:0 free:748 free_pcp:15 free_cma:0 there is still some page cache which doesn't seem to be neither dirty nor under writeback. So it should be theoretically reclaimable but for some reason we cannot seem to reclaim that memory. There is still some anonymous memory and free swap so we could reclaim it as well but it all seems pretty down and the memory pressure is really large Yes, it might be large on the update situation, but that should be handled by a virtual memory system by the kernel, right? [97883.890766] Node 0 active_anon:11660kB inactive_anon:13504kB active_file:15608kB inactive_file:14556kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:14888kB dirty:820kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 236kB writeback_tmp:0kB unstable:0kB pages_scanned:168352 all_unreclaimable? yes all_unreclaimable also agrees that basically nothing is reclaimable. That was one of the criterion to hit the OOM killer prior to the rewrite in 4.6 kernel. So I suspect that older kernels would OOM under your memory pressure as well. See comments above. Thnx. Ciao, Gerhard
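For anyone reproducing this setup: the sysctl values quoted above can be persisted in a drop-in file (the path below is the usual convention, not taken from the thread; the values are copied from the report). Worth noting in the OOM context: vm.overcommit_memory=2 with vm.overcommit_ratio=80 enforces a hard commit limit of swap plus 80% of RAM, which on a small VM makes allocation failures considerably more likely than the default heuristic overcommit.

```
# /etc/sysctl.d/99-vm-tuning.conf -- values as reported in the thread
vm.dirty_background_ratio = 3
vm.dirty_ratio = 15
vm.overcommit_memory = 2
vm.overcommit_ratio = 80
vm.swappiness = 10
```

Apply with `sysctl --system`, or `sysctl -p` on the file.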
Re: Still OOM problems with 4.9er kernels
On 09.12.2016 14:40, Michal Hocko wrote: On Fri 09-12-16 08:06:25, Gerhard Wiesinger wrote: Hello, same with latest kernel rc, dnf still killed with OOM (but sometimes better). ./update.sh: line 40: 1591 Killed ${EXE} update ${PARAMS} (does dnf clean all;dnf update) Linux database.intern 4.9.0-0.rc8.git2.1.fc26.x86_64 #1 SMP Wed Dec 7 17:53:29 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux Updated bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1314697 Could you post your oom report please? And another one which ended in a native_safe_halt [73366.837826] nmbd: page allocation failure: order:0, mode:0x2280030(GFP_ATOMIC|__GFP_RECLAIMABLE|__GFP_NOTRACK) [73366.837985] CPU: 1 PID: 2005 Comm: nmbd Not tainted 4.9.0-0.rc8.git2.1.fc26.x86_64 #1 [73366.838075] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3 [73366.838175] aa4ac059f548 8d4774e3 8dc7dd98 [73366.838272] aa4ac059f5d0 8d20e6ea 022800300046 8dc7dd98 [73366.838364] aa4ac059f570 9c370010 aa4ac059f5e0 aa4ac059f590 [73366.838458] Call Trace: [73366.838590] [] dump_stack+0x86/0xc3 [73366.838680] [] warn_alloc+0x13a/0x170 [73366.838762] [] __alloc_pages_slowpath+0x252/0xbb0 [73366.838846] [] ? finish_task_switch+0xb0/0x260 [73366.838926] [] __alloc_pages_nodemask+0x40d/0x4b0 [73366.839007] [] alloc_pages_current+0xa1/0x1f0 [73366.839088] [] ? kvm_sched_clock_read+0x25/0x40 [73366.839170] [] new_slab+0x316/0x7c0 [73366.839245] [] ___slab_alloc+0x3fb/0x5c0 [73366.839325] [] ? kvm_sched_clock_read+0x25/0x40 [73366.839409] [] ? __es_insert_extent+0xb3/0x330 [73366.839501] [] ? __es_insert_extent+0xb3/0x330 [73366.839583] [] __slab_alloc+0x51/0x90 [73366.839662] [] ? __es_insert_extent+0xb3/0x330 [73366.839743] [] kmem_cache_alloc+0x246/0x2d0 [73366.839822] [] ? __es_remove_extent+0x56/0x2d0 [73366.839906] [] __es_insert_extent+0xb3/0x330 [73366.839985] [] ext4_es_insert_extent+0xee/0x280 [73366.840067] [] ? ext4_map_blocks+0x2b4/0x5f0 [73366.840147] [] ext4_map_blocks+0x323/0x5f0 [73366.840225] [] ? 
workingset_refault+0x10a/0x220 [73366.840314] [] ext4_mpage_readpages+0x413/0xa60 [73366.840397] [] ? __page_cache_alloc+0x146/0x190 [73366.840487] [] ext4_readpages+0x35/0x40 [73366.840569] [] __do_page_cache_readahead+0x2bf/0x390 [73366.840651] [] ? __do_page_cache_readahead+0x16a/0x390 [73366.840735] [] filemap_fault+0x51b/0x790 [73366.840814] [] ? ext4_filemap_fault+0x2e/0x50 [73366.840896] [] ext4_filemap_fault+0x39/0x50 [73366.840976] [] __do_fault+0x83/0x1d0 [73366.841056] [] handle_mm_fault+0x11e2/0x17a0 [73366.841138] [] ? handle_mm_fault+0x5a/0x17a0 [73366.841220] [] __do_page_fault+0x266/0x520 [73366.841300] [] trace_do_page_fault+0x58/0x2a0 [73366.841382] [] do_async_page_fault+0x1a/0xa0 [73366.841464] [] async_page_fault+0x28/0x30 [73366.842500] Mem-Info: [73366.843149] active_anon:8677 inactive_anon:8798 isolated_anon:0 active_file:328 inactive_file:317 isolated_file:32 unevictable:0 dirty:0 writeback:2 unstable:0 slab_reclaimable:4968 slab_unreclaimable:9242 mapped:365 shmem:1 pagetables:2690 bounce:0 free:764 free_pcp:41 free_cma:0 [73366.846832] Node 0 active_anon:34708kB inactive_anon:35192kB active_file:1312kB inactive_file:1268kB unevictable:0kB isolated(anon):0kB isolated(file):128kB mapped:1460kB dirty:0kB writeback:8kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 4kB writeback_tmp:0kB unstable:0kB pages_scanned:32 all_unreclaimable? 
no [73366.848711] Node 0 DMA free:1468kB min:172kB low:212kB high:252kB active_anon:3216kB inactive_anon:3448kB active_file:40kB inactive_file:228kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB slab_reclaimable:2064kB slab_unreclaimable:2960kB kernel_stack:100kB pagetables:1536kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB [73366.850769] lowmem_reserve[]: 0 116 116 116 116 [73366.851479] Node 0 DMA32 free:1588kB min:1296kB low:1620kB high:1944kB active_anon:31464kB inactive_anon:31740kB active_file:1236kB inactive_file:1056kB unevictable:0kB writepending:0kB present:180080kB managed:139012kB mlocked:0kB slab_reclaimable:17808kB slab_unreclaimable:34008kB kernel_stack:1676kB pagetables:9224kB bounce:0kB free_pcp:164kB local_pcp:12kB free_cma:0kB [73366.853757] lowmem_reserve[]: 0 0 0 0 0 [73366.854544] Node 0 DMA: 13*4kB (H) 13*8kB (H) 17*16kB (H) 12*32kB (H) 8*64kB (H) 1*128kB (H) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1452kB [73366.856200] Node 0 DMA32: 70*4kB (UMH) 12*8kB (MH) 12*16kB (H) 2*32kB (H) 5*64kB (H) 5*128kB (H) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1592kB [73366.857955] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB [73366.857956] 2401 total pagecache pages [73366.858829] 1741 pages in swap cache [73366.859721] Swap cache stats: add
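The per-order free-list dump above (e.g. "Node 0 DMA: 13*4kB (H) ... = 1452kB") lists, for each order, how many free blocks of that size remain; the letters are migrate-type annotations. A small sketch that re-derives the kernel's printed total from those terms (annotations dropped from the sample for simplicity):

```shell
# Sum a buddy free-list line: each N*SkB term means N free blocks of S kB.
# Sample terms copied from the "Node 0 DMA" line in the report above.
line='13*4kB 13*8kB 17*16kB 12*32kB 8*64kB 1*128kB'
total=0
for term in $line; do
  n=${term%%\**}            # block count before the '*'
  s=${term#*\*}; s=${s%kB}  # block size in kB after the '*'
  total=$((total + n * s))
done
echo "${total}kB"   # prints 1452kB, matching the kernel's own total
```

The dump shows only small high-ordered (H) reserve blocks remain free, consistent with the order-0 allocation failures reported.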
Re: Still OOM problems with 4.9er kernels
On 09.12.2016 16:52, Gerhard Wiesinger wrote:
On 09.12.2016 14:40, Michal Hocko wrote:
On Fri 09-12-16 08:06:25, Gerhard Wiesinger wrote:

Hello,
same with the latest kernel rc, dnf still gets killed with OOM (but sometimes better).

./update.sh: line 40: 1591 Killed ${EXE} update ${PARAMS}
(does dnf clean all; dnf update)

Linux database.intern 4.9.0-0.rc8.git2.1.fc26.x86_64 #1 SMP Wed Dec 7 17:53:29 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Updated bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1314697

Could you post your oom report please? E.g. a new one with more than one included, the first one after boot ...

Just set up a low-mem VM under KVM and it is easily triggerable. Still enough virtual memory available ...

4.9.0-0.rc8.git2.1.fc26.x86_64
[  624.862777] ksoftirqd/0: page allocation failure: order:0, mode:0x2080020(GFP_ATOMIC)
[  624.863319] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.9.0-0.rc8.git2.1.fc26.x86_64 #1
[  624.863410] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3
[  624.863510] aa62c007f958 904774e3 90c7dd98
[  624.863923] aa62c007f9e0 9020e6ea 020800200246 90c7dd98
[  624.864019] aa62c007f980 96b90010 aa62c007f9f0 aa62c007f9a0
[  624.864998] Call Trace:
[  624.865149] [] dump_stack+0x86/0xc3
[  624.865347] [] warn_alloc+0x13a/0x170
[  624.865432] [] __alloc_pages_slowpath+0x252/0xbb0
[  624.865563] [] __alloc_pages_nodemask+0x40d/0x4b0
[  624.865675] [] __alloc_page_frag+0x193/0x200
[  624.866024] [] __napi_alloc_skb+0x8e/0xf0
[  624.866113] [] page_to_skb.isra.28+0x5d/0x310 [virtio_net]
[  624.866201] [] virtnet_receive+0x2db/0x9a0 [virtio_net]
[  624.867378] [] virtnet_poll+0x1d/0x80 [virtio_net]
[  624.867494] [] net_rx_action+0x23e/0x470
[  624.867612] [] __do_softirq+0xcd/0x4b9
[  624.867704] [] ? smpboot_thread_fn+0x34/0x1f0
[  624.867833] [] ? smpboot_thread_fn+0x12d/0x1f0
[  624.867924] [] run_ksoftirqd+0x25/0x80
[  624.868109] [] smpboot_thread_fn+0x128/0x1f0
[  624.868197] [] ? sort_range+0x30/0x30
[  624.868596] [] kthread+0x102/0x120
[  624.868679] [] ? wait_for_completion+0x110/0x140
[  624.868768] [] ? kthread_park+0x60/0x60
[  624.868850] [] ret_from_fork+0x2a/0x40
[  843.528656] httpd (2490) used greatest stack depth: 10304 bytes left
[  878.077750] httpd (2976) used greatest stack depth: 10096 bytes left
[93918.861109] netstat (14579) used greatest stack depth: 9488 bytes left
[94050.874669] kworker/dying (6253) used greatest stack depth: 9008 bytes left
[95895.765570] kworker/1:1H: page allocation failure: order:0, mode:0x2280020(GFP_ATOMIC|__GFP_NOTRACK)
[95895.765819] CPU: 1 PID: 440 Comm: kworker/1:1H Not tainted 4.9.0-0.rc8.git2.1.fc26.x86_64 #1
[95895.765911] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3
[95895.766060] Workqueue: kblockd blk_mq_run_work_fn
[95895.766143] aa62c0257628 904774e3 90c7dd98
[95895.766235] aa62c02576b0 9020e6ea 022800200046 90c7dd98
[95895.766325] aa62c0257650 96b90010 aa62c02576c0 aa62c0257670
[95895.766417] Call Trace:
[95895.766502] [] dump_stack+0x86/0xc3
[95895.766596] [] warn_alloc+0x13a/0x170
[95895.766681] [] __alloc_pages_slowpath+0x252/0xbb0
[95895.766767] [] __alloc_pages_nodemask+0x40d/0x4b0
[95895.766866] [] alloc_pages_current+0xa1/0x1f0
[95895.766971] [] ? _raw_spin_unlock+0x27/0x40
[95895.767073] [] new_slab+0x316/0x7c0
[95895.767160] [] ___slab_alloc+0x3fb/0x5c0
[95895.772611] [] ? cpuacct_charge+0xf2/0x1f0
[95895.773406] [] ? alloc_indirect.isra.11+0x1d/0x50 [virtio_ring]
[95895.774327] [] ? rcu_read_lock_sched_held+0x45/0x80
[95895.775212] [] ? alloc_indirect.isra.11+0x1d/0x50 [virtio_ring]
[95895.776155] [] __slab_alloc+0x51/0x90
[95895.777090] [] __kmalloc+0x251/0x320
[95895.781502] [] ? alloc_indirect.isra.11+0x1d/0x50 [virtio_ring]
[95895.782309] [] alloc_indirect.isra.11+0x1d/0x50 [virtio_ring]
[95895.783334] [] virtqueue_add_sgs+0x1c3/0x4a0 [virtio_ring]
[95895.784059] [] ? kvm_sched_clock_read+0x25/0x40
[95895.784742] [] __virtblk_add_req+0xbc/0x220 [virtio_blk]
[95895.785419] [] ? debug_lockdep_rcu_enabled+0x1d/0x20
[95895.786086] [] ? virtio_queue_rq+0x105/0x290 [virtio_blk]
[95895.786750] [] virtio_queue_rq+0x12d/0x290 [virtio_blk]
[95895.787427] [] __blk_mq_run_hw_queue+0x26d/0x3b0
[95895.788106] [] blk_mq_run_work_fn+0x12/0x20
[95895.789065] [] process_one_work+0x23e/0x6f0
[95895.789741] [] ? process_one_work+0x1ba/0x6f0
[95895.790444] [] worker_thread+0x4e/0x490
[95895.791178] [] ? process_one_work+0x6f0/0x6f0
[95895.791911] [] ? process_one_work+0x6f0/0x6f0
[95895.792653] [] ? do_syscall_64+0x6c/0x1f0
[95895.793397] [] kthread+0x102/0x120
[95895.794212] [] ? trace_hardirqs_on_caller+0xf5/0x1b0
[95895.794942] [] ? kthread_park+0x60/0x60
[95895.795689] [] ret_from_fork+0x2a/0x40
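A side note on the traces above (not from the thread itself): both failures are order-0 GFP_ATOMIC allocations, which cannot sleep or reclaim and instead draw from the kernel's reserved watermark. On a low-memory guest, a common first experiment is to inspect (and, as root, raise) `vm.min_free_kbytes`; the value 16384 below is only an illustrative guess, not a recommendation from the thread.

```shell
# Inspect the reserve that atomic (non-sleeping) allocations draw from.
cat /proc/sys/vm/min_free_kbytes

# To enlarge the atomic reserve one might try, as root (value is a guess):
#   sysctl -w vm.min_free_kbytes=16384
```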
Re: Still OOM problems with 4.9er kernels
On 09.12.2016 14:40, Michal Hocko wrote:
On Fri 09-12-16 08:06:25, Gerhard Wiesinger wrote:

Hello,
same with the latest kernel rc, dnf still gets killed with OOM (but sometimes better).

./update.sh: line 40: 1591 Killed ${EXE} update ${PARAMS}
(does dnf clean all; dnf update)

Linux database.intern 4.9.0-0.rc8.git2.1.fc26.x86_64 #1 SMP Wed Dec 7 17:53:29 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Updated bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1314697

Could you post your oom report please? E.g. a new one with more than one included, the first one after boot ...

Just set up a low-mem VM under KVM and it is easily triggerable. Still enough virtual memory available ...

4.9.0-0.rc8.git2.1.fc26.x86_64
[  624.862777] ksoftirqd/0: page allocation failure: order:0, mode:0x2080020(GFP_ATOMIC)
[  624.863319] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.9.0-0.rc8.git2.1.fc26.x86_64 #1
[  624.863410] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3
[  624.863510] aa62c007f958 904774e3 90c7dd98
[  624.863923] aa62c007f9e0 9020e6ea 020800200246 90c7dd98
[  624.864019] aa62c007f980 96b90010 aa62c007f9f0 aa62c007f9a0
[  624.864998] Call Trace:
[  624.865149] [] dump_stack+0x86/0xc3
[  624.865347] [] warn_alloc+0x13a/0x170
[  624.865432] [] __alloc_pages_slowpath+0x252/0xbb0
[  624.865563] [] __alloc_pages_nodemask+0x40d/0x4b0
[  624.865675] [] __alloc_page_frag+0x193/0x200
[  624.866024] [] __napi_alloc_skb+0x8e/0xf0
[  624.866113] [] page_to_skb.isra.28+0x5d/0x310 [virtio_net]
[  624.866201] [] virtnet_receive+0x2db/0x9a0 [virtio_net]
[  624.867378] [] virtnet_poll+0x1d/0x80 [virtio_net]
[  624.867494] [] net_rx_action+0x23e/0x470
[  624.867612] [] __do_softirq+0xcd/0x4b9
[  624.867704] [] ? smpboot_thread_fn+0x34/0x1f0
[  624.867833] [] ? smpboot_thread_fn+0x12d/0x1f0
[  624.867924] [] run_ksoftirqd+0x25/0x80
[  624.868109] [] smpboot_thread_fn+0x128/0x1f0
[  624.868197] [] ? sort_range+0x30/0x30
[  624.868596] [] kthread+0x102/0x120
[  624.868679] [] ? wait_for_completion+0x110/0x140
[  624.868768] [] ? kthread_park+0x60/0x60
[  624.868850] [] ret_from_fork+0x2a/0x40
[  843.528656] httpd (2490) used greatest stack depth: 10304 bytes left
[  878.077750] httpd (2976) used greatest stack depth: 10096 bytes left
[93918.861109] netstat (14579) used greatest stack depth: 9488 bytes left
[94050.874669] kworker/dying (6253) used greatest stack depth: 9008 bytes left
[95895.765570] kworker/1:1H: page allocation failure: order:0, mode:0x2280020(GFP_ATOMIC|__GFP_NOTRACK)
[95895.765819] CPU: 1 PID: 440 Comm: kworker/1:1H Not tainted 4.9.0-0.rc8.git2.1.fc26.x86_64 #1
[95895.765911] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3
[95895.766060] Workqueue: kblockd blk_mq_run_work_fn
[95895.766143] aa62c0257628 904774e3 90c7dd98
[95895.766235] aa62c02576b0 9020e6ea 022800200046 90c7dd98
[95895.766325] aa62c0257650 96b90010 aa62c02576c0 aa62c0257670
[95895.766417] Call Trace:
[95895.766502] [] dump_stack+0x86/0xc3
[95895.766596] [] warn_alloc+0x13a/0x170
[95895.766681] [] __alloc_pages_slowpath+0x252/0xbb0
[95895.766767] [] __alloc_pages_nodemask+0x40d/0x4b0
[95895.766866] [] alloc_pages_current+0xa1/0x1f0
[95895.766971] [] ? _raw_spin_unlock+0x27/0x40
[95895.767073] [] new_slab+0x316/0x7c0
[95895.767160] [] ___slab_alloc+0x3fb/0x5c0
[95895.772611] [] ? cpuacct_charge+0xf2/0x1f0
[95895.773406] [] ? alloc_indirect.isra.11+0x1d/0x50 [virtio_ring]
[95895.774327] [] ? rcu_read_lock_sched_held+0x45/0x80
[95895.775212] [] ? alloc_indirect.isra.11+0x1d/0x50 [virtio_ring]
[95895.776155] [] __slab_alloc+0x51/0x90
[95895.777090] [] __kmalloc+0x251/0x320
[95895.781502] [] ? alloc_indirect.isra.11+0x1d/0x50 [virtio_ring]
[95895.782309] [] alloc_indirect.isra.11+0x1d/0x50 [virtio_ring]
[95895.783334] [] virtqueue_add_sgs+0x1c3/0x4a0 [virtio_ring]
[95895.784059] [] ? kvm_sched_clock_read+0x25/0x40
[95895.784742] [] __virtblk_add_req+0xbc/0x220 [virtio_blk]
[95895.785419] [] ? debug_lockdep_rcu_enabled+0x1d/0x20
[95895.786086] [] ? virtio_queue_rq+0x105/0x290 [virtio_blk]
[95895.786750] [] virtio_queue_rq+0x12d/0x290 [virtio_blk]
[95895.787427] [] __blk_mq_run_hw_queue+0x26d/0x3b0
[95895.788106] [] blk_mq_run_work_fn+0x12/0x20
[95895.789065] [] process_one_work+0x23e/0x6f0
[95895.789741] [] ? process_one_work+0x1ba/0x6f0
[95895.790444] [] worker_thread+0x4e/0x490
[95895.791178] [] ? process_one_work+0x6f0/0x6f0
[95895.791911] [] ? process_one_work+0x6f0/0x6f0
[95895.792653] [] ? do_syscall_64+0x6c/0x1f0
[95895.793397] [] kthread+0x102/0x120
[95895.794212] [] ? trace_hardirqs_on_caller+0xf5/0x1b0
[95895.794942] [] ? kthread_park+0x60/0x60
[95895.795689] [] ret_from_fork+0x2a/0x40
[95895.796408] Mem-Info:
[95895.797110] active_anon:8800
Re: Still OOM problems with 4.9er kernels
Hello,
same with the latest kernel rc, dnf still gets killed with OOM (but sometimes better).

./update.sh: line 40: 1591 Killed ${EXE} update ${PARAMS}
(does dnf clean all; dnf update)

Linux database.intern 4.9.0-0.rc8.git2.1.fc26.x86_64 #1 SMP Wed Dec 7 17:53:29 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Updated bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1314697

Any chance to get it fixed in the 4.9.0 release?

Ciao, Gerhard

On 30.11.2016 08:20, Gerhard Wiesinger wrote:
Hello,
See also: Bug 1314697 - Kernel 4.4.3-300.fc23.x86_64 is not stable inside a KVM VM
https://bugzilla.redhat.com/show_bug.cgi?id=1314697
Ciao, Gerhard

On 30.11.2016 08:10, Gerhard Wiesinger wrote:
Hello,
I'm having out-of-memory situations with my "low memory" VMs in KVM under Fedora (kernels 4.7, 4.8 and also before). They have become more and more sensitive to OOM. I recently found the following info:
https://marius.bloggt-in-braunschweig.de/2016/11/17/linuxkernel-4-74-8-und-der-oom-killer/
https://www.spinics.net/lists/linux-mm/msg113661.html
Therefore I tried the latest Fedora kernels: 4.9.0-0.rc6.git2.1.fc26.x86_64
But the OOM situation is still very easy to reproduce:
1.) VM with 128-384 MB under Fedora 25
2.) Have some processes running without any load (e.g. Apache)
3.) Run an update with: dnf clean all; dnf update
4.) The dnf python process gets killed
Please make the VM system work again with kernel 4.9 and use swap correctly again.
Thnx.
Ciao, Gerhard
Another kernel OOPS still not fixed
Hello,
There is another major kernel OOPS which has still not been fixed after over a year:
Bug 1279188 - bind-chroot causes kernel to crash on restart (mount with bind option):
https://bugzilla.redhat.com/show_bug.cgi?id=1279188
Can you please fix it? Thnx.
Ciao, Gerhard
Still OOM problems with 4.9er kernels
Hello,
I'm having out-of-memory situations with my "low memory" VMs in KVM under Fedora (kernels 4.7, 4.8 and also before). They have become more and more sensitive to OOM. I recently found the following info:
https://marius.bloggt-in-braunschweig.de/2016/11/17/linuxkernel-4-74-8-und-der-oom-killer/
https://www.spinics.net/lists/linux-mm/msg113661.html
Therefore I tried the latest Fedora kernels: 4.9.0-0.rc6.git2.1.fc26.x86_64
But the OOM situation is still very easy to reproduce:
1.) VM with 128-384 MB under Fedora 25
2.) Have some processes running without any load (e.g. Apache)
3.) Run an update with: dnf clean all; dnf update
4.) The dnf python process gets killed
Please make the VM system work again with kernel 4.9 and use swap correctly again.
Thnx.
Ciao, Gerhard
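The `./update.sh` wrapper quoted in this thread is never shown in full. A hypothetical reconstruction follows; the variable names come from the error line ("${EXE} update ${PARAMS}"), but the defaults and structure are guesses, not the poster's actual script. `EXE` defaults to `echo` here so the sketch is side-effect-free; in real use it would be `dnf`.

```shell
#!/bin/bash
# Hypothetical sketch of update.sh (assumed, not the original script).
# With EXE=dnf this performs "dnf clean all; dnf update -y";
# the "update" invocation is the one the OOM killer terminates.
EXE="${EXE:-echo}"      # real script presumably: EXE=dnf
PARAMS="${PARAMS:--y}"  # real script presumably: non-interactive flags

"${EXE}" clean all                 # drop cached metadata first
out=$("${EXE}" update ${PARAMS})   # line ~40: the killed command
echo "$out"
```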
Re: Still OOM problems with 4.9er kernels
Hello,
See also: Bug 1314697 - Kernel 4.4.3-300.fc23.x86_64 is not stable inside a KVM VM
https://bugzilla.redhat.com/show_bug.cgi?id=1314697
Ciao, Gerhard

On 30.11.2016 08:10, Gerhard Wiesinger wrote:
Hello,
I'm having out-of-memory situations with my "low memory" VMs in KVM under Fedora (kernels 4.7, 4.8 and also before). They have become more and more sensitive to OOM. I recently found the following info:
https://marius.bloggt-in-braunschweig.de/2016/11/17/linuxkernel-4-74-8-und-der-oom-killer/
https://www.spinics.net/lists/linux-mm/msg113661.html
Therefore I tried the latest Fedora kernels: 4.9.0-0.rc6.git2.1.fc26.x86_64
But the OOM situation is still very easy to reproduce:
1.) VM with 128-384 MB under Fedora 25
2.) Have some processes running without any load (e.g. Apache)
3.) Run an update with: dnf clean all; dnf update
4.) The dnf python process gets killed
Please make the VM system work again with kernel 4.9 and use swap correctly again.
Thnx.
Ciao, Gerhard
Re: Linux 4.2.4
On 08.11.2015 18:20, Greg KH wrote:
On Sun, Nov 08, 2015 at 02:51:01PM +0100, Gerhard Wiesinger wrote:
On 25.10.2015 17:29, Greg KH wrote:
On Sun, Oct 25, 2015 at 11:48:54AM +0100, Gerhard Wiesinger wrote:
On 25.10.2015 10:46, Willy Tarreau wrote:

ipset *triggered* the problem. The whole stack dump would tell more.

OK, find the stack traces in the bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1272645
Kernel 4.1.10 also triggered a kernel dump when playing with ipset commands and IPv6; details are in the bug report.
Kernel 4.2 seems to me not well tested in the netfilter parts at all (a bug with an already known bugfix, https://lists.debian.org/debian-kernel/2015/10/msg00034.html, was triggered on 2 of 3 of my machines, and the new bug on 1 of 1 tested machines).

There's a reason why Greg maintains stable and LTS kernels :-)

Stable kernels don't crash by definition. :-) At least I triggered 2 kernel panics in 5 min, even with 4.1.10 and ipset commands ...

Does this happen also with Linus's tree? I suggest you ask the networking developers about this on net...@vger.kernel.org, there's nothing that I can do on my own about this, sorry.

Patch is now available, see:
[PATCH 0/3] ipset patches for nf
https://marc.info/?l=netfilter-devel=144690007708041=2
https://marc.info/?l=netfilter-devel=144690007808042=2
https://marc.info/?l=netfilter-devel=144690008608043=2
https://marc.info/?l=netfilter-devel=144690007708039=2
[ANNOUNCE] ipset 6.27 released
https://marc.info/?l=netfilter-devel=144690048308099=2
It also requires a new userland ipset version. Please integrate it upstream. Thanx to Jozsef Kadlecsik for fixing it.

That's great, can you let me know the git commits that end up in Linus's tree? That's what we need for the stable kernel.

Find the commits here:
https://git.kernel.org/cgit/linux/kernel/git/pablo/nf.git/
https://git.kernel.org/cgit/linux/kernel/git/pablo/nf.git/commit/?id=e75cb467df29a428612c162e6f1451c5c0717091
I don't know the merging process exactly, so feel free to merge or contact Pablo.

Ciao, Gerhard
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
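Greg's request above, the commit IDs as they land in Linus's tree, can be answered generically with `git log --grep` once the fixes are merged. The sketch below demonstrates the technique in a throwaway repository (the commit subject is made up for the demo); against a real kernel checkout one would run something like `git log --oneline --grep='ipset' v4.3..master` instead.

```shell
# Demo of locating backport candidates by commit subject with git log --grep.
tmp=$(mktemp -d) && cd "$tmp"
git init -q .

# Two synthetic commits: one matching the subsystem, one unrelated.
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "netfilter: ipset: fix element size alignment"
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "unrelated: some other subsystem change"

# Only the matching commit is printed, with its (abbreviated) commit ID.
git log --oneline --grep='ipset'
```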
Re: Linux 4.2.4
On 25.10.2015 17:29, Greg KH wrote:
On Sun, Oct 25, 2015 at 11:48:54AM +0100, Gerhard Wiesinger wrote:
On 25.10.2015 10:46, Willy Tarreau wrote:

ipset *triggered* the problem. The whole stack dump would tell more.

OK, find the stack traces in the bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1272645
Kernel 4.1.10 also triggered a kernel dump when playing with ipset commands and IPv6; details are in the bug report.
Kernel 4.2 seems to me not well tested in the netfilter parts at all (a bug with an already known bugfix, https://lists.debian.org/debian-kernel/2015/10/msg00034.html, was triggered on 2 of 3 of my machines, and the new bug on 1 of 1 tested machines).

There's a reason why Greg maintains stable and LTS kernels :-)

Stable kernels don't crash by definition. :-) At least I triggered 2 kernel panics in 5 min, even with 4.1.10 and ipset commands ...

Does this happen also with Linus's tree? I suggest you ask the networking developers about this on net...@vger.kernel.org, there's nothing that I can do on my own about this, sorry.

Patch is now available, see:
[PATCH 0/3] ipset patches for nf
https://marc.info/?l=netfilter-devel=144690007708041=2
https://marc.info/?l=netfilter-devel=144690007808042=2
https://marc.info/?l=netfilter-devel=144690008608043=2
https://marc.info/?l=netfilter-devel=144690007708039=2
[ANNOUNCE] ipset 6.27 released
https://marc.info/?l=netfilter-devel=144690048308099=2
It also requires a new userland ipset version. Please integrate it upstream. Thanx to Jozsef Kadlecsik for fixing it.

Ciao, Gerhard
Re: Linux 4.2.4
On 26.10.2015 09:58, Jozsef Kadlecsik wrote:
On Sun, 25 Oct 2015, Gerhard Wiesinger wrote:

Also any idea regarding the second issue? Or do you think it has the same root cause?

Looking at your Red Hat bugzilla report, the "nf_conntrack: table full, dropping packet" and "Alignment trap: not handling instruction" are two unrelated issues, and the second one is triggered by the unaligned counter extension access in ipset; I'm investigating. I can't think of any reason how those issues could be related to each other.

Yes, they are unrelated.
Issue 1: nf_conntrack: table full, dropping packet => fixed with 4.2.4
Issue 2: Alignment trap: not handling instruction => happens when ipset counters are enabled
Please keep in mind it happens with IPv6 commands. Currently 4.2.4 without ipset counters runs well.

Ciao, Gerhard
Re: Linux 4.2.4
On 25.10.2015 22:53, Jozsef Kadlecsik wrote:
On Sun, 25 Oct 2015, Gerhard Wiesinger wrote:

Any further ideas?

Does it crash without counters? That could narrow down where to look.

Hello Jozsef, it doesn't crash if I don't use the counters, so far. So there must be a bug in the counters. Any idea for the root cause?
Thnx.

Ciao, Gerhard
Re: Linux 4.2.4
On 25.10.2015 21:08, Gerhard Wiesinger wrote:
On 25.10.2015 20:46, Jozsef Kadlecsik wrote:
Hi,
On Sun, 25 Oct 2015, Gerhard Wiesinger wrote:
On 25.10.2015 10:46, Willy Tarreau wrote:

ipset *triggered* the problem. The whole stack dump would tell more.

OK, find the stack traces in the bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1272645
Kernel 4.1.10 also triggered a kernel dump when playing with ipset commands and IPv6; details are in the bug report.

It seems to me it is an architecture-specific alignment issue. I don't have Cortex-A7 ARM hardware and qemu doesn't seem to support it either, so I'm unable to reproduce it (ipset passes all my tests on my hardware, including more complex ones than what breaks here). My first wild guess is that the dynamic array of the element structure is not aligned properly. Could you give a try to the next patch?

diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index afe905c..1cf357d 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -1211,6 +1211,9 @@ static const struct ip_set_type_variant mtype_variant = {
 	.same_set = mtype_same_set,
 };
 
+#define IP_SET_BASE_ALIGN(dtype) \
+	ALIGN(sizeof(struct dtype), __alignof__(struct dtype))
+
 #ifdef IP_SET_EMIT_CREATE
 static int
 IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set,
@@ -1319,12 +1322,12 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set,
 #endif
 		set->variant = &IPSET_TOKEN(HTYPE, 4_variant);
 		set->dsize = ip_set_elem_len(set, tb,
-			sizeof(struct IPSET_TOKEN(HTYPE, 4_elem)));
+			IP_SET_BASE_ALIGN(IPSET_TOKEN(HTYPE, 4_elem)));
 #ifndef IP_SET_PROTO_UNDEF
 	} else {
 		set->variant = &IPSET_TOKEN(HTYPE, 6_variant);
 		set->dsize = ip_set_elem_len(set, tb,
-			sizeof(struct IPSET_TOKEN(HTYPE, 6_elem)));
+			IP_SET_BASE_ALIGN(IPSET_TOKEN(HTYPE, 6_elem)));
 	}
 #endif
 	if (tb[IPSET_ATTR_TIMEOUT]) {

If that does not solve it, then could you help to narrow down the issue?
Does the bug still appear if you remove the counter extension of the set?

Hello Jozsef,
The patch applied well, compiling ...

Hello Jozsef,
Thank you for the patch, but it still crashes, see: https://bugzilla.redhat.com/show_bug.cgi?id=1272645
Any further ideas? Thank you.

Ciao,
Gerhard
Re: Linux 4.2.4
On 25.10.2015 20:46, Jozsef Kadlecsik wrote:
Hi,
On Sun, 25 Oct 2015, Gerhard Wiesinger wrote:
On 25.10.2015 10:46, Willy Tarreau wrote:

ipset *triggered* the problem. The whole stack dump would tell more.

OK, find the stack traces in the bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1272645
Kernel 4.1.10 also triggered a kernel dump when playing with ipset commands and IPv6; details are in the bug report.

It seems to me it is an architecture-specific alignment issue. I don't have Cortex-A7 ARM hardware and qemu doesn't seem to support it either, so I'm unable to reproduce it (ipset passes all my tests on my hardware, including more complex ones than what breaks here). My first wild guess is that the dynamic array of the element structure is not aligned properly. Could you give a try to the next patch?

diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index afe905c..1cf357d 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -1211,6 +1211,9 @@ static const struct ip_set_type_variant mtype_variant = {
 	.same_set = mtype_same_set,
 };
 
+#define IP_SET_BASE_ALIGN(dtype) \
+	ALIGN(sizeof(struct dtype), __alignof__(struct dtype))
+
 #ifdef IP_SET_EMIT_CREATE
 static int
 IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set,
@@ -1319,12 +1322,12 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set,
 #endif
 		set->variant = &IPSET_TOKEN(HTYPE, 4_variant);
 		set->dsize = ip_set_elem_len(set, tb,
-			sizeof(struct IPSET_TOKEN(HTYPE, 4_elem)));
+			IP_SET_BASE_ALIGN(IPSET_TOKEN(HTYPE, 4_elem)));
 #ifndef IP_SET_PROTO_UNDEF
 	} else {
 		set->variant = &IPSET_TOKEN(HTYPE, 6_variant);
 		set->dsize = ip_set_elem_len(set, tb,
-			sizeof(struct IPSET_TOKEN(HTYPE, 6_elem)));
+			IP_SET_BASE_ALIGN(IPSET_TOKEN(HTYPE, 6_elem)));
 	}
 #endif
 	if (tb[IPSET_ATTR_TIMEOUT]) {

If that does not solve it, then could you help to narrow down the issue?
Does the bug still appear if you remove the counter extension of the set?

Hello Jozsef,
The patch applied well, compiling ...

Interesting that it didn't happen before. The device has been in production for more than two months without any issue. Also, any idea regarding the second issue? Or do you think it has the same root cause?

Greetings from Vienna, Austria :-)
BTW: You can get the Banana Pi R1 for example at: http://www.aliexpress.com/item/BPI-R1-Set-1-R1-Board-Clear-Case-5dB-Antenna-Power-Adapter-Banana-PI-R1-Smart/32362127917.html
I can really recommend it as a router. Power consumption is as low as 3 W. The price is also IMHO very good.

Ciao,
Gerhard
Re: Linux 4.2.4
On 25.10.2015 17:29, Greg KH wrote:
On Sun, Oct 25, 2015 at 11:48:54AM +0100, Gerhard Wiesinger wrote:
On 25.10.2015 10:46, Willy Tarreau wrote:

ipset *triggered* the problem. The whole stack dump would tell more.

OK, find the stack traces in the bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1272645

Kernel 4.1.10 also triggered a kernel dump when playing with ipset commands and IPv6; details are in the bug report. Kernel 4.2 seems to me not well tested in the netfilter parts at all (a bug with an already known bugfix, https://lists.debian.org/debian-kernel/2015/10/msg00034.html, was triggered on 2 of 3 of my machines, the new bug on 1 of 1 tested machine).

There's a reason why Greg maintains stable and LTS kernels :-)

Stable kernels don't crash by definition. :-) I triggered at least 2 kernel panics in 5 minutes, even with 4.1.10 and ipset commands ...

Does this happen also with Linus's tree? I suggest you ask the networking developers about this on net...@vger.kernel.org, there's nothing that I can do on my own about this, sorry.

I already CCed the netdev and netfilter-devel mailing lists. I need patches for the switch driver of the Banana Pi to get networking up, but that patch is stable. Maybe some patches from the Fedora SRPMS are also needed. But I'm pretty sure this also happens with a plain vanilla kernel.

Ciao,
Gerhard