Re: pf_state_export panic with NET_TASKQ=6 and stuff ....

2022-05-27 Thread Alexandr Nedvedicky
Hello,


On Fri, May 27, 2022 at 10:33:06AM +0200, Hrvoje Popovski wrote:
> Hi all,
> 
> I'm running a firewall in production with NET_TASKQ=6 with claudio@ "use
> timeout for rttimer" and bluhm@ "kernel lock in arp" diffs.
> After a week or so of running smoothly I got a panic.

thank you for being brave enough to run those bits in production.



> bcbnfw1# uvm_fault(0x823c6ac0, 0x10, 0, 1) -> e
> kernel: page fault trap, code=0
> Stopped at  pf_state_export+0x4e:   movq    0x10(%rax),%rcx

according to the registers below, rax is 0, so we die because of a
NULL pointer dereference. that matches the fault address 0x10
reported by uvm_fault() above.
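
to illustrate why a NULL base register shows up as a small fault address
like 0x10: the CPU adds the member offset to the base pointer, so the
address it touches is simply the offset. below is a minimal sketch; the
struct toy_key layout and the helper names are made up for illustration and
are NOT the real struct pf_state_key.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in types, not the real pf structures. */
struct toy_addr {
	uint8_t bytes[16];
};

struct toy_key {
	struct toy_addr addr[2];
	uint16_t port[2];
};

/* Offset of the second address inside the toy key. */
static size_t
toy_addr1_offset(void)
{
	return offsetof(struct toy_key, addr[1]);
}

/*
 * Address the CPU would access when reading a member at offset `off`
 * through a base pointer whose numeric value is `base`.
 */
static uintptr_t
fault_address(uintptr_t base, size_t off)
{
	return base + off;
}
```

with base 0 and the offset of addr[1] in this toy layout, fault_address()
gives 0x10. which member the real instruction was loading is not something
this sketch can tell.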

>     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> *414231  37466      0     0x14000      0x200    3  softnet
>  180795  96693      0     0x14000      0x200    2  softnet
>   39487  54182      0     0x14000      0x200    0  softnet
>  221352  95757      0     0x14000      0x200    4  softnet
>  252845  32137      0     0x14000      0x200    1  softnet
>  294301  63695      0     0x14000      0x200    5  softnet
> pf_state_export(fd80611313c8,fd8877492ac0) at pf_state_export+0x4e
> pfsync_sendout() at pfsync_sendout+0x5e4
> pfsync_update_state(fd887df852b8) at pfsync_update_state+0x15b
> pf_test(2,1,80d48000,800020b23a08) at pf_test+0xd53
> ip_input_if(800020b23a08,800020b23a14,4,0,80d48000) at ip_input_if+0xcd
> ipv4_input(80d48000,fd80774a4000) at ipv4_input+0x39
> ether_input(80d48000,fd80774a4000) at ether_input+0x3ad
> carp_input(80d64000,fd80774a4000,5e000115) at carp_input+0x196
> ether_input(80d64000,fd80774a4000) at ether_input+0x1d9
> vlan_input(80b9f000,fd80774a4000,800020b23c3c) at vlan_input+0x23d
> ether_input(80b9f000,fd80774a4000) at ether_input+0x85
> if_input_process(80493048,800020b23cd8) at if_input_process+0x6f
> ifiq_process(80491b00) at ifiq_process+0x69
> taskq_thread(80036500) at taskq_thread+0x11a
> end trace frame: 0x0, count: 1
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports.  Insufficient info makes it difficult to find and fix bugs.
> ddb{3}>
> 

according to call stack we die somewhere here:

    memset(sp, 0, sizeof(struct pfsync_state));

    /* copy from state key */
    sp->key[PF_SK_WIRE].addr[0] = st->key[PF_SK_WIRE]->addr[0];
    sp->key[PF_SK_WIRE].addr[1] = st->key[PF_SK_WIRE]->addr[1];
    sp->key[PF_SK_WIRE].port[0] = st->key[PF_SK_WIRE]->port[0];
    sp->key[PF_SK_WIRE].port[1] = st->key[PF_SK_WIRE]->port[1];
    sp->key[PF_SK_WIRE].rdomain = htons(st->key[PF_SK_WIRE]->rdomain);
    sp->key[PF_SK_WIRE].af = st->key[PF_SK_WIRE]->af;

looks like the state key bound to st might be gone (st->key[] == NULL).
I'll take a closer look later today.
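
to make the suspected failure condition concrete, here is a small sketch
of an export routine that refuses to copy when a key pointer is already
detached. all the toy_* types and names are hypothetical stand-ins, not
the pf structures, and this is not a proposed fix: a plain NULL check
without a lock or a reference on the state would still be racy if another
thread can detach the key concurrently.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only. */
enum { TOY_SK_WIRE = 0, TOY_SK_STACK = 1 };

struct toy_key {
	uint16_t port[2];
	uint16_t rdomain;
	uint8_t af;
};

struct toy_state {
	struct toy_key *key[2];	/* may become NULL once detached */
};

struct toy_export {
	struct toy_key key[2];
};

/* Returns 0 on success, -1 if either key pointer is NULL. */
static int
toy_state_export(struct toy_export *sp, const struct toy_state *st)
{
	memset(sp, 0, sizeof(*sp));

	if (st->key[TOY_SK_WIRE] == NULL || st->key[TOY_SK_STACK] == NULL)
		return -1;

	/* copy from state keys */
	sp->key[TOY_SK_WIRE] = *st->key[TOY_SK_WIRE];
	sp->key[TOY_SK_STACK] = *st->key[TOY_SK_STACK];
	return 0;
}
```

a caller would treat -1 as "state is being torn down, skip it" instead of
dereferencing the missing key.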

thanks and
regards
sashan



pf_state_export panic with NET_TASKQ=6 and stuff ....

2022-05-27 Thread Hrvoje Popovski
Hi all,

I'm running a firewall in production with NET_TASKQ=6 with claudio@ "use
timeout for rttimer" and bluhm@ "kernel lock in arp" diffs.
After a week or so of running smoothly I got a panic.

I'm aware that it's not a plain snapshot, but having two firewalls with
carp and pfsync gives me room to play around and report back ...


Panic log in attachment

dmesg:
OpenBSD 7.1-current (GENERIC.MP) #24: Sun May 22 19:35:12 CEST 2022
hrv...@bcbnfw1.lan:/sys/arch/amd64/compile/GENERIC.MP
real mem = 34224844800 (32639MB)
avail mem = 32913821696 (31389MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.0 @ 0xec9b0 (62 entries)
bios0: vendor American Megatrends Inc. version "3.1c" date 05/02/2019
bios0: Supermicro X10SRW-F
acpi0 at bios0: ACPI 5.0
acpi0: sleep states S0 S4 S5
acpi0: tables DSDT FACP APIC FPDT FIDT SPMI MCFG UEFI HPET NFIT WDDT
SSDT NITR SSDT SSDT PRAD DMAR HEST BERT ERST EINJ
acpi0: wakeup devices IP2P(S4) EHC1(S4) EHC2(S4) RP01(S4) RP02(S4)
RP03(S4) RP04(S4) RP05(S4) RP06(S4) RP07(S4) RP08(S4) BR1A(S4) BR1B(S4)
BR2A(S4) BR2B(S4) BR2C(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU E5-1650 v4 @ 3.60GHz, 3600.55 MHz, 06-4f-01
cpu0:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,RDSEED,ADX,SMAP,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU E5-1650 v4 @ 3.60GHz, 3600.01 MHz, 06-4f-01
cpu1:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,RDSEED,ADX,SMAP,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 4 (application processor)
cpu2: Intel(R) Xeon(R) CPU E5-1650 v4 @ 3.60GHz, 3600.01 MHz, 06-4f-01
cpu2:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,RDSEED,ADX,SMAP,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 6 (application processor)
cpu3: Intel(R) Xeon(R) CPU E5-1650 v4 @ 3.60GHz, 3600.00 MHz, 06-4f-01
cpu3:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,RDSEED,ADX,SMAP,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 0, core 3, package 0
cpu4 at mainbus0: apid 8 (application processor)
cpu4: Intel(R) Xeon(R) CPU E5-1650 v4 @ 3.60GHz, 3600.00 MHz, 06-4f-01
cpu4:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,RDSEED,ADX,SMAP,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu4: 256KB 64b/line 8-way L2 cache
cpu4: smt 0, core 4, package 0
cpu5 at mainbus0: apid 10 (application processor)
cpu5: Intel(R) Xeon(R) CPU E5-1650 v4 @ 3.60GHz, 3600.00 MHz, 06-4f-01
cpu5: