Re: pflow on PE router
On 2021-06-06, Patrick Dohman wrote:
> Perhaps it has something to do with Citrix being a dinosaur.
> God forbid the powers that be choose on premise unix.
> Regards
> Patrick

Your message doesn't appear to relate in any way to the message to which
you're replying.

>> On Jun 4, 2021, at 6:43 AM, Stuart Henderson wrote:
>>
>> On 2021/06/03 15:04, Chris Cappuccio wrote:
>>> Stuart Henderson [s...@spacehopper.org] wrote:
>>>> Oh watch out with sloppy. Keep an eye on your state table size.
>>>
>>> Really? Wouldn't sloppy keep the state table smaller if anything since it's
>>> tracking less specifically?
>>>
>>> Anyways I use sloppy across four boxes that run in parallel with pfsync.
>>> There could easily be 10,000 devices behind it at any given time. I keep my
>>> state table limit at 1,000,000. It's around 300,000 during this lighter
>>> traffic period today. I had to do sloppy after moving to several boxes in
>>> parallel, I didn't notice sloppy making any significant difference?
>>>
>>> Chris
>>
>> The problem I had was in conjunction with synfloods. I didn't get
>> captures for everything to figure it out (it was in 2018 and my
>> network was in flames, with the full state table bgp sessions were
>> getting dropped / not reestablishing) but I think what happened was
>> this,
>>
>> spoofed SYN to real server behind PF
>> SYN+ACK from server
>>
>> and the state entry ended up as ESTABLISHED:ESTABLISHED where it
>> remained until the tcp.established timer expired (24h default
>> or 5h with "set optimization aggressive").
>>
>> My "fix" was to move as much as possible to "pass XX flags any no state"
>> but that's clearly not going to help with what Denis would like to do.
>> (fwiw - I'm not doing flow monitoring regularly, but when I do it's
>> usually via sflow on switches instead, which solves some problems,
>> though it's only possible in some situations).
Re: pflow on PE router
Perhaps it has something to do with Citrix being a dinosaur.
God forbid the powers that be choose on premise unix.
Regards
Patrick

> On Jun 4, 2021, at 6:43 AM, Stuart Henderson wrote:
>
> On 2021/06/03 15:04, Chris Cappuccio wrote:
>> Stuart Henderson [s...@spacehopper.org] wrote:
>>>
>>> Oh watch out with sloppy. Keep an eye on your state table size.
>>
>> Really? Wouldn't sloppy keep the state table smaller if anything since it's
>> tracking less specifically?
>>
>> Anyways I use sloppy across four boxes that run in parallel with pfsync.
>> There could easily be 10,000 devices behind it at any given time. I keep my
>> state table limit at 1,000,000. It's around 300,000 during this lighter
>> traffic period today. I had to do sloppy after moving to several boxes in
>> parallel, I didn't notice sloppy making any significant difference?
>>
>> Chris
>
> The problem I had was in conjunction with synfloods. I didn't get
> captures for everything to figure it out (it was in 2018 and my
> network was in flames, with the full state table bgp sessions were
> getting dropped / not reestablishing) but I think what happened was
> this,
>
> spoofed SYN to real server behind PF
> SYN+ACK from server
>
> and the state entry ended up as ESTABLISHED:ESTABLISHED where it
> remained until the tcp.established timer expired (24h default
> or 5h with "set optimization aggressive").
>
> My "fix" was to move as much as possible to "pass XX flags any no state"
> but that's clearly not going to help with what Denis would like to do.
> (fwiw - I'm not doing flow monitoring regularly, but when I do it's
> usually via sflow on switches instead, which solves some problems,
> though it's only possible in some situations).
Re: pflow on PE router
On 2021/06/03 15:04, Chris Cappuccio wrote:
> Stuart Henderson [s...@spacehopper.org] wrote:
> >
> > Oh watch out with sloppy. Keep an eye on your state table size.
>
> Really? Wouldn't sloppy keep the state table smaller if anything since it's
> tracking less specifically?
>
> Anyways I use sloppy across four boxes that run in parallel with pfsync.
> There could easily be 10,000 devices behind it at any given time. I keep my
> state table limit at 1,000,000. It's around 300,000 during this lighter
> traffic period today. I had to do sloppy after moving to several boxes in
> parallel, I didn't notice sloppy making any significant difference?
>
> Chris

The problem I had was in conjunction with synfloods. I didn't get
captures for everything to figure it out (it was in 2018 and my
network was in flames, with the full state table bgp sessions were
getting dropped / not reestablishing) but I think what happened was
this,

spoofed SYN to real server behind PF
SYN+ACK from server

and the state entry ended up as ESTABLISHED:ESTABLISHED where it
remained until the tcp.established timer expired (24h default
or 5h with "set optimization aggressive").

My "fix" was to move as much as possible to "pass XX flags any no state"
but that's clearly not going to help with what Denis would like to do.
(fwiw - I'm not doing flow monitoring regularly, but when I do it's
usually via sflow on switches instead, which solves some problems,
though it's only possible in some situations).
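As an illustration of the stateless approach mentioned above, here is a minimal pf.conf sketch. It assumes BGP (TCP port 179) is the traffic being kept out of the state table; the port and the use of "quick" are assumptions made for the example, not rules taken from the message. Because no state is created, each direction needs its own rule, and traffic passed this way leaves no state for pflow(4) to export.

```
# Hypothetical sketch: pass BGP without creating state entries.
# "flags any" lets later packets of a connection match as well,
# since no state entry exists to pass them.
pass quick proto tcp from any to any port 179 flags any no state
pass quick proto tcp from any port 179 to any flags any no state
```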
Re: pflow on PE router
Stuart Henderson [s...@spacehopper.org] wrote:
>
> Oh watch out with sloppy. Keep an eye on your state table size.

Really? Wouldn't sloppy keep the state table smaller if anything since it's
tracking less specifically?

Anyways I use sloppy across four boxes that run in parallel with pfsync.
There could easily be 10,000 devices behind it at any given time. I keep my
state table limit at 1,000,000. It's around 300,000 during this lighter
traffic period today. I had to do sloppy after moving to several boxes in
parallel, I didn't notice sloppy making any significant difference?

Chris
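For reference, the state limit described above could be written in pf.conf roughly as follows. This is only a sketch; whether the value suits a given router is an assumption to check against its memory and traffic. The live state count and configured limits can be compared with `pfctl -s info` and `pfctl -s memory`.

```
# State table limit quoted in the message above (1,000,000 entries).
set limit states 1000000

# Optional, and not something the message says is in use: "aggressive"
# expires idle states sooner than the default timeouts.
# set optimization aggressive
```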
Re: pflow on PE router
I suspect that you’ll be out of luck until TLSv1.3 is implemented.
I’ve found the same to be true with the new 10 gb sfp switches in our
infrastructure which surprisingly still implement TLSv1.0 & broken CGI web
server.
Regards
Patrick

> On Jun 1, 2021, at 3:44 PM, Stuart Henderson wrote:
>
> On 2021-05-30, Denis Fondras wrote:
>> On Fri, May 28, 2021 at 03:30:58PM -0700, Chris Cappuccio wrote:
>>> You might try "set state-defaults pflow, sloppy", also in some scenarios you
>>> might need "set state-policy floating"
>>>
>>> If "sloppy" fixes it, there may be some bugs to hunt.
>>>
>>
>> "sloppy" seems to fix the issue. I will do more tests this week before
>> declaring victory :)
>>
>> Thank you Chris.
>
> Oh watch out with sloppy. Keep an eye on your state table size.
Re: pflow on PE router
On 2021-05-30, Denis Fondras wrote:
> On Fri, May 28, 2021 at 03:30:58PM -0700, Chris Cappuccio wrote:
>> You might try "set state-defaults pflow, sloppy", also in some scenarios you
>> might need "set state-policy floating"
>>
>> If "sloppy" fixes it, there may be some bugs to hunt.
>>
>
> "sloppy" seems to fix the issue. I will do more tests this week before
> declaring victory :)
>
> Thank you Chris.

Oh watch out with sloppy. Keep an eye on your state table size.
Re: pflow on PE router
Denis Fondras [open...@ledeuns.net] wrote:
>
> "sloppy" seems to fix the issue. I will do more tests this week before
> declaring victory :)
>

If that really works, then there could be a problem with PF sequence number
tracking. Can you develop a specific sequence of events to reproduce the
failures?
Re: pflow on PE router
> "sloppy" seems to fix the issue. I will do more tests this week before > declaring > victory :) > > Thank you Chris. > Get somme ;) Regards Patrick
Re: pflow on PE router
On Fri, May 28, 2021 at 03:30:58PM -0700, Chris Cappuccio wrote:
> You might try "set state-defaults pflow, sloppy", also in some scenarios you
> might need "set state-policy floating"
>
> If "sloppy" fixes it, there may be some bugs to hunt.
>

"sloppy" seems to fix the issue. I will do more tests this week before
declaring victory :)

Thank you Chris.
Re: pflow on PE router
Denis Fondras [open...@ledeuns.net] wrote:
> Hello,
>
> I used OpenBSD as a PE router on my network. The router is connected to an
> IX, a transit and multiple peers with OpenBGPd.
>
> Earlier this week, I enabled pflow(4) to track traffic usage.
> Unfortunately enabling pf(4) on an edge router does not seem like a good idea.
> Some peers called in to say they noticed multiple problems (ranging from what
> seems to be an MTU problem to cuts in lengthy TCP sessions). Deactivating pf(4)
> instantly fixed the problem on their side; after reactivating pf(4), the
> problems came back.
>
> I tried to push up the state table (I reached 300k states), to no avail.
>
> Do you know what the "right settings" are to have pflow(4) enabled on a PE
> router?

Pflow requires pf to be enabled to create states, otherwise there is nothing
to export. You could use a different flow generator tool (there is at least
one in ports) that will watch the traffic over bpf and generate flow data.

You might try "set state-defaults pflow, sloppy", also in some scenarios you
might need "set state-policy floating"

If "sloppy" fixes it, there may be some bugs to hunt.
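Put together, the settings suggested above might look like this in pf.conf. This is a minimal, untested sketch that assumes a pass-everything ruleset; a pflow(4) interface with a configured sender and collector address is also required, and that part is omitted here.

```
# Export every state through pflow(4) and use the sloppy TCP tracker,
# which does not check sequence numbers.
set state-defaults pflow, sloppy
# Only if a given scenario needs it: let states match on any interface.
set state-policy floating
pass
```

If "sloppy" makes the symptoms disappear, that points at TCP sequence tracking rather than at pflow export itself, which is why it is worth testing the two options separately.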
Re: pflow on PE router
Here is some more info:

>- does running pf(4) without pflow(4) cause issue?

Yes, the issue is linked to pf(4) being enabled.

>- can you confirm you were running with pf(4) disabled prior to enabling
> pflow(4)?

I do confirm. I never enable pf(4) on edge routers, it bit me in the past with
asymmetric routing :)

>- are you able to provide or indicate your pf.conf?

--- /etc/pf.conf ---
set state-defaults pflow
set limit states 100
pass
--- /etc/pf.conf ---

>- how many pf(4) states are you seeing in # pfctl -s info ? what is the
> removal rate?

Depending on the period of the day, it ranges from 300 to 30. The removal
rate was 112761228.5/s when I disabled pf(4) again.

>- was traffic to the pflow sink machine transiting MPLS?

No, there is no MPLS involved at all. (I guess PE was not the right word, but
edge router might have triggered Ubiquiti fans...)

>- can you provide a dmesg

I upgraded this morning, problem is still the same:

OpenBSD 6.9-current (GENERIC.MP) #20: Sun May 16 00:32:45 MDT 2021
    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 34228760576 (32643MB)
avail mem = 33175949312 (31639MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xdab19000 (51 entries)
bios0: vendor American Megatrends Inc. version "1.0c" date 06/30/2020
bios0: Supermicro AS -5019D-FTN4
acpi0 at bios0: ACPI 6.1
acpi0: sleep states S0 S5
acpi0: tables DSDT FACP APIC FPDT FIDT SSDT SPMI SSDT MCFG SSDT CRAT CDIT BERT EINJ HEST HPET SSDT UEFI SSDT WSMT
acpi0: wakeup devices S0D0(S3) S0D1(S3) S0D2(S3) S0D3(S3) S1D0(S3) S1D1(S3) S1D2(S3) S1D3(S3)
acpitimer0 at acpi0: 3579545 Hz, 32 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD EPYC 3251 8-Core Processor, 2500.55 MHz, 17-01-02
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA,IBPB,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu0: 64KB 64b/line 4-way I-cache, 32KB 64b/line 8-way D-cache, 512KB 64b/line 8-way L2 cache
cpu0: ITLB 64 4KB entries fully associative, 64 4MB entries fully associative
cpu0: DTLB 64 4KB entries fully associative, 64 4MB entries fully associative
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=1.1, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: AMD EPYC 3251 8-Core Processor, 2500.01 MHz, 17-01-02
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA,IBPB,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu1: 64KB 64b/line 4-way I-cache, 32KB 64b/line 8-way D-cache, 512KB 64b/line 8-way L2 cache
cpu1: ITLB 64 4KB entries fully associative, 64 4MB entries fully associative
cpu1: DTLB 64 4KB entries fully associative, 64 4MB entries fully associative
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 4 (application processor)
cpu2: AMD EPYC 3251 8-Core Processor, 2500.01 MHz, 17-01-02
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA,IBPB,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu2: 64KB 64b/line 4-way I-cache, 32KB 64b/line 8-way D-cache, 512KB 64b/line 8-way L2 cache
cpu2: ITLB 64 4KB entries fully associative, 64 4MB entries fully associative
cpu2: DTLB 64 4KB entries fully associative, 64 4MB entries fully associative
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 6 (application processor)
cpu3: AMD EPYC 3251 8-Core Processor, 2500.01 MHz, 17-01-02
cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA,IBPB,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu3: 64KB 64b/line 4-way I-cache,
pflow on PE router
Hello,

I used OpenBSD as a PE router on my network. The router is connected to an
IX, a transit and multiple peers with OpenBGPd.

Earlier this week, I enabled pflow(4) to track traffic usage.
Unfortunately enabling pf(4) on an edge router does not seem like a good idea.
Some peers called in to say they noticed multiple problems (ranging from what
seems to be an MTU problem to cuts in lengthy TCP sessions). Deactivating pf(4)
instantly fixed the problem on their side; after reactivating pf(4), the
problems came back.

I tried to push up the state table (I reached 300k states), to no avail.

Do you know what the "right settings" are to have pflow(4) enabled on a PE
router?

Thank you in advance,
Denis