Re: interesting troubleshooting

2020-03-24 Thread Brandon Martin
On 3/20/20 5:57 PM, Jared Mauch wrote: > It’s the protocol 50 IPSEC VPNs. They are very sensitive to path changes and > reordering as well. Is there a reason these are so sensitive to re-ordering or path changes? ESP should just encap whatever is underneath it on a packet-by-packet basis and

RE: interesting troubleshooting

2020-03-23 Thread adamv0025
> Saku Ytti > Sent: Saturday, March 21, 2020 4:26 PM > > On Sat, 21 Mar 2020 at 18:19, Mark Tinka wrote: > > > So the three or four times we tried to get FAT going (in a > > multi-vendor network), it simply didn't work. > > Yeah we run it in a multivendor network (JNPR, CSCO, NOK), works. > >

Re: interesting troubleshooting

2020-03-22 Thread Mark Tinka
On 22/Mar/20 19:17, Saku Ytti wrote: > You don't need both. My rule of thumb, green field, go with entropy > and get all the services in one go. Brown field, go FAT, and target > just PW, ensure you also have CW, then let transit LSR balance > MPLS-IP. With entropy label you can entirely

Re: interesting troubleshooting

2020-03-22 Thread Saku Ytti
On Sun, 22 Mar 2020 at 16:25, Mark Tinka wrote: > So the latter. We used both FAT + entropy to provide even load balancing > of l2vpn payloads in the edge and core, with little success. You don't need both. My rule of thumb, green field, go with entropy and get all the services in one go. Brown

Re: interesting troubleshooting

2020-03-22 Thread Mark Tinka
On 22/Mar/20 11:52, Saku Ytti wrote: > So you're not even talking about multivendor, as both ends are JNPR? > Or are you confusing entropy label with FAT? Some cases were MX480 to ASR920, but most were MX480 to MX480, either transiting CRS. > > Transit doesn't know anything about FAT, FAT

Re: interesting troubleshooting

2020-03-22 Thread Saku Ytti
On Sun, 22 Mar 2020 at 09:41, Mark Tinka wrote: > We weren't as successful (MX480 ingress/egress devices transiting a CRS > core). So you're not even talking about multivendor, as both ends are JNPR? Or are you confusing entropy label with FAT? Transit doesn't know anything about FAT, FAT is

Re: interesting troubleshooting

2020-03-22 Thread Saku Ytti
Hey Tassos, On Sat, 21 Mar 2020 at 22:51, Tassos Chatzithomaoglou wrote: > Yep, the RFC gives this option. > Does Juniper MX/ACX series support it? > I know for sure Cisco doesn't. I only run bidir, which Cisco do you mean? ASR9k allows you to configure it. both Insert/Discard Flow

Re: interesting troubleshooting

2020-03-22 Thread Matthew Petach
On Sat, Mar 21, 2020 at 12:53 AM Saku Ytti wrote: > Hey Matthew, > > > There are *several* caveats to doing dynamic monitoring and remapping of > > flows; one of the biggest challenges is that it puts extra demands on the > > line cards tracking the flows, especially as the number of flows rises

Re: interesting troubleshooting

2020-03-22 Thread Mark Tinka
On 22/Mar/20 10:08, Adam Atkinson wrote: > > I don't know how well-known this is, and it may not be something many > people would want to do, but Enterasys switches, now part of Extreme's > portfolio, allow "round-robin" as a load-sharing algorithm on LAGs. > > see e.g. > >

Re: interesting troubleshooting

2020-03-22 Thread Adam Atkinson
On 20/03/2020 21:33, Nimrod Levy wrote: I was contacted by my NOC to investigate a LAG that was not distributing traffic evenly among the members to the point where one member was congested while the utilization on the LAG was reasonably low. I don't know how well-known this is, and it may

Re: interesting troubleshooting

2020-03-22 Thread Mark Tinka
On 21/Mar/20 18:25, Saku Ytti wrote: > Yeah we run it in a multivendor network (JNPR, CSCO, NOK), works. > > I would also recommend people exclusively using CW+FAT and disabling > LSR payload heuristics (JNPR default, but by default won't do with CW, > can do with CW too). We weren't as

Re: interesting troubleshooting

2020-03-21 Thread Tassos Chatzithomaoglou
Saku Ytti wrote on 21/3/20 19:04: > On Sat, 21 Mar 2020 at 18:55, Tassos Chatzithomaoglou > wrote: > >> I still don't understand why the vendors cannot make it work in one >> direction only (the low-end platform would only need to remove an extra >> label, no need to inspect traffic). >> That

Re: interesting troubleshooting

2020-03-21 Thread Christopher Morrow
(skipping up the thread some) On Fri, Mar 20, 2020 at 5:58 PM Jared Mauch wrote: > It’s the protocol 50 IPSEC VPNs. They are very sensitive to path changes and > reordering as well. > > If you’re tunneling more than 5 or 10Gb/s of IPSEC it’s likely going to be a > bad day when you find a low

Re: interesting troubleshooting

2020-03-21 Thread Saku Ytti
On Sat, 21 Mar 2020 at 18:55, Tassos Chatzithomaoglou wrote: > I still don't understand why the vendors cannot make it work in one direction > only (the low-end platform would only need to remove an extra label, no need > to inspect traffic). > That would help us a lot, since the majority of

Re: interesting troubleshooting

2020-03-21 Thread Tassos Chatzithomaoglou
Mark Tinka wrote on 21/3/20 18:15: > So the three or four times we tried to get FAT going (in a multi-vendor > network), it simply didn't work. > > Have you (or anyone else) had any luck with it, in practice? > > Mark. > Only between Cisco boxes. I still don't understand why the vendors cannot

Re: interesting troubleshooting

2020-03-21 Thread Saku Ytti
On Sat, 21 Mar 2020 at 18:19, Mark Tinka wrote: > So the three or four times we tried to get FAT going (in a multi-vendor > network), it simply didn't work. Yeah we run it in a multivendor network (JNPR, CSCO, NOK), works. I would also recommend people exclusively using CW+FAT and disabling

Re: interesting troubleshooting

2020-03-21 Thread Mark Tinka
On 21/Mar/20 09:58, Saku Ytti wrote: > No. > > FAT adds additional MPLS label for entropy, ingressPE calculates flow > hash, based on traditional flow keys and injects that flow number as > MPLS label, so transit LSR can use MPLS labels for balancing, without > being able to parse the frame.
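
A minimal sketch of the mechanism described in the quoted text above, not any vendor's implementation. The ingress PE hashes the flow keys into an extra MPLS label, and a transit LSR picks a LAG member from the label stack alone. The hash function (SHA-256), the label values, and the names flow_label, pick_member and member_for_flow are all assumptions made for illustration; real hardware uses vendor-specific hashes.

```python
import hashlib
from collections import Counter

# Hypothetical label values, chosen only for this illustration.
TRANSPORT_LABEL = 24005
PW_LABEL = 300001


def flow_label(src_ip, dst_ip, proto, src_port, dst_port):
    """Ingress PE: hash the traditional flow keys into a 20-bit label value."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:4], "big")
    # Label values 0-15 are reserved, so map into the range 16..2**20-1.
    return 16 + digest % (2**20 - 16)


def pick_member(label_stack, n_members):
    """Transit LSR: choose a LAG member from the label stack only.

    The payload is never parsed here; the flow label supplies the entropy.
    """
    key = ",".join(str(label) for label in label_stack).encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:4], "big")
    return digest % n_members


def member_for_flow(flow, n_members=4):
    """Build the label stack for one flow and return the chosen member."""
    stack = [TRANSPORT_LABEL, PW_LABEL, flow_label(*flow)]
    return pick_member(stack, n_members)


if __name__ == "__main__":
    # Many distinct flows inside one pseudowire spread across the members.
    flows = [("10.0.0.1", "10.0.1.1", 6, 1024 + i, 443) for i in range(1000)]
    print(Counter(member_for_flow(f) for f in flows))

    # One large flow (for example a single ESP tunnel) always gets the same
    # flow label, so every packet of it lands on the same member.
    elephant = ("192.0.2.1", "198.51.100.1", 50, 0, 0)
    print({member_for_flow(elephant) for _ in range(5)})
```

In this sketch a single large flow still maps to one member, since all of its packets produce the same flow label. That is consistent with the "No." above: the extra label helps spread many flows, not split one.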

Re: interesting troubleshooting

2020-03-21 Thread Saku Ytti
On Sat, 21 Mar 2020 at 04:20, Steve Meuse wrote: > Was that large flow in a single LSP? Is this something that FAT lsp would > fix? No. FAT adds additional MPLS label for entropy, ingressPE calculates flow hash, based on traditional flow keys and injects that flow number as MPLS label, so

Re: interesting troubleshooting

2020-03-21 Thread Saku Ytti
Hey Matthew, > There are *several* caveats to doing dynamic monitoring and remapping of > flows; one of the biggest challenges is that it puts extra demands on the > line cards tracking the flows, especially as the number of flows rises to > large values. I recommend reading >

Re: interesting troubleshooting

2020-03-20 Thread William Herrin
On Fri, Mar 20, 2020 at 3:07 PM Job Snijders wrote: > Do we know which specific VPN technologies specifically are harder to > hash in a meaningful way for load balancing purposes, than others? I would expect it to be true of any site to site VPN data flow. The whole idea is for the guy in the

Re: interesting troubleshooting

2020-03-20 Thread Steve Meuse
Was that large flow in a single LSP? Is this something that FAT lsp would fix? -Steve On Fri, Mar 20, 2020 at 5:33 PM Nimrod Levy wrote: > I just ran into an issue that I thought was worth sharing with the NANOG > community. With recently increased visibility on keeping the Internet >

Re: interesting troubleshooting

2020-03-20 Thread Job Snijders
On Fri, Mar 20, 2020 at 05:57:19PM -0400, Jared Mauch wrote: > You also need to watch out to ensure you’re not on some L2VPN type > product that bumps up against a barrier. I know it’s a stressful time > for many networks and systems people as traffic shifts. A few years ago we did a

Re: interesting troubleshooting

2020-03-20 Thread Matthew Petach
On Fri, Mar 20, 2020 at 3:09 PM Saku Ytti wrote: > Hey Nimrod, > > > I was contacted by my NOC to investigate a LAG that was not distributing > traffic evenly among the members to the point where one member was > congested while the utilization on the LAG was reasonably low. Looking at > my

Re: interesting troubleshooting

2020-03-20 Thread Chris Adams
Once upon a time, Nimrod Levy said: > With the increase in remote workers and VPN traffic that won't hash across > multiple paths, I thought this anecdote might help someone else track down > a problem that might not be so obvious. Last week I ran into an issue where traffic between my home and

Re: interesting troubleshooting

2020-03-20 Thread Saku Ytti
Hey Nimrod, > I was contacted by my NOC to investigate a LAG that was not distributing > traffic evenly among the members to the point where one member was congested > while the utilization on the LAG was reasonably low. Looking at my netflow > data, I was able to confirm that this was caused

Re: interesting troubleshooting

2020-03-20 Thread Jared Mauch
> On Mar 20, 2020, at 5:50 PM, Job Snijders wrote: > > On Fri, Mar 20, 2020 at 05:33:31PM -0400, Nimrod Levy wrote: >> With the increase in remote workers and VPN traffic that won't hash across >> multiple paths, I thought this anecdote might help someone else track down >> a problem that

Re: interesting troubleshooting

2020-03-20 Thread Job Snijders
On Fri, Mar 20, 2020 at 05:33:31PM -0400, Nimrod Levy wrote: > With the increase in remote workers and VPN traffic that won't hash across > multiple paths, I thought this anecdote might help someone else track down > a problem that might not be so obvious. Do we know which specific VPN

interesting troubleshooting

2020-03-20 Thread Nimrod Levy
I just ran into an issue that I thought was worth sharing with the NANOG community. With recently increased visibility on keeping the Internet running smoothly, I thought that sharing this small experience could benefit everyone. I was contacted by my NOC to investigate a LAG that was not