Re: interesting troubleshooting

2020-03-24 Thread Brandon Martin
On 3/20/20 5:57 PM, Jared Mauch wrote:
> It’s the protocol 50 IPSEC VPNs.  They are very sensitive to path changes and 
> reordering as well.

Is there a reason these are so sensitive to re-ordering or path changes?  ESP 
should just encap whatever is underneath it on a packet-by-packet basis and be 
relatively stateless on its own unless folks are super strictly enforcing 
sequence numbering (maybe this is common?).  I can understand that some of the 
underlying protocols in use, especially LAN protocols like SMB/CIFS, might not 
really like re-ordering or public-Internet-like jitter and delay changes, but 
that's going to be the case with any transparent VPN and is one of SMB/CIFS's 
many flaws.

For LAGs where both endpoints are on the same gear (either the same box/chassis 
or a multi-chassis virtual setup where both planes are geographically local) 
and all links traverse the same path i.e. the LAG is purely for capacity, I've 
always wondered why round-robin isn't more common.  That will re-order by at 
worst the number of links in the LAG, and if the links are much faster and well 
utilized compared to the sub-flows, I'd expect the re-ordering to be minimal 
even then though I haven't done the math to show it and might be wrong.
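That bound is easy to check with a toy model. A sketch (my own, not from the thread) that sprays a packet sequence round-robin over equal-speed member links and measures the worst-case displacement in the merged egress order, assuming each link drains one packet per service round:

```python
from collections import deque

def round_robin_reorder(num_packets, num_links):
    """Spray packets round-robin over equal-speed links and measure
    worst-case displacement in the merged egress order."""
    links = [deque() for _ in range(num_links)]
    for seq in range(num_packets):
        links[seq % num_links].append(seq)
    egress = []
    # Equal-speed links: each service round drains one packet per link,
    # but link order within a round is arbitrary; pick the worst case
    # (reverse order) to get the upper bound.
    while any(links):
        for link in reversed(links):
            if link:
                egress.append(link.popleft())
    # Displacement: how far each packet lands from its original position.
    return max(abs(pos - seq) for pos, seq in enumerate(egress))

print(round_robin_reorder(1000, 4))  # worst case is num_links - 1, here 3
```

Under this (idealized) model the displacement never exceeds the member count minus one, matching the claim above; real links with unequal occupancy would do somewhat worse.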

I'd argue that any remote access VPN product that can't handle minor packet 
re-ordering is sufficiently flawed as to be useless.  Systems designed for very 
controlled deployment on a long-term point-to-point basis are perhaps excepted, 
here.
-- 
Brandon Martin


RE: interesting troubleshooting

2020-03-23 Thread adamv0025
> Saku Ytti
> Sent: Saturday, March 21, 2020 4:26 PM
> 
> On Sat, 21 Mar 2020 at 18:19, Mark Tinka  wrote:
> 
> > So the three or four times we tried to get FAT going (in a
> > multi-vendor network), it simply didn't work.
> 
> Yeah we run it in a multivendor network (JNPR, CSCO, NOK), works.
> 
> I would also recommend people exclusively using CW+FAT and disabling LSR
> payload heuristics (JNPR default, but by default won't do with CW, can do
> with CW too).
> 
And I'd add entropy labels too - for L3VPN traffic.
Using all this, you know where to look (at the PE edge) for any 
hashing-related problems.

adam



Re: interesting troubleshooting

2020-03-22 Thread Mark Tinka



On 22/Mar/20 19:17, Saku Ytti wrote:

> You don't need both. My rule of thumb, green field, go with entropy
> and get all the services in one go. Brown field, go FAT, and target
> just PW, ensure you also have CW, then let transit LSR balance
> MPLS-IP. With entropy label you can entirely disable transit LSR
> payload heuristics.

We moved to our current strategy back in 2015/2016, after running
through multiple combinations of FAT and entropy.

I'm curious to give it another go in 2020, but if I'm honest, I'm
pleased with the simplicity of our current setup.

Mark.


Re: interesting troubleshooting

2020-03-22 Thread Saku Ytti
On Sun, 22 Mar 2020 at 16:25, Mark Tinka  wrote:

> So the latter. We used both FAT + entropy to provide even load balancing
> of l2vpn payloads in the edge and core, with little success.

You don't need both. My rule of thumb, green field, go with entropy
and get all the services in one go. Brown field, go FAT, and target
just PW, ensure you also have CW, then let transit LSR balance
MPLS-IP. With entropy label you can entirely disable transit LSR
payload heuristics.

-- 
  ++ytti


Re: interesting troubleshooting

2020-03-22 Thread Mark Tinka



On 22/Mar/20 11:52, Saku Ytti wrote:

> So you're not even talking about multivendor, as both ends are JNPR?
> Or are you confusing entropy label with FAT?

Some cases were MX480 to ASR920, but most were MX480 to MX480, either
transiting CRS.


>
> Transit doesn't know anything about FAT, FAT is PW specific and is
> only signalled between end-points. Entropy label applies to all
> services and is signalled to adjacent device. Transit just sees 1
> label longer label stack, with hope (not promise) that transit uses
> the additional label for hashing.

So the latter. We used both FAT + entropy to provide even load balancing
of l2vpn payloads in the edge and core, with little success.



> You really should be doing CW+FAT.

Yeah - just going back to basics with ECMP worked well, and I'd prefer
to use solutions that are as un-exotic as possible.


>  And looking your other email, dear
> god, don't do per-packet outside some unique application where you
> control the TCP stack :). Modern Windows, Linux, MacOS TCP stack
> considers out-of-order as packet loss, this is not inherent to TCP, if
> you can change TCP congestion control, you can make reordering
> entirely irrelevant to TCP. But in most cases of course we do not
> control TCP algo, so per-packet will not work one bit.

Like I said, that was 2014. We tested it for a couple of months, mucked
around as much as we could, and decided it wasn't worth the bother.


>
> Like OP, you should enable adaptive.

That's what I said we are doing since 2014, unless I wasn't clear.

Mark.



Re: interesting troubleshooting

2020-03-22 Thread Saku Ytti
On Sun, 22 Mar 2020 at 09:41, Mark Tinka  wrote:

> We weren't as successful (MX480 ingress/egress devices transiting a CRS
> core).

So you're not even talking about multivendor, as both ends are JNPR?
Or are you confusing entropy label with FAT?

Transit doesn't know anything about FAT, FAT is PW specific and is
only signalled between end-points. Entropy label applies to all
services and is signalled to adjacent device. Transit just sees 1
label longer label stack, with hope (not promise) that transit uses
the additional label for hashing.

> In the end, we updated our policy to avoid running LAG's in the
> backbone, and going ECMP instead. Even with l2vpn payloads, that spreads
> a lot more evenly.

You really should be doing CW+FAT. And looking your other email, dear
god, don't do per-packet outside some unique application where you
control the TCP stack :). Modern Windows, Linux, MacOS TCP stack
considers out-of-order as packet loss, this is not inherent to TCP, if
you can change TCP congestion control, you can make reordering
entirely irrelevant to TCP. But in most cases of course we do not
control TCP algo, so per-packet will not work one bit.

Like OP, you should enable adaptive. This thread is conflating a few
different balancing issues, so I'll take the opportunity to classify
them.

1. Bad hashing implementation
1.1 Insufficient amount of hash-results
Think say 6500/7600, what if you only have 8 hash-results and
7 interfaces? You will inherently have 2x more traffic on one
interface
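The 8-results-over-7-interfaces case can be shown in a couple of lines (a hypothetical modulo mapping, which is how such small result tables are typically spread):

```python
# 8 hash-result buckets spread over 7 member links via modulo:
# one link is hit by two buckets and carries ~2x its fair share.
buckets = 8
links = 7
load = [0] * links
for bucket in range(buckets):
    load[bucket % links] += 1
print(load)  # [2, 1, 1, 1, 1, 1, 1] -> link 0 carries twice the share
```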
1.2 Bad algorithm
Different hashes have different use-cases, and we often reach for
a golden hammer (much as we tend to use bad hashes for password
hashing, like SHA: SHA's goal is to be fast in HW, which is the
opposite of the goal of a password hash, which you want to be
slow). Equally, since day 1 of Ethernet silicon we've had CRC in
the silicon, and it has since been grandfathered in as the
load-balancing hash. But CRC goals are completely different from
hash-algo goals: CRC does not try to, and does not need to, have
good diffusion quality, while a load-balancing hash needs perfect
diffusion and nothing else matters. CRC has terrible diffusion
quality; instead of implementing a specific good-diffusion hash in
silicon, vendors do stuff like rot(crcN(x), crcM(x)), which greatly
improves diffusion but is still very bad compared to hash algos
designed for perfect diffusion. Poor diffusion means you get
different flow counts across egressInts. As I can't do math, I did
a monte-carlo simulation to see what type of bias we should expect
even with _perfect_ diffusion:

- Here we have 3 egressInts and we run monte carlo until we stop
getting worse bias (of course if we wait for the heat death of the
universe, we will eventually see every flow on a single egressInt,
even with perfect diffusion). But in a normal situation, if you see
worse bias than this, you should blame the poor diffusion quality of
the vendor's algo; if you see this bias or lower, it's probably not
diffusion you should blame:

Flows | MaxBias | Example Flow Count per Int
1k    | 6.9%    | 395, 341, 264
10k   | 2.2%    | 3490, 3396, 3114
100k  | 0.6%    | 33655, 32702, 33643
1M    | 0.2%    | 334969, 332424, 332607
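A rough reproduction of that monte carlo (my sketch, not the original code), using uniform random assignment as a stand-in for a perfect-diffusion hash; exact numbers vary run to run but land in the same ballpark as the table:

```python
import random

def max_bias(num_flows, num_links, trials=30, seed=1):
    """Assign flows to links uniformly at random (stand-in for a
    perfect-diffusion hash) and report the worst deviation from a
    perfectly even split, across several trials."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        counts = [0] * num_links
        for _ in range(num_flows):
            counts[rng.randrange(num_links)] += 1
        fair = num_flows / num_links
        bias = max(abs(c - fair) / num_flows for c in counts)
        worst = max(worst, bias)
    return worst

for flows in (1_000, 10_000, 100_000):
    print(flows, round(100 * max_bias(flows, 3), 2), "%")
```

The bias shrinks roughly with the square root of the flow count, which is why the 1M-flow row in the table is so much flatter than the 1k row.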


2. Elephant flows
Even if we assume perfect diffusion, so each egressInt gets
exactly the same number of flows, the flows may still be wildly
different in bps, and there is nothing we can do by tuning the hash
algo to fix this. The prudent fix here is to have a mapping table
between hash-result and egressInt so that we can inject bias: not a
fair distribution between hash-results and egressInts, but fewer
hash-results pointing to the congested egressInt. This is easy and
~free to implement in HW. JNPR does it, and NOK is happy to implement
it should a customer want it. This of course also fixes bad
algorithmic diffusion, so it's a really great tool to have in your
toolbox, and I think everyone should be running this feature.
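The mapping-table idea can be sketched in software (hypothetical sizes: 256 hash-results over 4 members; real hardware tables differ per platform):

```python
# hash-result -> egress interface mapping table, starting fair:
# 256 hash-results spread evenly over 4 members.
table = [r % 4 for r in range(256)]

def remap_away(table, congested, relief):
    """Move one hash-result bucket off the congested member onto a
    lightly loaded one; ~free in hardware, no per-flow state needed."""
    idx = table.index(congested)
    table[idx] = relief
    return table

remap_away(table, congested=0, relief=2)
shares = [table.count(i) for i in range(4)]
print(shares)  # member 0 now owns fewer hash-results than member 2
```

Repeating the remap until the elephant-carrying member's measured load drops is exactly the "inject bias" loop described above; only the table changes, never any flow state.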


3. Incorrect key recovery
   Balancing is a promise that we know which keys identify a flow. In
the common case this is a simple problem, but there is a lot of
complexity, particularly in MPLS transit. The naive/simple problem
everyone knows about is a pseudowire flow in transit being parsed as
an IPv4/IPv6 flow when the DMAC starts with 4 or 6. Some vendors
(JNPR, Huawei) do additional checks, like perhaps IP checksum or IP
packet length, but this actually makes the situation worse: the
problem triggers far less often, but when it does trigger, it will be
so much more exotic, as now you have an underlying frame where by
luck the IP packet length is also supposedly correct. So you can end
up in weird situations where an end-customer's network works
perfectly, then they implement IPSEC from all hosts to a
concentrator, still riding over your backbone, and suddenly one
customer host stops working after enabling IPSEC while everything
else works. The chances that this trouble-ticket ever ends up on your
table are low, and the possibility that based on the problem
description you'd blame the 

Re: interesting troubleshooting

2020-03-22 Thread Saku Ytti
Hey Tassos,

On Sat, 21 Mar 2020 at 22:51, Tassos Chatzithomaoglou
 wrote:

> Yep, the RFC gives this option.
> Does Juniper MX/ACX series support it?
> I know for sure Cisco doesn't.

I only run bidir. Which Cisco do you mean? ASR9k allows you to configure it.

  both  Insert/Discard Flow label on transmit/receive
  code  Flow label TLV code
  receive   Discard Flow label on receive
  transmit  Insert Flow label on transmit

JunOS as well:

  flow-label-receive   Advertise capability to pop Flow Label in
receive direction to remote PE
  flow-label-receive-static  Pop Flow Label from PW packets received
from remote PE
  flow-label-transmit  Advertise capability to push Flow Label in
transmit direction to remote PE
  flow-label-transmit-static  Push Flow Label on PW packets sent to remote PE


RP/0/RP0/CPU0:r14.labxtx01.us.(config-l2vpn-pwc-mpls)#do show l2vpn
xconnect interface Te0/2/0/3/7.1000 detail
..

  PW: neighbor 204.42.110.29, PW ID 1290, state is up ( established )
PW class ethernet-ccc, XC ID 0xa025
Encapsulation MPLS, protocol LDP
Source address 204.42.110.15
PW type Ethernet, control word disabled, interworking none
PW backup disable delay 0 sec
Sequencing not set
LSP : Up
Load Balance Hashing: src-dst-ip
Flow Label flags configured (Tx=1,Rx=0), negotiated (Tx=1,Rx=0)



y...@r28.labxtx01.us.bb# run show l2circuit connections interface et-0/0/54:3.0
...
Neighbor: 204.42.110.15
Interface Type  St Time last up  # Up trans
et-0/0/54:3.0(vc 1290)rmt   Up Mar 20 04:06:45 2020   7
  Remote PE: 204.42.110.15, Negotiated control-word: No
  Incoming label: 585, Outgoing label: 24003
  Negotiated PW status TLV: No
  Local interface: et-0/0/54:3.0, Status: Up, Encapsulation: ETHERNET
Description: BD: wmccall ixia 1-1
  Flow Label Transmit: No, Flow Label Receive: Yes
...


I didn't push bits, but at least I can signal unidir between ASR9k and PTX1k.


-- 
  ++ytti


Re: interesting troubleshooting

2020-03-22 Thread Matthew Petach
On Sat, Mar 21, 2020 at 12:53 AM Saku Ytti  wrote:

> Hey Matthew,
>
> > There are *several* caveats to doing dynamic monitoring and remapping of
> > flows; one of the biggest challenges is that it puts extra demands on the
> > line cards tracking the flows, especially as the number of flows rises to
> > large values.  I recommend reading
> >
> https://www.juniper.net/documentation/en_US/junos/topics/topic-map/load-balancing-aggregated-ethernet-interfaces.html#id-understanding-aggregated-ethernet-load-balancing
> > before configuring it.
>
> You are confusing two features. Stateful and adaptive. I was proposing
> adaptive, which just remaps the table, which is free; it is not flow
> aware. The number of hash results is a small, bounded number; the
> number of flow states is a very large, unbounded number.
>

Ah, apologies--you are right, I scanned down the linked document too
quickly, thinking it was a single set of configuration notes.

Thanks for setting me straight on that.

Matt


>
> --
>   ++ytti
>
>


Re: interesting troubleshooting

2020-03-22 Thread Mark Tinka



On 22/Mar/20 10:08, Adam Atkinson wrote:

>
> I don't know how well-known this is, and it may not be something many
> people would want to do, but Enterasys switches, now part of Extreme's
> portfolio, allow "round-robin" as a load-sharing algorithm on LAGs.
>
> see e.g.
>
> https://gtacknowledge.extremenetworks.com/articles/How_To/How-to-configure-LACP-Output-Algorithm-as-Round-Robin
>
>
> This may not be the only product line supporting this.

So Junos does support both per-flow and per-packet load balancing on
LAG's on Trio line cards.

We tested this back in 2014 for a few months, and while the spread is
excellent (obviously), it creates a lot of out-of-order frame delivery
conditions, and all the pleasure & joy that goes along with that.

So we switched back to per-flow load balancing, and more recently, where
we run LAG's (802.1Q trunks between switches and an MX480 in the data
centre), we've gone 100Gbps so we don't have to deal with all this
anymore :-).

Mark.


Re: interesting troubleshooting

2020-03-22 Thread Adam Atkinson

On 20/03/2020 21:33, Nimrod Levy wrote:


I was contacted by my NOC to investigate a LAG that was not distributing
traffic evenly among the members to the point where one member was
congested while the utilization on the LAG was reasonably low.


I don't know how well-known this is, and it may not be something many 
people would want to do, but Enterasys switches, now part of Extreme's 
portfolio, allow "round-robin" as a load-sharing algorithm on LAGs.


see e.g.

https://gtacknowledge.extremenetworks.com/articles/How_To/How-to-configure-LACP-Output-Algorithm-as-Round-Robin

This may not be the only product line supporting this.



Re: interesting troubleshooting

2020-03-22 Thread Mark Tinka



On 21/Mar/20 18:25, Saku Ytti wrote:

> Yeah we run it in a multivendor network (JNPR, CSCO, NOK), works.
>
> I would also recommend people exclusively using CW+FAT and disabling
> LSR payload heuristics (JNPR default, but by default won't do with CW,
> can do with CW too).

We weren't as successful (MX480 ingress/egress devices transiting a CRS
core).

In the end, we updated our policy to avoid running LAG's in the
backbone, and going ECMP instead. Even with l2vpn payloads, that spreads
a lot more evenly.

Mark.


Re: interesting troubleshooting

2020-03-21 Thread Tassos Chatzithomaoglou


Saku Ytti wrote on 21/3/20 19:04:
> On Sat, 21 Mar 2020 at 18:55, Tassos Chatzithomaoglou
>  wrote:
>
>> I still don't understand why the vendors cannot make it work in one 
>> direction only (the low-end platform would only need to remove an extra 
>> label, no need to inspect traffic).
>> That would help us a lot, since the majority of our traffic is downstream to 
>> the customer.
> It is signalled separately for TX and RX and some vendors do allow you
> to signal it separately.
>
Yep, the RFC gives this option.
Does Juniper MX/ACX series support it?
I know for sure Cisco doesn't.

--
Tassos



Re: interesting troubleshooting

2020-03-21 Thread Christopher Morrow
(skipping up the thread some)

On Fri, Mar 20, 2020 at 5:58 PM Jared Mauch  wrote:
> It’s the protocol 50 IPSEC VPNs.  They are very sensitive to path changes and 
> reordering as well.
>
> If you’re tunneling more than 5 or 10Gb/s of IPSEC it’s likely going to be a 
> bad day when you find a low speed link in the middle.  Generally providers 
> with these types of flows have both sides on the same network vs going 
> off-net as they’re not stable on peering links that might change paths.

a bunch of times the advice given to folk in this situation is: "Add
more entropy", which really for ipsec/gre/etc vpns means more
endpoints.
For instance, adding 3 more IPs on either side for tunnel
egress/ingress will make the flows (ideally) smaller and more likely
to hash across different links in the intermediary network(s).  This
also moves the load balancing back behind the customer prem, so
ideally perhaps even the nxM flows are now balanced a little better as well.
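A toy illustration of that advice (hypothetical documentation-range addresses, with SHA-256 standing in for a transit router's 5-tuple hash):

```python
import hashlib

def link_for(src, dst, links=4):
    """Stand-in for a transit router's hash: ESP between one endpoint
    pair carries no port entropy, so it always lands on one member."""
    key = f"{src}-{dst}".encode()
    return int(hashlib.sha256(key).hexdigest(), 16) % links

# One tunnel endpoint pair: every ESP packet hashes to the same link.
single = {link_for("198.51.100.1", "203.0.113.1")}

# Hypothetical extra endpoint IPs on each side: more (src, dst) pairs,
# so the aggregate traffic can spread over more member links.
srcs = [f"198.51.100.{i}" for i in range(1, 5)]
dsts = [f"203.0.113.{i}" for i in range(1, 5)]
spread = {link_for(s, d) for s in srcs for d in dsts}
print(len(single), len(spread))
```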

sometimes this works, sometimes it's hard to accomplish :(


Re: interesting troubleshooting

2020-03-21 Thread Saku Ytti
On Sat, 21 Mar 2020 at 18:55, Tassos Chatzithomaoglou
 wrote:

> I still don't understand why the vendors cannot make it work in one direction 
> only (the low-end platform would only need to remove an extra label, no need 
> to inspect traffic).
> That would help us a lot, since the majority of our traffic is downstream to 
> the customer.

It is signalled separately for TX and RX and some vendors do allow you
to signal it separately.

-- 
  ++ytti


Re: interesting troubleshooting

2020-03-21 Thread Tassos Chatzithomaoglou


Mark Tinka wrote on 21/3/20 18:15:
> So the three or four times we tried to get FAT going (in a multi-vendor
> network), it simply didn't work.
>
> Have you (or anyone else) had any luck with it, in practice?
>
> Mark.
>

Only between Cisco boxes.

I still don't understand why the vendors cannot make it work in one direction 
only (the low-end platform would only need to remove an extra label, no need to 
inspect traffic).
That would help us a lot, since the majority of our traffic is downstream to 
the customer.


--
Tassos



Re: interesting troubleshooting

2020-03-21 Thread Saku Ytti
On Sat, 21 Mar 2020 at 18:19, Mark Tinka  wrote:

> So the three or four times we tried to get FAT going (in a multi-vendor
> network), it simply didn't work.

Yeah we run it in a multivendor network (JNPR, CSCO, NOK), works.

I would also recommend people exclusively using CW+FAT and disabling
LSR payload heuristics (JNPR default, but by default won't do with CW,
can do with CW too).

-- 
  ++ytti


Re: interesting troubleshooting

2020-03-21 Thread Mark Tinka



On 21/Mar/20 09:58, Saku Ytti wrote:


> No.
>
> FAT adds additional MPLS label for entropy, ingressPE calculates flow
> hash, based on traditional flow keys and injects that flow number as
> MPLS label, so transit LSR can use MPLS labels for balancing, without
> being able to parse the frame. Similarly VPN provider could do that,
> and inject that flow hash as SPORT at the time of tunneling, by
> looking at the inside packet. And any defensive VPN provider should do
> this, as it would be a competitive advantage.
> Now for some vendors, like Juniper and Nokia transit LSR can look
> inside pseudowire L3 packet for flow keys, so you don't even need FAT
> for this. Some other like ASR9k cannot, and you'll need FAT for it.
>
> But all of this requires that there is entropy to use, if it's truly
> just single fat flow, then you won't balance it. Then you have to
> create bias to the hashResult=>egressInt table, which by default is
> fair, each egressInt has same amount of hashResults, for elephant
> flows you want the congested egressInt to be mapped to fewer amount of
> hashResults.

So the three or four times we tried to get FAT going (in a multi-vendor
network), it simply didn't work.

Have you (or anyone else) had any luck with it, in practice?

Mark.


Re: interesting troubleshooting

2020-03-21 Thread Saku Ytti
On Sat, 21 Mar 2020 at 04:20, Steve Meuse  wrote:

> Was that large flow in a single LSP? Is this something that a FAT LSP would
> fix?

No.

FAT adds additional MPLS label for entropy, ingressPE calculates flow
hash, based on traditional flow keys and injects that flow number as
MPLS label, so transit LSR can use MPLS labels for balancing, without
being able to parse the frame. Similarly VPN provider could do that,
and inject that flow hash as SPORT at the time of tunneling, by
looking at the inside packet. And any defensive VPN provider should do
this, as it would be a competitive advantage.
Now for some vendors, like Juniper and Nokia transit LSR can look
inside pseudowire L3 packet for flow keys, so you don't even need FAT
for this. Some other like ASR9k cannot, and you'll need FAT for it.

But all of this requires that there is entropy to use, if it's truly
just single fat flow, then you won't balance it. Then you have to
create bias to the hashResult=>egressInt table, which by default is
fair, each egressInt has same amount of hashResults, for elephant
flows you want the congested egressInt to be mapped to fewer amount of
hashResults.
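The ingress-PE side of that can be sketched roughly as follows; SHA-256 and the key layout are stand-ins for illustration, not any vendor's actual algorithm:

```python
import hashlib

def flow_label(src_ip, dst_ip, proto, sport, dport):
    """Hash the traditional flow keys down to a 20-bit value to inject
    as an extra MPLS label, as the ingress PE does for FAT/entropy."""
    keys = f"{src_ip}|{dst_ip}|{proto}|{sport}|{dport}".encode()
    # A real implementation would also avoid reserved labels 0-15.
    return int(hashlib.sha256(keys).hexdigest(), 16) & 0xFFFFF

# Transit LSR never parses the payload; it just hashes the label
# stack, and the extra label carries the ingress PE's view of the flow.
stack = [24003, flow_label("10.0.0.1", "10.0.0.2", 6, 49152, 443)]
print(stack)
```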

-- 
  ++ytti


Re: interesting troubleshooting

2020-03-21 Thread Saku Ytti
Hey Matthew,

> There are *several* caveats to doing dynamic monitoring and remapping of
> flows; one of the biggest challenges is that it puts extra demands on the
> line cards tracking the flows, especially as the number of flows rises to
> large values.  I recommend reading
> https://www.juniper.net/documentation/en_US/junos/topics/topic-map/load-balancing-aggregated-ethernet-interfaces.html#id-understanding-aggregated-ethernet-load-balancing
> before configuring it.

You are confusing two features. Stateful and adaptive. I was proposing
adaptive, which just remaps the table, which is free; it is not flow
aware. The number of hash results is a small, bounded number; the
number of flow states is a very large, unbounded number.

-- 
  ++ytti


Re: interesting troubleshooting

2020-03-20 Thread William Herrin
On Fri, Mar 20, 2020 at 3:07 PM Job Snijders  wrote:
> Do we know which specific VPN technologies are harder to
> hash in a meaningful way for load balancing purposes than others?

I would expect it to be true of any site to site VPN data flow. The
whole idea is for the guy in the middle to be unable to deduce
anything about the flow. If the technology provides hints about which
packets match the same subflow, it isn't doing a very good job.

Regards,
Bill Herrin


-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: interesting troubleshooting

2020-03-20 Thread Steve Meuse
Was that large flow in a single LSP? Is this something that a FAT LSP would
fix?

-Steve


On Fri, Mar 20, 2020 at 5:33 PM Nimrod Levy  wrote:

> I just ran into an issue that I thought was worth sharing with the NANOG
> community. With recently increased visibility on keeping the Internet
> running smoothly, I thought that sharing this small experience could
> benefit everyone.
>
> I was contacted by my NOC to investigate a LAG that was not distributing
> traffic evenly among the members to the point where one member was
> congested while the utilization on the LAG was reasonably low. Looking at
> my netflow data, I was able to confirm that this was caused by a single
> large flow of ESP traffic. Fortunately, I was able to shift this flow to
> another path that had enough headroom available so that the flow could be
> accommodated on a single member link.
>
> With the increase in remote workers and VPN traffic that won't hash across
> multiple paths, I thought this anecdote might help someone else track down
> a problem that might not be so obvious.
>
> Please take this message in the spirit in which it was intended and
> refrain from the snarky "just upgrade your links" comments.
>
> --
> Nimrod
>


Re: interesting troubleshooting

2020-03-20 Thread Job Snijders
On Fri, Mar 20, 2020 at 05:57:19PM -0400, Jared Mauch wrote:
> You also need to watch out to ensure you’re not on some L2VPN type
> product that bumps up against a barrier.  I know it’s a stressful time
> for many networks and systems people as traffic shifts. 

A few years ago we did a presentation about what can happen if hashing
for load balancing purposes doesn't work well (be it either IP or L2VPN
traffic). I think some of the information is still relevant, as there
really isn't much difference between the problem existing in the
underlay network's implementation of algorithms and in the properties
of the envelope that encompasses the overlay network packet.

video of younger job + jeff: https://www.youtube.com/watch?v=cXSwoKu9zOg
slides: 
https://archive.nanog.org/meetings/nanog57/presentations/Tuesday/tues.general.SnijdersWheeler.MACaddresses.14.pdf

Kind regards,

Job


Re: interesting troubleshooting

2020-03-20 Thread Matthew Petach
On Fri, Mar 20, 2020 at 3:09 PM Saku Ytti  wrote:

> Hey Nimrod,
>
> > I was contacted by my NOC to investigate a LAG that was not distributing
> traffic evenly among the members to the point where one member was
> congested while the utilization on the LAG was reasonably low. Looking at
> my netflow data, I was able to confirm that this was caused by a single
> large flow of ESP traffic. Fortunately, I was able to shift this flow to
> another path that had enough headroom available so that the flow could be
> accommodated on a single member link.
> >
> > With the increase in remote workers and VPN traffic that won't hash
> across multiple paths, I thought this anecdote might help someone else
> track down a problem that might not be so obvious.
>
> This problem is called elephant flow. Some vendors have solution for
> this, by dynamically monitoring utilisation and remapping the
> hashResult => egressInt table to create bias to offset the elephant
> flow.
>
> One particular example:
>
> https://www.juniper.net/documentation/en_US/junos/topics/reference/configuration-statement/adaptive-edit-interfaces-aex-aggregated-ether-options-load-balance.html
>
> Ideally VPN providers would be defensive and would use SPORT for
> entropy, like MPLSoUDP does.
>
> --
>   ++ytti
>
>

There are *several* caveats to doing dynamic monitoring and remapping of
flows; one of the biggest challenges is that it puts extra demands on the
line cards tracking the flows, especially as the number of flows rises to
large values.  I recommend reading
https://www.juniper.net/documentation/en_US/junos/topics/topic-map/load-balancing-aggregated-ethernet-interfaces.html#id-understanding-aggregated-ethernet-load-balancing
before configuring it.

"Although the feature performance is high, it consumes significant amount
of line card memory. Approximately, 4000 logical interfaces or 16
aggregated Ethernet logical interfaces can have this feature enabled on
supported MPCs. However, when the Packet Forwarding Engine hardware memory
is low, depending upon the available memory, it falls back to the default
load balancing mechanism."

What is that old saying?

Oh, right--There Ain't No Such Thing As A Free Lunch.   ^_^;;

Matt


Re: interesting troubleshooting

2020-03-20 Thread Chris Adams
Once upon a time, Nimrod Levy  said:
> With the increase in remote workers and VPN traffic that won't hash across
> multiple paths, I thought this anecdote might help someone else track down
> a problem that might not be so obvious.

Last week I ran into an issue where traffic between my home and work
networks had high latency, but only to certain IPs (even different IPs
on the same server).  Since my work network peers with my home provider,
I was able to go to the provider's NOC, and they were very helpful (they
ended up turning up more bandwidth).  I expect this was also a case of
one LAG member being congested, and my problem IP pairs were hashing to
that member.

My traffic wasn't VPN (SSH, with ping/mtr for testing), but it is
possible that somebody else's was - I didn't get detailed with the other
NOC.
-- 
Chris Adams 


Re: interesting troubleshooting

2020-03-20 Thread Saku Ytti
Hey Nimrod,

> I was contacted by my NOC to investigate a LAG that was not distributing 
> traffic evenly among the members to the point where one member was congested 
> while the utilization on the LAG was reasonably low. Looking at my netflow 
> data, I was able to confirm that this was caused by a single large flow of 
> ESP traffic. Fortunately, I was able to shift this flow to another path that 
> had enough headroom available so that the flow could be accommodated on a 
> single member link.
>
> With the increase in remote workers and VPN traffic that won't hash across 
> multiple paths, I thought this anecdote might help someone else track down a 
> problem that might not be so obvious.

This problem is called elephant flow. Some vendors have solution for
this, by dynamically monitoring utilisation and remapping the
hashResult => egressInt table to create bias to offset the elephant
flow.

One particular example:
https://www.juniper.net/documentation/en_US/junos/topics/reference/configuration-statement/adaptive-edit-interfaces-aex-aggregated-ether-options-load-balance.html

Ideally VPN providers would be defensive and would use SPORT for
entropy, like MPLSoUDP does.
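A sketch of that SPORT trick (SHA-256 stands in for whatever hash a real implementation would use; the port range here is the ephemeral range, as MPLSoUDP uses):

```python
import hashlib

def entropy_sport(inner_src, inner_dst, inner_proto, low=49152, high=65535):
    """Derive the outer UDP source port from the inner flow keys, so
    transit routers hashing the plain outer 5-tuple still see per-flow
    entropy, without parsing the encrypted payload."""
    keys = f"{inner_src}|{inner_dst}|{inner_proto}".encode()
    h = int(hashlib.sha256(keys).hexdigest(), 16)
    return low + h % (high - low + 1)

# Two inner flows between the same tunnel endpoints now present two
# distinct outer 5-tuples to the transit network.
print(entropy_sport("10.1.1.1", "10.2.2.2", 6),
      entropy_sport("10.1.1.3", "10.2.2.4", 17))
```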

-- 
  ++ytti


Re: interesting troubleshooting

2020-03-20 Thread Jared Mauch



> On Mar 20, 2020, at 5:50 PM, Job Snijders  wrote:
> 
> On Fri, Mar 20, 2020 at 05:33:31PM -0400, Nimrod Levy wrote:
>> With the increase in remote workers and VPN traffic that won't hash across
>> multiple paths, I thought this anecdote might help someone else track down
>> a problem that might not be so obvious.
> 
> Do we know which specific VPN technologies are harder to
> hash in a meaningful way for load balancing purposes than others?
> 
> If the outcome of this troubleshooting is a list of recommendations
> about which VPN approaches to use, and which ones to avoid (because of
> the issue you described), that'll be a great outcome.
> 

It’s the protocol 50 IPSEC VPNs.  They are very sensitive to path changes and 
reordering as well.

If you’re tunneling more than 5 or 10Gb/s of IPSEC it’s likely going to be a 
bad day when you find a low speed link in the middle.  Generally providers with 
these types of flows have both sides on the same network vs going off-net as 
they’re not stable on peering links that might change paths.

You also need to watch out to ensure you’re not on some L2VPN type product that 
bumps up against a barrier.  I know it’s a stressful time for many networks and 
systems people as traffic shifts.  Good luck out there!

- Jared



Re: interesting troubleshooting

2020-03-20 Thread Job Snijders
On Fri, Mar 20, 2020 at 05:33:31PM -0400, Nimrod Levy wrote:
> With the increase in remote workers and VPN traffic that won't hash across
> multiple paths, I thought this anecdote might help someone else track down
> a problem that might not be so obvious.

Do we know which specific VPN technologies are harder to
hash in a meaningful way for load balancing purposes than others?

If the outcome of this troubleshooting is a list of recommendations
about which VPN approaches to use, and which ones to avoid (because of
the issue you described), that'll be a great outcome.

Kind regards,

Job


interesting troubleshooting

2020-03-20 Thread Nimrod Levy
I just ran into an issue that I thought was worth sharing with the NANOG
community. With recently increased visibility on keeping the Internet
running smoothly, I thought that sharing this small experience could
benefit everyone.

I was contacted by my NOC to investigate a LAG that was not distributing
traffic evenly among the members to the point where one member was
congested while the utilization on the LAG was reasonably low. Looking at
my netflow data, I was able to confirm that this was caused by a single
large flow of ESP traffic. Fortunately, I was able to shift this flow to
another path that had enough headroom available so that the flow could be
accommodated on a single member link.

With the increase in remote workers and VPN traffic that won't hash across
multiple paths, I thought this anecdote might help someone else track down
a problem that might not be so obvious.

Please take this message in the spirit in which it was intended and refrain
from the snarky "just upgrade your links" comments.

-- 
Nimrod