RE: Anyone from Level3/CenturyLink/Lumen, possibly Comcast around?

2021-10-15 Thread John Gawf
I have multiple clients with the same problem involving tunnels terminated on 
Lumen to terminations on Comcast.  The problems started on Monday morning, 
10/11/21, all in the Denver metro area.  I was able to change a tunnel from 
IPSec-ESP to IPSec-AH and the problem went away; changing back to ESP brought 
the loss back.  We are also seeing problems with IPSec remote VPN clients on 
Comcast networks terminating on firewalls with Lumen interfaces: they see 
drops and the tunnel is unusable.

We have opened tickets at Lumen, but they get closed because a traceroute shows 
no drops and they say the problem is with the "application".

John Gawf
john.g...@sajens.com
Saje Network Systems
3775 Iris Ave Suite 2C Boulder CO 80301
http://sajens.com
303.909.6247



Re: Anyone from Level3/CenturyLink/Lumen, possibly Comcast around?

2021-10-15 Thread Pennington, Scott
We have also seen the same behavior: intermittent customer complaints followed 
by the issue resolving spontaneously.  Our end customer has a tunnel to a 
supplier on Comcast with a return path via Lumen.  The first ticket was opened 
on 9/23, and the issue has cleared and returned a few times.  The end customer 
worked with Cisco TAC, and the symptom is up to 30% ESP packet drop for periods 
of 5 minutes to 1 hour when the problem is active.
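
For anyone who wants to put a number on the drop rate themselves, here is a minimal sketch that estimates loss from the ESP sequence numbers seen at the receiving end. It assumes a single SA, no sequence number wrap, and that the sequence numbers have already been extracted from a capture; the function name and sample data are hypothetical, not taken from the TAC case.

```python
def esp_loss_ratio(seqs):
    """Estimate loss from ESP sequence numbers observed at the receiver.

    Assumes one SA and no sequence-number wrap. Duplicates and reordering
    are tolerated because only the set of distinct numbers is counted.
    """
    unique = set(seqs)
    if not unique:
        return 0.0
    expected = max(unique) - min(unique) + 1  # packets the sender emitted in this window
    return 1.0 - len(unique) / expected


if __name__ == "__main__":
    # Hypothetical sample: 7 of 10 packets arrived.
    print(round(esp_loss_ratio([1, 2, 4, 5, 7, 8, 10]), 2))
```

Packets lost before the first or after the last observed sequence number are not counted, so this is a lower bound on loss within the capture window.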



RE: Anyone from Level3/CenturyLink/Lumen, possibly Comcast around?

2021-10-14 Thread Mike Lewinski via NANOG
I can confirm this same IPsec issue exists at several sites in the Denver 
area, all routing between Level3/Lumen and Comcast.

I was told by one customer that it resolved late yesterday afternoon but I 
haven't been able to confirm that.


Mike

-Original Message-
From: NANOG On Behalf Of Brie
Sent: Thursday, October 14, 2021 10:43 AM
To: nanog@nanog.org
Subject: Anyone from Level3/CenturyLink/Lumen, possibly Comcast around?

Hi all,

So, having a...  frustrating issue going on.  Long wall of text ahead as I 
explain.

1 x CenturyLink/Lumen fiber in Boise
1 x CenturyLink/Lumen fiber in Cheyenne
1 x Comcast biz fiber in Denver

IPsec VPN tunnels between all three sites, w/ OSPF for routing failover (which 
unfortunately doesn't help in this situation).

Two days ago, Cheyenne to Denver (.196) traffic (both TCP and UDP) was the 
initial issue.  Failed over to routing the Cheyenne VPN through Boise while we 
opened a ticket with CL.

Yesterday, Boise to Denver (.196) traffic started having exact same issue.

Tests from another CL fiber in Boise (my own circuit, with legacy IP space and 
BGP) to Denver (.196) did not show same issues.  Path appeared clean.

Traceroutes from Office Boise to Denver (.196) had a noticeable difference from 
Personal Boise to Denver (.196):

Office Boise -> Denver (.196)
--
3: sea-edge-15.inet.qwest.net
4: lag-4.ear3.Seattle1.Level3.net
5: ae-2-52.ear2.seattle1.level3.net   <--  This hop
6: be-203-pe01.seattle.wa.ibone.comcast.net


Personal Boise -> Denver (.196)
--
4: sea-edge-15.inet.qwest.net
5: lag-25.ear2.Seattle1.Level3.net
6: be-203-pe01.seattle.wa.ibone.comcast.net

On a whim, tracerouted to another Denver router IP address (.199, IP alias on 
the same interface) from Boise Office, and the traceroute matched the one from 
Personal Boise to Denver (.196).

No packet loss.
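
The path comparison above can be done mechanically. This is a minimal sketch that takes two hop lists and reports the hops unique to each; the hop names are the ones from the traceroutes above, while the helper names are made up for illustration. Hop numbers are ignored on purpose, since the two circuits differ by one hop before Seattle.

```python
def normalize(hop):
    """Lower-case and trim a hop name so Level3.net and level3.net compare equal."""
    return hop.strip().lower()


def path_difference(path_a, path_b):
    """Return (hops only in path_a, hops only in path_b), order preserved."""
    a = [normalize(h) for h in path_a]
    b = [normalize(h) for h in path_b]
    only_a = [h for h in a if h not in b]
    only_b = [h for h in b if h not in a]
    return only_a, only_b


office = [
    "sea-edge-15.inet.qwest.net",
    "lag-4.ear3.Seattle1.Level3.net",
    "ae-2-52.ear2.seattle1.level3.net",
    "be-203-pe01.seattle.wa.ibone.comcast.net",
]
personal = [
    "sea-edge-15.inet.qwest.net",
    "lag-25.ear2.Seattle1.Level3.net",
    "be-203-pe01.seattle.wa.ibone.comcast.net",
]

only_office, only_personal = path_difference(office, personal)
print("only on office path:  ", only_office)
print("only on personal path:", only_personal)
```

Running the same comparison against the .199 traceroute from the office circuit would show whether it shares the personal path hop for hop.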


Swapped VPN tunnels over to using another ip on same router (.199), same 
interface physical and logical, in Denver, and VPN was working again normally.

This morning though, Cheyenne to Denver (.199) is having problems, while Boise 
to Denver (.199) isn't (for now).

Already spent most of last night working with my partner in Denver replacing 
nearly everything on the Denver side with no change.

Tests from the router above the main Denver VPN endpoint (.196) do not show any 
kind of issues or packet loss to it, so it doesn't appear the problem is inside 
of our network.

I'm not inclined to think this is a Comcast issue, but I can't be sure.

This sounds almost like a load balancing hashing issue, with one link in the 
bond group having issues, somewhere in one of our upstreams' networks.
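
To make the hashing theory concrete, here is a rough sketch of how a router might pin a flow to one member of a bond group. It assumes the member is chosen by hashing source IP, destination IP, IP protocol and (when present) ports; the SHA-256 hash, the four-member group, and the addresses (documentation ranges) are all placeholders, not what Lumen or Comcast actually run.

```python
import hashlib
import ipaddress

ESP, AH, UDP = 50, 51, 17  # IANA IP protocol numbers


def pick_member(src, dst, proto, n_members, ports=(0, 0)):
    """Map a flow to a LAG member index (hypothetical hash, not a vendor's)."""
    key = (
        ipaddress.ip_address(src).packed
        + ipaddress.ip_address(dst).packed
        + bytes([proto])
        + ports[0].to_bytes(2, "big")   # ESP/AH carry no ports, so these stay 0
        + ports[1].to_bytes(2, "big")
    )
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_members


def flows_on_member(flows, member, n_members):
    """Return the (src, dst, proto) flows that hash onto one member."""
    return [f for f in flows if pick_member(*f, n_members) == member]


if __name__ == "__main__":
    boise = "192.0.2.10"  # placeholder source address
    for dst in ("198.51.100.196", "198.51.100.199"):  # placeholder .196 / .199
        print(dst, "ESP tunnel     ->", pick_member(boise, dst, ESP, 4))
        print(dst, "UDP traceroute ->",
              pick_member(boise, dst, UDP, 4, (50000, 33434)))
```

Under this assumption every packet of an ESP tunnel lands on the same member, because the key never changes. Moving the endpoint from .196 to .199 changes the key and may move the tunnel to a healthy member, and a UDP traceroute to the same address hashes on a different key again, which would fit with traceroutes looking clean while the tunnel drops packets.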

We'll be opening a ticket in a bit through normal channels with 
CenturyLink/Lumen, but we're worried they're just going to close the ticket as 
not being their issue.

Anyone know of an engineer at CenturyLink/Lumen/Level3 or even Comcast that 
might want to take a stab at this?  I can provide a lot more detail.

--
Brielle Bruns
The Summit Open Source Development Group
http://www.sosdg.org/ http://www.ahbl.org