Re: BGP and The zero window edge

2021-04-25 Thread Alarig Le Lay
On Thu 22 Apr 2021 01:24:54 GMT, Job Snijders via NANOG wrote:
> One example is 
> http://lg.ring.nlnog.net/prefix_detail/lg01/ipv6?q=2a0b:6b86:d15::/48
> 
> 2a0b:6b86:d15::/48 via:
> BGP.as_path: 204092 57199 35280 6939 42615 42615 212232
> BGP.as_path: 208627 207910 57199 35280 6939 42615 42615 212232
> BGP.as_path: 208627 207910 57199 35280 6939 42615 42615 212232
> (first announced April 15th, last withdrawn April 15th, 2021)

On the AS204092 side, the route is one week and two days old (so last
updated on 2021-04-16). So we never received the withdrawal.
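
The date arithmetic behind that estimate can be checked with a short
sketch. It assumes the router's "1w2d" age is read on the day this mail
was sent (2021-04-25); the helper name parse_cisco_age is made up for
illustration and only handles the week/day form shown in the output below.

```python
import re
from datetime import date, timedelta


def parse_cisco_age(age: str) -> timedelta:
    """Convert an age such as '1w2d' into a timedelta (weeks and days only)."""
    match = re.fullmatch(r"(?:(\d+)w)?(?:(\d+)d)?", age)
    if not match or not age:
        raise ValueError(f"unsupported age format: {age!r}")
    weeks, days = (int(group) if group else 0 for group in match.groups())
    return timedelta(weeks=weeks, days=days)


observed_on = date(2021, 4, 25)  # assumption: the date of this mail
last_updated = observed_on - parse_cisco_age("1w2d")
print(last_updated)  # 2021-04-16
```

That puts the last update one day after the beacon's final withdrawal on
April 15th.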

asbr01#sh bgp ipv6 uni 2a0b:6b86:d15::/48
BGP routing table entry for 2A0B:6B86:D15::/48, version 88407242
BGP Bestpath: deterministic-med: med
Paths: (2 available, best #1, table default)
  Advertised to update-groups:
     129        130        145        167
  Refresh Epoch 1
  57199 35280 6939 42615 42615 212232
2A0B:CBC0:1::BD (FE80::66D1:54FF:FEEF:9893) from 2A0B:CBC0:1::BD 
(80.67.167.5)
  Origin IGP, metric 10, localpref 100, valid, external, best
  Community: 24115:6939 35280:10 35280:1040 35280:2080 35280:3120 
35280:2 35280:21000 35280:21150 57199:35280 57199:65535 64496:100 
64496:57199 64999:24115
  unknown transitive attribute: flag 0xE0 type 0x20 length 0x30
value  5E33  03E9  0001  5E33
   03EA  0002  5E33  03EB
   0005  5E33  03EC  1B1B

  path 7F1E8D0F3B58 RPKI State valid
  rx pathid: 0, tx pathid: 0x0
  Refresh Epoch 1
  57199 35280 6939 42615 42615 212232, (received-only)
2A0B:CBC0:1::BD (FE80::66D1:54FF:FEEF:9893) from 2A0B:CBC0:1::BD 
(80.67.167.5)
  Origin IGP, metric 4294967295, localpref 100, valid, external
  Community: 24115:6939 35280:10 35280:1040 35280:2080 35280:3120 
35280:2 35280:21000 35280:21150 57199:35280 57199:65535 64999:24115
  unknown transitive attribute: flag 0xE0 type 0x20 length 0x30
value  5E33  03E9  0001  5E33
   03EA  0002  5E33  03EB
   0005  5E33  03EC  1B1B

  path 7F1E8D0EF088 RPKI State valid
  rx pathid: 0, tx pathid: 0
asbr01#sh ipv6 route 2a0b:6b86:d15::/48
Routing entry for 2A0B:6B86:D15::/48
  Known via "bgp 204092", distance 20, metric 10, type external
  Route count is 1/1, share count 0
  Routing paths:
FE80::66D1:54FF:FEEF:9893, GigabitEthernet0/0/0.24
  MPLS label: nolabel
  Last updated 1w2d ago

asbr01#

-- 
Alarig


Re: BGP and The zero window edge

2021-04-24 Thread Simon Leinen
Job Snijders via NANOG writes:
> *RIGHT NOW* (at the moment of writing), there are a number of zombie
> routes visible in the IPv6 Default-Free Zone:

[Reversing the order of your two examples]

> Another one is 
> http://lg.ring.nlnog.net/prefix_detail/lg01/ipv6?q=2a0b:6b86:d24::/48

> 2a0b:6b86:d24::/48 via:
> BGP.as_path: 201701 9002 6939 42615 212232
> BGP.as_path: 34927 9002 6939 42615 212232
> BGP.as_path: 207960 34927 9002 6939 42615 212232
> BGP.as_path: 44103 50673 9002 6939 42615 212232
> BGP.as_path: 208627 207910 34927 9002 6939 42615 212232
> BGP.as_path: 3280 34927 9002 6939 42615 212232
> BGP.as_path: 206628 34927 9002 6939 42615 212232
> BGP.as_path: 208627 207910 34927 9002 6939 42615 212232
> (first announced March 24th, last withdrawn March 24th, 2021)

So that one was resolved at AS9002, see Alexandre's followup (thanks!)

AS9002 had also been my guess when I read this, because it's the
leftmost common AS in the paths observed.

> One example is 
> http://lg.ring.nlnog.net/prefix_detail/lg01/ipv6?q=2a0b:6b86:d15::/48

> 2a0b:6b86:d15::/48 via:
> BGP.as_path: 204092 57199 35280 6939 42615 42615 212232
> BGP.as_path: 208627 207910 57199 35280 6939 42615 42615 212232
> BGP.as_path: 208627 207910 57199 35280 6939 42615 42615 212232
> (first announced April 15th, last withdrawn April 15th, 2021)

Applying the same logic, I'd suspect that the withdrawal is stuck in
AS57199 in this case.  I'll try to contact them.
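
The "leftmost common AS" heuristic can be written down as a small
sketch: take the longest suffix shared by all observed AS paths and
report its first hop. The function names are illustrative only, and the
result is a hint about where to look first, not proof of where the
withdrawal was lost.

```python
def common_suffix(paths):
    """Return the longest run of trailing hops shared by every AS path."""
    reversed_paths = [list(reversed(path.split())) for path in paths]
    suffix = []
    for hops in zip(*reversed_paths):  # walk from the origin AS outwards
        if len(set(hops)) != 1:
            break
        suffix.append(hops[0])
    return list(reversed(suffix))


def leftmost_common_as(paths):
    """Return the first hop of the shared suffix, or None if there is none."""
    suffix = common_suffix(paths)
    return suffix[0] if suffix else None


d24_paths = [
    "201701 9002 6939 42615 212232",
    "34927 9002 6939 42615 212232",
    "207960 34927 9002 6939 42615 212232",
    "44103 50673 9002 6939 42615 212232",
    "208627 207910 34927 9002 6939 42615 212232",
    "3280 34927 9002 6939 42615 212232",
    "206628 34927 9002 6939 42615 212232",
]
d15_paths = [
    "204092 57199 35280 6939 42615 42615 212232",
    "208627 207910 57199 35280 6939 42615 42615 212232",
]

print(leftmost_common_as(d24_paths))  # 9002
print(leftmost_common_as(d15_paths))  # 57199
```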

Here's a (partial) RIPE RIS BGPlay view of the last lifecycle of the
2a0b:6b86:d15::/48 beacon:

https://stat.ripe.net/widget/bgplay#w.resource=2a0b:6b86:d15::/48&w.ignoreReannouncements=true&w.starttime=1618444740&w.endtime=1618542000&w.rrcs=0,1,2,4,10,12,20,21&w.instant=null&w.type=bgp

Cheers,
-- 
Simon.


Re: BGP and The zero window edge

2021-04-22 Thread Alexandre Snarskii
On Thu, Apr 22, 2021 at 01:24:54AM +0200, Job Snijders via NANOG wrote:
[...]
> 
> Another one is 
> http://lg.ring.nlnog.net/prefix_detail/lg01/ipv6?q=2a0b:6b86:d24::/48
> 
> 2a0b:6b86:d24::/48 via:
> BGP.as_path: 201701 9002 6939 42615 212232
> BGP.as_path: 34927 9002 6939 42615 212232
> BGP.as_path: 207960 34927 9002 6939 42615 212232
> BGP.as_path: 44103 50673 9002 6939 42615 212232
> BGP.as_path: 208627 207910 34927 9002 6939 42615 212232
> BGP.as_path: 3280 34927 9002 6939 42615 212232
> BGP.as_path: 206628 34927 9002 6939 42615 212232
> BGP.as_path: 208627 207910 34927 9002 6939 42615 212232
> (first announced March 24th, last withdrawn March 24th, 2021)
[...]
> 
> I checked the AS 6939 Looking glass, but the d24::/48 route is not
> visible in the http://lg.he.net/ web interface. This leads me to believe
> the route got stuck somewhere along the way in either of 201701, 204092,
> 206628, 207910, 207960, 208627, 3280, 34927, 35280, 44103, 50673, 57199,
> and/or 9002.

9002. Hit by Juniper PR1562090, route stuck in DeletePending..
Workaround applied, sessions with 6939 restarted, route is gone.



Re: BGP and The zero window edge

2021-04-22 Thread Job Snijders via NANOG
On Thu, Apr 22, 2021 at 02:29:31PM +0300, Alexandre Snarskii wrote:
> 9002. Hit by Juniper PR1562090, route stuck in DeletePending..
> Workaround applied, sessions with 6939 restarted, route is gone.

Thank you for the details and clearing the issue.

Kind regards,

Job


Re: BGP and The zero window edge

2021-04-21 Thread Hank Nussbacher

On 22/04/2021 02:24, Job Snijders via NANOG wrote:
> On Wed, Apr 21, 2021 at 09:22:57PM +, Jakob Heitz (jheitz) wrote:
>> I'd like to get some data on what actually happened in the real cases
>> and analyze it.
>>
>> [snip]
>>
>> TCP zero window is possible, but many other things could
>> cause it too.
>
> Indeed. There could be a number of reasons that caused it.
>
> Switching away from TCP win=0 towards "Zombie Routes":
>
> *RIGHT NOW* (at the moment of writing), there are a number of zombie
> routes visible in the IPv6 Default-Free Zone:
>
> One example is
> http://lg.ring.nlnog.net/prefix_detail/lg01/ipv6?q=2a0b:6b86:d15::/48
>
> 2a0b:6b86:d15::/48 via:
> BGP.as_path: 204092 57199 35280 6939 42615 42615 212232
> BGP.as_path: 208627 207910 57199 35280 6939 42615 42615 212232
> BGP.as_path: 208627 207910 57199 35280 6939 42615 42615 212232
> (first announced April 15th, last withdrawn April 15th, 2021)
>
> Another one is
> http://lg.ring.nlnog.net/prefix_detail/lg01/ipv6?q=2a0b:6b86:d24::/48
>
> 2a0b:6b86:d24::/48 via:
> BGP.as_path: 201701 9002 6939 42615 212232
> BGP.as_path: 34927 9002 6939 42615 212232
> BGP.as_path: 207960 34927 9002 6939 42615 212232
> BGP.as_path: 44103 50673 9002 6939 42615 212232
> BGP.as_path: 208627 207910 34927 9002 6939 42615 212232
> BGP.as_path: 3280 34927 9002 6939 42615 212232
> BGP.as_path: 206628 34927 9002 6939 42615 212232
> BGP.as_path: 208627 207910 34927 9002 6939 42615 212232
> (first announced March 24th, last withdrawn March 24th, 2021)
>
> Just now, I literally rebooted the BGP speaker behind lg.ring.nlnog.net
> to ensure that those routes are not stuck in the BGP looking glass
> itself.
>
> 2a0b:6b86:d24::/48 was first announced on March 24th, 2021, and
> withdrawn at the end of March 24th, 2021 by the originator, and now
> almost a month later, this prefix still is visible in the default-free
> zone despite WITHDRAW messages having been sent and the AS 212232
> operator confirming they are not announcing that IP prefix anywhere.
>
> I checked the AS 6939 Looking glass, but the d24::/48 route is not
> visible in the http://lg.he.net/ web interface. This leads me to believe
> the route got stuck somewhere along the way in either of 201701, 204092,
> 206628, 207910, 207960, 208627, 3280, 34927, 35280, 44103, 50673, 57199,
> and/or 9002.
>
> This implies there indeed might be multiple reasons a BGP route gets stuck
> ('stuck' as in - a WITHDRAW was not generated, or ignored). Perhaps on
> any one of these edges there is a very high Out Queue for one reason or
> another:
>
> 34927 9002
> 206628 34927
> 44103 50673
> 207960 34927
> 3280 34927
> 9002 6939
> 201701 9002
> 208627 207910
>
> I'm not sure all these sightings of stuck routes can be pinpointed
> to one specific BGP vendor (or one bug).

I would guess that all the stuck route sightings stem from a single
undiscovered bug in a TCP library that several BGP vendors have in common.

-Hank

> Kind regards,
>
> Job





RE: BGP and The zero window edge

2021-04-21 Thread Philip Loenneker
I'm not sure if this is helpful to this discussion or not, but I recently 
became aware of a bug in a virtual router using DPDK+VPP which sounds like it 
could possibly produce a similar issue to what is being described, without the 
TCP window being a factor.

The system used a single process both to read messages from the netlink 
socket and to process them. While a large BGP update was being processed, the 
netlink buffer could fill up with further messages. As a result, some route 
updates were never processed or applied to the VPP FIB, and the affected 
routes became stuck. The vendor I spoke to about this issue resolved it by 
giving priority to reading and storing incoming messages, and by processing 
the stored messages asynchronously in batches. 
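
To make the described fix concrete, here is a rough sketch of the
"read first, process later in batches" pattern. It is not the vendor's
code: the function names, the batch size, the ("add"/"del", prefix)
message format and the use of an unbounded in-memory queue are all
assumptions for illustration, and a plain Python list stands in for the
netlink socket.

```python
import queue
import threading


def reader(source, backlog):
    """Drain the (simulated) netlink socket as fast as possible; only store."""
    for message in source:
        backlog.put(message)
    backlog.put(None)  # sentinel: no more messages will arrive


def worker(backlog, fib, batch_size=64):
    """Apply stored messages to the FIB in batches, independently of reading."""
    finished = False
    while not finished:
        batch = [backlog.get()]  # block until at least one item is available
        while len(batch) < batch_size:
            try:
                batch.append(backlog.get_nowait())
            except queue.Empty:
                break
        if batch[-1] is None:  # the sentinel is always queued last
            batch.pop()
            finished = True
        for action, prefix in batch:
            if action == "add":
                fib.add(prefix)
            else:  # "del"
                fib.discard(prefix)


# Announce 1000 prefixes, then withdraw all of them again.
updates = [("add", f"2001:db8:{i:x}::/48") for i in range(1000)]
updates += [("del", prefix) for _, prefix in updates]

backlog, fib = queue.Queue(), set()
threads = [
    threading.Thread(target=reader, args=(updates, backlog)),
    threading.Thread(target=worker, args=(backlog, fib)),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(len(fib))  # 0: every withdrawal was applied, nothing is stuck
```

The point of the split is that the reader never waits on FIB
programming, so a slow batch cannot cause incoming withdrawals to be lost.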

I can share additional details off-list if anyone thinks this could be related 
to the problem.


Re: BGP and The zero window edge

2021-04-21 Thread Job Snijders via NANOG
On Wed, Apr 21, 2021 at 09:22:57PM +, Jakob Heitz (jheitz) wrote:
> I'd like to get some data on what actually happened in the real cases
> and analyze it.
>
> [snip]
> 
> TCP zero window is possible, but many other things could
> cause it too.

Indeed. There could be a number of reasons that caused it.

Switching away from TCP win=0 towards "Zombie Routes":

*RIGHT NOW* (at the moment of writing), there are a number of zombie
routes visible in the IPv6 Default-Free Zone:

One example is 
http://lg.ring.nlnog.net/prefix_detail/lg01/ipv6?q=2a0b:6b86:d15::/48

2a0b:6b86:d15::/48 via:
BGP.as_path: 204092 57199 35280 6939 42615 42615 212232
BGP.as_path: 208627 207910 57199 35280 6939 42615 42615 212232
BGP.as_path: 208627 207910 57199 35280 6939 42615 42615 212232
(first announced April 15th, last withdrawn April 15th, 2021)

Another one is 
http://lg.ring.nlnog.net/prefix_detail/lg01/ipv6?q=2a0b:6b86:d24::/48

2a0b:6b86:d24::/48 via:
BGP.as_path: 201701 9002 6939 42615 212232
BGP.as_path: 34927 9002 6939 42615 212232
BGP.as_path: 207960 34927 9002 6939 42615 212232
BGP.as_path: 44103 50673 9002 6939 42615 212232
BGP.as_path: 208627 207910 34927 9002 6939 42615 212232
BGP.as_path: 3280 34927 9002 6939 42615 212232
BGP.as_path: 206628 34927 9002 6939 42615 212232
BGP.as_path: 208627 207910 34927 9002 6939 42615 212232
(first announced March 24th, last withdrawn March 24th, 2021)

Just now, I literally rebooted the BGP speaker behind lg.ring.nlnog.net
to ensure that those routes are not stuck in the BGP looking glass
itself.

2a0b:6b86:d24::/48 was first announced on March 24th, 2021, and
withdrawn at the end of March 24th, 2021 by the originator, and now
almost a month later, this prefix still is visible in the default-free
zone despite WITHDRAW messages having been sent and the AS 212232
operator confirming they are not announcing that IP prefix anywhere.

I checked the AS 6939 Looking glass, but the d24::/48 route is not
visible in the http://lg.he.net/ web interface. This leads me to believe
the route got stuck somewhere along the way in either of 201701, 204092,
206628, 207910, 207960, 208627, 3280, 34927, 35280, 44103, 50673, 57199,
and/or 9002.

This implies there indeed might be multiple reasons a BGP route gets stuck
('stuck' as in - a WITHDRAW was not generated, or ignored). Perhaps on
any one of these edges there is a very high Out Queue for one reason or
another:

34927 9002
206628 34927
44103 50673
207960 34927
3280 34927
9002 6939
201701 9002
208627 207910
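
For reference, a short sketch of how such an edge list can be derived
mechanically from the looking glass output: walk each AS path from the
left and record every adjacent pair until AS 6939 is reached (the route
is not visible there, so nothing beyond it is a candidate). The function
name and the stop_at parameter are illustrative only. Run on the
d24::/48 paths above, it yields the eight edges listed plus
"50673 9002" and "207910 34927", which also occur in those paths.

```python
def path_edges(paths, stop_at="6939"):
    """Collect adjacent AS pairs from each path, up to and including stop_at."""
    edges = set()
    for path in paths:
        hops = path.split()
        for left, right in zip(hops, hops[1:]):
            edges.add((left, right))
            if right == stop_at:
                break  # the route is not visible in stop_at itself
    return edges


d24_paths = [
    "201701 9002 6939 42615 212232",
    "34927 9002 6939 42615 212232",
    "207960 34927 9002 6939 42615 212232",
    "44103 50673 9002 6939 42615 212232",
    "208627 207910 34927 9002 6939 42615 212232",
    "3280 34927 9002 6939 42615 212232",
    "206628 34927 9002 6939 42615 212232",
    "208627 207910 34927 9002 6939 42615 212232",
]

for left, right in sorted(path_edges(d24_paths)):
    print(left, right)
```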

I'm not sure all these sightings of stuck routes can be pinpointed
to one specific BGP vendor (or one bug).

Kind regards,

Job


Re: BGP and The zero window edge

2021-04-21 Thread Pawel Malachowski
On Wed, Apr 21, 2021 at 08:59:06PM +, Jakob Heitz (jheitz) via NANOG 
wrote:

> Has anyone else seen this before or can provide data to analyze?
> On or off list.

- https://labs.ripe.net/author/romain_fontugne/bgp-zombies/
- https://www.slideshare.net/atendesoftware/bgp-zombie-routes


kind regards,
-- 
Pawel Malachowski


RE: BGP and The zero window edge

2021-04-21 Thread Jakob Heitz (jheitz) via NANOG
I'd like to get some data on what actually happened
in the real cases and analyze it.

If it's a Cisco router at fault, then we have a bug to fix.
Even if it's not a Cisco, there may be ways we can help
to avoid the situation.
However, before we start on solutions, I'd like to get
a good understanding of what actually happened.

TCP zero window is possible, but many other things could
cause it too.

Anyone?

Regards,
Jakob.

-Original Message-
From: Job Snijders  
Sent: Wednesday, April 21, 2021 2:11 PM
To: Jakob Heitz (jheitz) 
Cc: nanog@nanog.org
Subject: Re: BGP and The zero window edge

Dear Jakob, group,

On Wed, Apr 21, 2021 at 08:59:06PM +, Jakob Heitz (jheitz) via NANOG wrote:
> Ben's blog details an experiment in which he advertises routes and then
> withdraws them, but some of them remain stuck for days.
> 
> I'd like to get to the bottom of this problem.

I think there are *two* problems:

1) some BGP implementations (or multi-node BGP configurations) sometimes
   end up getting stuck in one way or another.

2) other BGP nodes are not able to disconnect/reconnect to systems
   suffering from instantiations of problem #1.

While on the one hand it is important to follow-up on each and every
instantiation of problem #1, I personally think it also is worthwhile
exploring whether the BGP FSM itself can be redefined in a way that
encourages BGP protocol implementations to be more robust and rely less
on the remote peer behaving correctly.

Once Problem #2 is addressed, finding and isolating instances of Problem
#1 will become much easier.

> Has anyone else seen this before or can provide data to analyze?
> On or off list.

From the BGP Default-Free Zone perspective it is hard to differentiate
between an entire (multi-vendor) Autonomous System being stuck, or just
one router.

To test individual router implementations this tool is useful
https://github.com/benjojo/bgp-zerowindow-test - but please keep in mind
that the "TCP Recv Wind == 0" trick is just one way to easily get a BGP
peer to manifest the problematic behavior.

From a BGP protocol perspective BGP nodes shouldn't inspect the TCP
receive window, but rather focus on whether all locally available
signals indicate that the remote peer is still progressing data.

Kind regards,

Job


Re: BGP and The zero window edge

2021-04-21 Thread Job Snijders via NANOG
Dear Jakob, group,

On Wed, Apr 21, 2021 at 08:59:06PM +, Jakob Heitz (jheitz) via NANOG wrote:
> Ben's blog details an experiment in which he advertises routes and then
> withdraws them, but some of them remain stuck for days.
> 
> I'd like to get to the bottom of this problem.

I think there are *two* problems:

1) some BGP implementations (or multi-node BGP configurations) sometimes
   end up getting stuck in one way or another.

2) other BGP nodes are not able to disconnect/reconnect to systems
   suffering from instantiations of problem #1.

While on the one hand it is important to follow-up on each and every
instantiation of problem #1, I personally think it also is worthwhile
exploring whether the BGP FSM itself can be redefined in a way that
encourages BGP protocol implementations to be more robust and rely less
on the remote peer behaving correctly.

Once Problem #2 is addressed, finding and isolating instances of Problem
#1 will become much easier.

> Has anyone else seen this before or can provide data to analyze?
> On or off list.

From the BGP Default-Free Zone perspective it is hard to differentiate
between an entire (multi-vendor) Autonomous System being stuck, or just
one router.

To test individual router implementations this tool is useful
https://github.com/benjojo/bgp-zerowindow-test - but please keep in mind
that the "TCP Recv Wind == 0" trick is just one way to easily get a BGP
peer to manifest the problematic behavior.

From a BGP protocol perspective BGP nodes shouldn't inspect the TCP
receive window, but rather focus on whether all locally available
signals indicate that the remote peer is still progressing data.
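
As a rough illustration of what "locally available signals" could mean
in practice: a speaker can watch its own outbound queue towards a peer
and reset the session when data has been pending for too long without
any of it being sent. The sketch below is hypothetical; the class name,
the observation interface and the 240 second threshold are assumptions
made for this example and do not come from any implementation or from
this thread.

```python
class SendProgressWatchdog:
    """Flags a session whose outbound queue has made no progress for too long."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.stalled_since = None  # time at which the current stall began

    def observe(self, now: float, pending_bytes: int, sent_since_last: int) -> bool:
        """Record one observation; return True if the session should be reset."""
        if pending_bytes == 0 or sent_since_last > 0:
            self.stalled_since = None  # idle or progressing: not stalled
            return False
        if self.stalled_since is None:
            self.stalled_since = now  # first observation of this stall
        return now - self.stalled_since >= self.timeout_s


# Threshold of 240 s is an arbitrary value for illustration.
watchdog = SendProgressWatchdog(timeout_s=240.0)
print(watchdog.observe(now=0.0, pending_bytes=4096, sent_since_last=0))    # False
print(watchdog.observe(now=120.0, pending_bytes=4096, sent_since_last=0))  # False
print(watchdog.observe(now=240.0, pending_bytes=4096, sent_since_last=0))  # True
```

Note that the check never looks at the peer's advertised TCP window; it
only uses what the local side can see about its own send progress.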

Kind regards,

Job


RE: BGP and The zero window edge

2021-04-21 Thread Jakob Heitz (jheitz) via NANOG
Ben's blog details an experiment in which he advertises routes and then
withdraws them, but some of them remain stuck for days.

I'd like to get to the bottom of this problem.

Has anyone else seen this before or can provide data to analyze?
On or off list.

Regards,
Jakob.

-Original Message-
Date: Wed, 21 Apr 2021 07:31:10 -0400
From: "Jean St-Laurent" 

Nice article explaining a specific BGP corner case in which routes are not
removed when the TCP window reaches 0.

https://blog.benjojo.co.uk/post/bgp-stuck-routes-tcp-zero-window

The proposed solution is a new RFC for BGP with the suggestion to introduce
a new timer.

Fascinating!

Jean St-Laurent /CISSP
ddosTest me security inc
site:  https://ddostest.me


BGP and The zero window edge

2021-04-21 Thread Jean St-Laurent via NANOG
Nice article explaining a specific BGP corner case in which routes are not
removed when the TCP window reaches 0.

https://blog.benjojo.co.uk/post/bgp-stuck-routes-tcp-zero-window

The proposed solution is a new RFC for BGP with the suggestion to introduce
a new timer.

Fascinating!

Jean St-Laurent /CISSP
ddosTest me security inc
site:  https://ddostest.me