Hello,
We've recently begun testing OpenBSD 4.4 with OpenBGP in our datacenter.
Our initial tests have uncovered an odd issue that we hope you can help us
with. Our configs and other relevant information are included below.
The summary of our issue is this:
1.) Upon starting bgpd the session between the two routers goes to established
and updates are passed.
2.) Keepalives aren't passed beyond the first exchange.
3.) After some time, the session goes to IDLE on both routers.
4.) The session tears down if we either issue a bgpctl command (like show
summary or show neighbors) or wait 240 seconds after the initial connect.
5.) The routers then reestablish connections but they drop again.
6.) The exact same setup works fine with OpenBGP 4.3.
Here's what we've found. If we change line 405 of session.c (timeout = 240;
/* loop every 240s at least */) to a value lower than our holdtime, the
sessions stay up. Debugging output added after that line shows that the
code following it doesn't run again after the initial setup unless the
timeout value is reached or a bgpctl command is executed.
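To make the expectation concrete, here is a minimal sketch of the behavior we
expected from the loop: the sleep time is capped at 240 seconds but shortened
to the nearest keepalive or hold timer deadline. The struct, the function
names, and the MAX_POLL_TIMEOUT macro are hypothetical illustrations and are
not taken from session.c.

```c
#include <stddef.h>
#include <time.h>

#define MAX_POLL_TIMEOUT 240	/* seconds; mirrors the 240s cap quoted above */

/* Hypothetical per-peer timer state, NOT bgpd's real data structure. */
struct peer_timers {
	time_t keepalive_due;	/* absolute time of next KEEPALIVE, 0 = unarmed */
	time_t hold_due;	/* absolute time the hold timer expires, 0 = unarmed */
};

/* Lower *timeout so that it does not sleep past the deadline 'due'. */
static void
shrink_timeout(time_t *timeout, time_t due, time_t now)
{
	if (due == 0)
		return;			/* timer not armed, nothing to wait for */
	if (due <= now)
		*timeout = 0;		/* already overdue, do not sleep at all */
	else if (due - now < *timeout)
		*timeout = due - now;	/* wake up in time for this deadline */
}

/* Return how many seconds the event loop may sleep without missing a timer. */
static int
next_poll_timeout(const struct peer_timers *peers, size_t npeers, time_t now)
{
	time_t timeout = MAX_POLL_TIMEOUT;
	size_t i;

	for (i = 0; i < npeers; i++) {
		shrink_timeout(&timeout, peers[i].keepalive_due, now);
		shrink_timeout(&timeout, peers[i].hold_due, now);
	}
	return (int)timeout;
}
```

With a keepalive due in 30 seconds, a loop built this way would wake after 30
seconds rather than 240, which is the behavior we get when we lower the
constant by hand.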
We've replicated this error in two different test environments. Sessions are
torn down whenever a 4.4 bgpd is on either end (i.e., 4.4 to 4.4 and
4.4 to 4.3).
Please let me know if you need any additional information from me.
Thanks so much,
Marc Runkel
Technical Operations Manager
Untangle, Inc.
The two machines in question are dcrouter1 and bgptest2:
dcrouter1:/etc/bgpd.conf
#macros
# XO Peer
XOpeer=65.46.252.33
# global configuration
AS 21634
router-id 65.46.252.34
log updates
network 64.2.3.0/24
holdtime min 3
holdtime 90
# neighbors and peers
neighbor $XOpeer {
remote-as 2828
descr XO Upstream
local-address 65.46.252.34
multihop 2
}
# filter out prefixes longer than 24 or shorter than 8 bits
deny from any
allow from any inet prefixlen 8 - 24
# do not accept a default route
deny from any prefix 0.0.0.0/0
# We're in test mode, so we gotta let the test networks in (192.168.0.0/16).
# filter bogus networks
deny from any prefix 10.0.0.0/8 prefixlen = 8
deny from any prefix 172.16.0.0/12 prefixlen = 12
#deny from any prefix 192.168.0.0/16 prefixlen = 16
deny from any prefix 169.254.0.0/16 prefixlen = 16
deny from any prefix 192.0.2.0/24 prefixlen = 24
deny from any prefix 224.0.0.0/4 prefixlen = 4
deny from any prefix 240.0.0.0/4 prefixlen = 4
-- END --
dcrouter1:/etc/hostname.em0
inet 65.46.252.34 255.255.255.252 65.46.252.35 description XO WAN
-- END --
dcrouter1:/var/log/daemon.log (bgpd only)
Jan 20 11:19:51 dcrouter1 bgpd[24217]: startup
Jan 20 11:19:51 dcrouter1 bgpd[14770]: route decision engine ready
Jan 20 11:19:52 dcrouter1 bgpd[5962]: listening on 0.0.0.0
Jan 20 11:19:52 dcrouter1 bgpd[5962]: listening on ::
Jan 20 11:19:52 dcrouter1 bgpd[5962]: session engine ready
Jan 20 11:19:52 dcrouter1 bgpd[5962]: neighbor 65.46.252.33 (XO Upstream):
state change None - Idle, reason: None
Jan 20 11:19:52 dcrouter1 bgpd[5962]: neighbor 65.46.252.33 (XO Upstream):
state change Idle - Connect, reason: Start
Jan 20 11:19:52 dcrouter1 bgpd[5962]: neighbor 65.46.252.33 (XO Upstream):
socket error: Connection refused
Jan 20 11:19:52 dcrouter1 bgpd[5962]: neighbor 65.46.252.33 (XO Upstream):
state change Connect - Active, reason: Connection open failed
Jan 20 11:19:56 dcrouter1 bgpd[5962]: neighbor 65.46.252.33 (XO Upstream):
state change Active - OpenSent, reason: Connection opened
Jan 20 11:19:56 dcrouter1 bgpd[5962]: neighbor 65.46.252.33 (XO Upstream):
state change OpenSent - OpenConfirm, reason: OPEN message received
Jan 20 11:19:56 dcrouter1 bgpd[5962]: neighbor 65.46.252.33 (XO Upstream):
state change OpenConfirm - Established, reason: KEEPALIVE message received
Jan 20 11:19:56 dcrouter1 bgpd[14770]: neighbor 65.46.252.33 (XO Upstream)
AS2828: update 192.168.42.0/24 via 65.46.252.33
Jan 20 11:19:56 dcrouter1 bgpd[24217]: nexthop 65.46.252.33 now valid:
directly connected
Jan 20 11:20:44 dcrouter1 bgpd[5962]: neighbor 65.46.252.33 (XO Upstream):
received notification: HoldTimer expired, unknown subcode 0
Jan 20 11:20:44 dcrouter1 bgpd[5962]: neighbor 65.46.252.33 (XO Upstream):
state change Established - Idle, reason: NOTIFICATION received
-- END --
dcrouter1:tcpdump -vvns1500 -i em0 port 179
tcpdump: listening on em0, link-type EN10MB
11:19:52.537633 65.46.252.34.48310 65.46.252.33.179: S [tcp sum ok]
164215:164215(0) win 16384 mss 1460,nop,nop,sackOK,nop,wscale
0,nop,nop,timestamp 2322120143 0 (DF) [tos 0xc0] (ttl 2, id 23223, len 64)
11:19:52.537747 65.46.252.33.179 65.46.252.34.48310: R [tcp sum ok] 0:0(0)
ack 164216 win 0 (DF) (ttl 64, id 40395, len 40)
11:19:56.759172 65.46.252.33.1985 65.46.252.34.179: S [tcp sum ok]
2516427034:2516427034(0)
win 16384 mss 1460,nop,nop,sackOK,nop,wscale 0,nop,nop,timestamp 1931362699
0 (DF) [tos 0xc0] (ttl 2