Re: BGP: connect delay timer vs connect retry timer

2023-08-29 Thread Ondrej Zajicek
On Tue, Aug 29, 2023 at 09:11:27PM +0530, Bala Sajja wrote:
> Hi Ondrej,
>  OK, understood. We can configure connect-retry-interval in the
> range 1..65535. We are testing this with different values, with the
> logs pasted below. Things work as expected in the 1..25 second
> range. Beyond that we always see the connection delay timer getting
> triggered, and the connect retry timer config has no impact.

This is really unrelated to the connection delay timer. You get this behavior
because besides our connect retry timer there is also an OS-level timer for
TCP connections. If our retry timer is short, it expires first and we retry
the connection.

If our timer is long, we get a socket error from the OS first (that is the log
message 'Connection lost (Connection timed out)'), which leads to a
complete restart of the BGP state machine, including going through the
initial connection delay time.

It is true that we could likely avoid a full restart on a socket error while
we are still in the CONNECT state, and instead wait for our connect retry timer.


First case:
> Line 167: Aug 29 14:02:36 bird: bgp_10: Connecting to 10.220.152.55
> from local address 10.220.152.53
> Line 185: Aug 29 14:02:53 bird: bgp_10: Connecting to 10.220.152.55
> from local address 10.220.152.53
> Line 191: Aug 29 14:03:12 bird: bgp_10: Connecting to 10.220.152.55
> from local address 10.220.152.53
> Line 195: Aug 29 14:03:30 bird: bgp_10: Connecting to 10.220.152.55
> from local address 10.220.152.53
> Line 214: Aug 29 14:03:48 bird: bgp_10: Connecting to 10.220.152.55
> from local address 10.220.152.53


Second case:
> Line 297: Aug 29 14:05:12 bird: bgp_10: Started
> Line 298: Aug 29 14:05:12 bird: bgp_10: Connect delayed by 5 seconds
> Line 308: Aug 29 14:05:16 bird: bgp_10: Connecting to 10.220.152.55
> from local address 10.220.152.53
> Line 348: Aug 29 14:05:48 bird: bgp_10: Connection lost (Connection timed out)
> Line 349: Aug 29 14:05:48 bird: bgp_10: Connect delayed by 5 seconds
> Line 351: Aug 29 14:05:52 bird: bgp_10: Connecting to 10.220.152.55
> from local address 10.220.152.53
> Line 360: Aug 29 14:06:24 bird: bgp_10: Connection lost (Connection timed out)
> Line 361: Aug 29 14:06:24 bird: bgp_10: Connect delayed by 5 seconds
> Line 363: Aug 29 14:06:28 bird: bgp_10: Connecting to 10.220.152.55
> from local address 10.220.152.53
> Line 384: Aug 29 14:07:00 bird: bgp_10: Connection lost (Connection timed out)

-- 
Elen sila lumenn' omentielvo

Ondrej 'Santiago' Zajicek (email: santi...@crfreenet.org)
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."


Re: BGP: connect delay timer vs connect retry timer

2023-08-29 Thread Bala Sajja
Hi Ondrej,
 OK, understood. We can configure connect-retry-interval in the
range 1..65535. We are testing this with different values, with the
logs pasted below. Things work as expected in the 1..25 second
range. Beyond that we always see the connection delay timer getting
triggered, and the connect retry timer config has no impact.

Below are the logs with different values of connect-retry-interval
//connect-retry-interval 1
Line 35: Aug 29 14:01:21 bird: bgp_10: Started
Line 36: Aug 29 14:01:21 bird: bgp_10: Connect delayed by 5 seconds
Line 38: Aug 29 14:01:25 bird: bgp_10: Connecting to 10.220.152.55
from local address 10.220.152.53
Line 61: Aug 29 14:01:50 bird: Restarting protocol bgp_10
Line 62: Aug 29 14:01:50 bird: bgp_10: Shutting down
Line 63: Aug 29 14:01:50 bird: bgp_10: Shutdown requested
Line 64: Aug 29 14:01:50 bird: bgp_10: State changed to stop
Line 68: Aug 29 14:01:50 bird: bgp_10: Down
Line 69: Aug 29 14:01:50 bird: bgp_10: State changed to flush
Line 70: Aug 29 14:01:50 bird: bgp_10: State changed to down
Line 71: Aug 29 14:01:50 bird: bgp_10: Initializing
Line 72: Aug 29 14:01:50 bird: bgp_10: Starting
Line 73: Aug 29 14:01:50 bird: bgp_10: State changed to start
Line 75: Aug 29 14:01:50 bird: bgp_10: Started
Line 76: Aug 29 14:01:50 bird: bgp_10: Connect delayed by 5 seconds
Line 87: Aug 29 14:01:55 bird: bgp_10: Connecting to 10.220.152.55
from local address 10.220.152.53
Line 88: Aug 29 14:01:56 bird: bgp_10: Connecting to 10.220.152.55
from local address 10.220.152.53
Line 89: Aug 29 14:01:57 bird: bgp_10: Connecting to 10.220.152.55
from local address 10.220.152.53
Line 90: Aug 29 14:01:58 bird: bgp_10: Connecting to 10.220.152.55
from local address 10.220.152.53
Line 91: Aug 29 14:01:59 bird: bgp_10: Connecting to 10.220.152.55
from local address 10.220.152.53
Line 92: Aug 29 14:02:00 bird: bgp_10: Connecting to 10.220.152.55
from local address 10.220.152.53
Line 93: Aug 29 14:02:01 bird: bgp_10: Connecting to 10.220.152.55
from local address 10.220.152.53
Line 95: Aug 29 14:02:01 bird: bgp_10: Connecting to 10.220.152.55
from local address 10.220.152.53
Line 96: Aug 29 14:02:02 bird: bgp_10: Connecting to 10.220.152.55
from local address 10.220.152.53
//connect-retry-interval 5
Line 101: Aug 29 14:02:03 bird: Restarting protocol bgp_10
Line 102: Aug 29 14:02:03 bird: bgp_10: Shutting down
Line 103: Aug 29 14:02:03 bird: bgp_10: Shutdown requested
Line 104: Aug 29 14:02:03 bird: bgp_10: State changed to stop
Line 108: Aug 29 14:02:03 bird: bgp_10: Down
Line 109: Aug 29 14:02:03 bird: bgp_10: State changed to flush
Line 110: Aug 29 14:02:03 bird: bgp_10: State changed to down
Line 111: Aug 29 14:02:03 bird: bgp_10: Initializing
Line 112: Aug 29 14:02:03 bird: bgp_10: Starting
Line 113: Aug 29 14:02:03 bird: bgp_10: State changed to start
Line 115: Aug 29 14:02:03 bird: bgp_10: Started
Line 116: Aug 29 14:02:03 bird: bgp_10: Connect delayed by 5 seconds
Line 126: Aug 29 14:02:07 bird: bgp_10: Connecting to 10.220.152.55
from local address 10.220.152.53
Line 128: Aug 29 14:02:12 bird: bgp_10: Connecting to 10.220.152.55
from local address 10.220.152.53
Line 130: Aug 29 14:02:15 bird: bgp_10: Connecting to 10.220.152.55
from local address 10.220.152.53
Line 131: Aug 29 14:02:19 bird: bgp_10: Connecting to 10.220.152.55
from local address 10.220.152.53
Line 134: Aug 29 14:02:24 bird: bgp_10: Connecting to 10.220.152.55
from local address 10.220.152.53
Line 135: Aug 29 14:02:28 bird: bgp_10: Connecting to 10.220.152.55
from local address 10.220.152.53
//connect-retry-interval 20
Line 141: Aug 29 14:02:31 bird: Restarting protocol bgp_10
Line 142: Aug 29 14:02:31 bird: bgp_10: Shutting down
Line 143: Aug 29 14:02:31 bird: bgp_10: Shutdown requested
Line 144: Aug 29 14:02:31 bird: bgp_10: State changed to stop
Line 148: Aug 29 14:02:31 bird: bgp_10: Down
Line 149: Aug 29 14:02:31 bird: bgp_10: State changed to flush
Line 150: Aug 29 14:02:31 bird: bgp_10: State changed to down
Line 151: Aug 29 14:02:31 bird: bgp_10: Initializing
Line 152: Aug 29 14:02:31 bird: bgp_10: Starting
Line 153: Aug 29 14:02:31 bird: bgp_10: State changed to start
Line 155: Aug 29 14:02:31 bird: bgp_10: Started
Line 156: Aug 29 14:02:31 bird: bgp_10: Connect delayed by 5 seconds
Line 167: Aug 29 14:02:36 bird: bgp_10: Connecting to 10.220.152.55
from local address 10.220.152.53
Line 185: Aug 29 14:02:53 bird: bgp_10: Connecting to 10.220.152.55
from local address 10.220.152.53
Line 191: Aug 29 14:03:12 bird: bgp_10: Connecting to 10.220.152.55
from local address 10.220.152.53
Line 195: Aug 29 14:03:30 bird: bgp_10: Connecting to 10.220.152.55
from local address 10.220.152.53
Line 214: Aug 29 14:03:48 bird: bgp_10: Connecting to 10.220.152.55
from local address 10.220.152.53


//connect-retry-interval 30
Line 221: Aug 29 14:03:59 bird: Restarting protocol bgp_10
Line 222: Aug 29 14:03:59 bird: bgp_10: Shutting down
Line 223: Aug 29 14:03:59 bird: bgp_10: Shutdown requested
Line 224: Aug 29 14:03:59 bird: 

Re: BGP: connect delay timer vs connect retry timer

2023-08-29 Thread Ondrej Zajicek
On Mon, Aug 28, 2023 at 05:02:40PM +0530, Bala Sajja wrote:
> Hi,
>    It seems that even in the cases where the BGP connect retry timer should
> kick in, the connect delay timer kicks in instead. In the function below,
> bgp_setup_conn(p, conn) sets connect_timer to the connect retry
> timer, but it is later overwritten by bgp_start_timer(conn->connect_timer,
> delay), which sets connect_timer to the connect delay timer. Is this right?
> Could we make the connect retry timer work properly?

Hi

I am not sure what you mean. The timer connect_timer is used for two
purposes: for connect_delay_time before we try to connect (in the state
BS_ACTIVE), and for connect_retry_time while we try to connect (in the
state BS_CONNECT). The first is initialized in bgp_active(), the second
in bgp_connect(). The call bgp_setup_conn(p, conn) just allocates the
timer, but it does not set it to any interval.


> 
> bgp_active(struct bgp_proto *p)
> {
>   int delay = MAX(1, p->cf->connect_delay_time);
>   struct bgp_conn *conn = &p->outgoing_conn;
> 
>   BGP_TRACE(D_EVENTS, "Connect delayed by %d seconds", delay);
>   bgp_setup_conn(p, conn);
>   bgp_conn_set_state(conn, BS_ACTIVE);
>   bgp_start_timer(conn->connect_timer, delay);
> }



BGP: connect delay timer vs connect retry timer

2023-08-28 Thread Bala Sajja
Hi,
   It seems that even in the cases where the BGP connect retry timer should
kick in, the connect delay timer kicks in instead. In the function below,
bgp_setup_conn(p, conn) sets connect_timer to the connect retry
timer, but it is later overwritten by bgp_start_timer(conn->connect_timer,
delay), which sets connect_timer to the connect delay timer. Is this right?
Could we make the connect retry timer work properly?

bgp_active(struct bgp_proto *p)
{
  int delay = MAX(1, p->cf->connect_delay_time);
  struct bgp_conn *conn = &p->outgoing_conn;

  BGP_TRACE(D_EVENTS, "Connect delayed by %d seconds", delay);
  bgp_setup_conn(p, conn);
  bgp_conn_set_state(conn, BS_ACTIVE);
  bgp_start_timer(conn->connect_timer, delay);
}


Regards,
Bala.