> > Try:
> >
> > echo 300 > /proc/sys/net/ipv4/tcp_keepalive_time
> > echo 60 > /proc/sys/net/ipv4/tcp_keepalive_intvl
> > echo 10 > /proc/sys/net/ipv4/tcp_keepalive_probes
> >
> > This will send the first keepalive after 5 minutes, and then every 60
> > seconds after that, and will drop the connection if no response is seen
> > from 10 consecutive probes.
> >
> > The default is that the keepalive won't start for 2 hours (7200 seconds)
> > after the connection has been idle. Not much good in your case.
> 
> Well life is not all that simple.  You haven't mentioned *all* the
> details of what happens when you set the default keepalive from 2 hours
> down to 300 seconds, which could be fatal to many programs.  I haven't
> looked at the details of this for the last 7 years, but if I recall
> correctly, once the keepalive time has expired, the OS will attempt to
> contact the other end, which if keepalive has not been enabled by the
> application, will cause the line to terminate.

Do you have a source for this? I'm pretty sure that you aren't recalling
correctly. I think that those keepalive settings only come into effect
if SO_KEEPALIVE has been set by the application.
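
To make the distinction concrete, here is a minimal Python sketch of what
"set by the application" means. The helper names (enable_keepalive,
worst_case_detection_seconds) are mine, not from any of the programs
discussed, and the TCP_KEEPIDLE / TCP_KEEPINTVL / TCP_KEEPCNT per-socket
overrides are Linux-specific, so the code skips them where the platform
does not define them. The timing estimate is only a rough rule of thumb
(idle time plus interval times probe count).

```python
import socket


def enable_keepalive(sock, idle=300, interval=60, probes=10):
    """Turn on keepalive for one socket (hypothetical helper).

    SO_KEEPALIVE is the per-socket switch. The TCP_KEEP* options, where
    the platform defines them, override the system-wide
    /proc/sys/net/ipv4/tcp_keepalive_* values for this socket only.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    for name, value in (
        ("TCP_KEEPIDLE", idle),       # seconds idle before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", probes),      # unanswered probes before giving up
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


def worst_case_detection_seconds(idle, interval, probes):
    """Rough time from last traffic until a dead peer is given up on."""
    return idle + interval * probes
```

A freshly created socket reports SO_KEEPALIVE as off until something like
enable_keepalive() is called on it, which is the behaviour I am describing
above. With the values suggested earlier (300, 60, 10) the rough estimate
comes to 900 seconds.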

> keepalive was not designed to refresh routers but to ensure that inactive
> dead connections in an OS are eventually detected and closed.

Correct, but I have successfully used it at a client site when a Cisco
router was dropping connections prematurely. The original scenario was:

PC --- router --- internet --- router --- Server

Users local to the server were fine, but users on the remote PCs would
go away from their desks, come back, and as soon as they hit a key they
would get the "Connection Closed" message. Both the client and server
software were closed source, with no option to enable keepalives, so
what I did was this:

PC --- router --- internet --- router --- Server
                                       |
                                       |- Linux Server

(if the ASCII art comes out all wrong, it should look like the Linux
Server is connected to the lan segment between the Server and the
Router)

The Linux server ran 'simpleproxy' (I think) which was literally just a
program that accepted a TCP connection from the PC, created a connection
to the Server, and forwarded packets between them. It didn't support
SO_KEEPALIVE initially, but that was pretty easy to add, and once added,
all the problems went away!
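
For anyone wanting to try the same trick, the idea can be sketched in a
few lines of Python. This is not simpleproxy's code and not the patch I
made; it is an illustrative relay under my own assumptions: the function
names are made up, SO_KEEPALIVE is set on both legs of the connection,
and the probe timing is left to the system-wide /proc settings.

```python
import socket
import threading


def _pump(src, dst):
    """Copy bytes from src to dst until EOF, then half-close dst."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def _handle(client, target):
    """Relay one client connection to the target, with keepalive on."""
    try:
        upstream = socket.create_connection(target)
    except OSError:
        client.close()
        return
    for sock in (client, upstream):
        # This is the whole point: the relay asks the OS for keepalive
        # probes on behalf of applications that cannot ask themselves.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    back = threading.Thread(target=_pump, args=(upstream, client), daemon=True)
    back.start()
    _pump(client, upstream)
    back.join()
    client.close()
    upstream.close()


def serve(listen_addr, target):
    """Start accepting in a background thread; return the listening socket."""
    listener = socket.create_server(listen_addr)

    def accept_loop():
        while True:
            try:
                client, _ = listener.accept()
            except OSError:
                return  # listener was closed
            threading.Thread(
                target=_handle, args=(client, target), daemon=True
            ).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener
```

Something like serve(("0.0.0.0", 2300), ("server.example", 23)) would be
the shape of it, with the PCs pointed at the relay instead of the Server;
the port numbers and host name there are placeholders only.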

> Routers should keep
> all lines open for a minimum of 3 hours of idle time (IMO).

I wonder if there is an RFC for this... I know I'm being pedantic here,
but routers do not (should not) track connections; it is firewalls
that track connections, time them out, and then treat subsequent
packets on those connections as 'wtf is this packet?' and send
reset/unreachable responses.

> >
> > Does Bacula include any application level keepalives?
> 
> Bacula sets keepalive on all its sockets when they are opened.
> 
> > There should be no
> > need to do this if you've set /proc and setsockopt correctly, unless the
> > respective daemons implement their own application level timeouts.
> 
> Yes, providing you don't mind prematurely killing off non-keepalive programs
> that are inactive during the reduced keepalive period you have set.

This should be relatively easy to test... assuming we can't find a
document somewhere that clarifies it one way or another.
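
A rough sketch of such a test, in Python. The function names are mine and
the whole thing rests on assumptions: it uses a loopback connection, it
never sets SO_KEEPALIVE, and it only says something useful if
tcp_keepalive_time has first been lowered (as root) to a value shorter
than the idle period passed in. If the connection still carries data
after sitting idle for longer than tcp_keepalive_time, that would suggest
the /proc settings do not touch sockets that never asked for keepalive.

```python
import os
import socket
import time


def read_keepalive_time(path="/proc/sys/net/ipv4/tcp_keepalive_time"):
    """Return the system-wide keepalive idle time in seconds, or None."""
    if not os.path.exists(path):
        return None  # not Linux, or /proc not mounted
    with open(path) as handle:
        return int(handle.read().strip())


def idle_connection_survives(idle_seconds):
    """Open a loopback TCP connection WITHOUT SO_KEEPALIVE, leave it
    idle for idle_seconds, then report whether data still gets through."""
    listener = socket.create_server(("127.0.0.1", 0))
    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    try:
        if client.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0:
            raise RuntimeError("keepalive unexpectedly enabled by default")
        time.sleep(idle_seconds)
        client.settimeout(5)
        server_side.settimeout(5)
        try:
            client.sendall(b"x")
            return server_side.recv(1) == b"x"
        except OSError:
            return False
    finally:
        for sock in (client, server_side, listener):
            sock.close()
```

The idea would be to print read_keepalive_time(), then call
idle_connection_survives() with an idle period comfortably longer than
that value and see what comes back.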

James

