On Thursday 01 February 2007 01:48, James Harper wrote:
> > I think that makes any Bacula job longer than 10 minutes impossible using
> > this Linksys router.  Looks like I'm out of luck.  I have updated to the
> > newest firmware, and the Linksys config doesn't have any ability to modify
> > the timeout value.   I suppose I could buy a new router, or set up a new
> > offsite backup storage daemon.   Unless anyone else has any brilliant
> > ideas  :)
> >
> > Kern, if you are reading this, what are the chances that a heartbeat could
> > be implemented between the director and the storage daemon?
>
> Is Bacula setting SO_KEEPALIVE via setsockopt on every socket (it seems that
> it does in bnet.c)? If so, you probably just need to set this to more
> sensible values in /proc (assuming you are running Linux... I can't
> remember if you said or not).
>
> Try:
>
> echo 300 > /proc/sys/net/ipv4/tcp_keepalive_time
> echo 60 > /proc/sys/net/ipv4/tcp_keepalive_intvl
> echo 10 > /proc/sys/net/ipv4/tcp_keepalive_probes
>
> This will send the first keepalive after 5 minutes, and then every 60
> seconds after that, and will drop the connection if no response is seen
> from 10 consecutive probes.
>
> The default is that the keepalive won't start for 2 hours (7200 seconds)
> after the connection has been idle. Not much good in your case.

Well, life is not all that simple.  You haven't mentioned *all* the details
of what happens when you set the default keepalive from 2 hours down to 300
seconds, which could be fatal to many programs.  I haven't looked at the
details of this for the last 7 years, but if I recall correctly, once the
keepalive time has expired, the OS will attempt to contact the other end,
which, if keepalive has not been enabled by the application, will cause the
line to terminate.  If keepalive has been enabled, then the OS will go
through its little jig, probing at the specified interval, and if it gets a
response, reset the timeout to the full value and continue.  Otherwise, at
the end of the probes with no response, the line will be disconnected.
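
To put numbers on that sequence, here is a rough sketch of the timing for a
connection that goes idle and whose peer never answers.  It assumes the usual
description of the mechanism (first probe after the keepalive time, then one
probe per interval, drop once the configured number of probes has gone
unanswered).  The function name is made up for illustration, and 75 seconds
and 9 probes are the stock Linux values for the interval and probe count.

```python
def keepalive_timeline(idle_time, interval, probes):
    """Return (seconds until the first probe, seconds until the drop)
    for an idle connection whose peer never responds."""
    first_probe = idle_time
    # Assumed model: each unanswered probe costs one full interval.
    dropped_after = idle_time + interval * probes
    return first_probe, dropped_after


# Values suggested in the quoted message: 300 / 60 / 10
print(keepalive_timeline(300, 60, 10))    # (300, 900)

# Stock Linux values: 7200 / 75 / 9
print(keepalive_timeline(7200, 75, 9))    # (7200, 7875)
```

So with the suggested settings a dead peer would be noticed after roughly 15
minutes of silence, against a bit over 2 hours 11 minutes with the defaults.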

The important point above is that if an application is not keepalive aware (as
is probably the case for 99% of applications), any idle period longer than
the keepalive time set in the OS will terminate the line.  By keepalive
aware, I mean that the application explicitly requests the OS to send
keepalive probes.  Otherwise keepalive is turned off and the OS terminates
the line as soon as the keepalive timeout has expired.
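
For illustration only, this is roughly what "explicitly requesting keepalive"
looks like from the application side.  It is a minimal sketch in Python, not
Bacula's code; the helper name enable_keepalive and its default values are
assumptions.  SO_KEEPALIVE is the standard socket option; the three TCP_KEEP*
options are platform specific (they exist on Linux), so the sketch only sets
them where the socket module exposes them.  Where available, they let one
program shorten the timers for its own sockets without changing the
system-wide values.

```python
import socket


def enable_keepalive(sock, idle=300, interval=60, probes=10):
    """Ask the OS to send keepalive probes on this one socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket timers are not available on every platform, so only
    # set the ones the socket module actually defines here.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", probes)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
# Non-zero means keepalive is switched on for this socket.
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
sock.close()
```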


Keepalive was not designed to refresh routers but to ensure that inactive,
dead connections in an OS are eventually detected and closed.  Routers should
keep all lines open for a minimum of 3 hours of idle time (IMO).

>
> http://libkeepalive.sourceforge.net/docs/TCP-Keepalive-HOWTO has some
> useful info about the way it all works.

Yes, this is a good reference, but I found it slightly vague about where the
boundaries between the OS and user domains are.

>
> Does Bacula include any application level keepalives? 

Bacula sets keepalive on all its sockets when they are opened.

> There should be no 
> need to do this if you've set /proc and setsockopt correctly, unless the
> respective daemons implement their own application level timeouts.

Yes, provided you don't mind prematurely killing off non-keepalive programs
that stay idle for longer than the reduced keepalive period you have set.

Best regards,

Kern

>
> James
>
>
> -------------------------------------------------------------------------
> Using Tomcat but need to do more? Need to support web services, security?
> Get stuff done quickly with pre-integrated technology to make your job
> easier. Download IBM WebSphere Application Server v.1.0.1 based on Apache
> Geronimo
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
> _______________________________________________
> Bacula-users mailing list
> Bacula-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-users

