Ciprian Marius Vizitiu wrote:
> Hi listers,
> 
> I have a strange firewall problem with Bacula 2.2.6 running on RHEL4 
> (2.6.9-67 but it happens on other RHEL4 kernels too) clients and CentOS5 
> server. The description of the problem is... long and ugly so I've 
> managed to narrow it down to the following easy (for me) to reproduce 
> scenario:
> 
> 1. One RHEL4 Bacula 2.2.6 client, 192.168.1.25. Relevant iptables in 
> this client:
> 
> -A RH-Firewall-1-INPUT -p tcp --dport 9101:9103 -j ACCEPT
> -A RH-Firewall-1-INPUT -p udp --dport 9101:9103 -j ACCEPT
> 
> 2. One Bacula 2.2.6 server, 192.168.1.48. Relevant iptables in this server:
> 
> -A RH-Firewall-1-INPUT -p tcp --dport 9101:9103 -j ACCEPT
> -A RH-Firewall-1-INPUT -p udp --dport 9101:9103 -j ACCEPT
> 
> Although there is no 3Com router involved, "Heartbeat Interval" is set to 
> 60s. 
> 
> Now, simply start a 23GB restore (full plus a differential) consisting 
> of ~70.000 files on the client... everything works as expected for like 
> 30 minutes during which the client writes 23GB. Then things start to go 
> strange:
> 
> 1. On the client there is no activity
> 2. On the server bacula-sd is busy on CPU and I/O most likely searching 
> through the 10 x 200GB disk volumes for the differential files to restore.
> 
> This "state" will last for another ~30 minutes during which a tcpdump 
> will only hear the pings from the heartbeat. Depending on whether the 
> firewalls are started or not the end can be one of the following:
> 
> No firewall: restore job always ends successfully.
> Firewall: Depending on the positions of the planets either the job 
> will succeed THREE HOURS later =:-o or (more likely...) it'll fail with 
> a "no route to host" error.  Tcpdump started when bacula-sd's job is 
> nearing the end will clearly show the culprit:
> 
> [... Heartbeat...]
> 
> 18:32:01.504760 IP server.gbif.org.9103 > client.gbif.org.32776: P 
> 1560794395:1560794427(32) ack 1414218623 win 181 <nop,nop,timestamp 
> 4070418385 22509939>
> 18:32:01.504801 IP client.gbif.org > server.gbif.org: icmp 92: host 
> client.gbif.org unreachable - admin prohibited
> 18:32:01.505214 IP server.gbif.org.9103 > client.gbif.org.32776: . 
> 32:1480(1448) ack 1 win 181 <nop,nop,timestamp 4070418386 22509939>
> 18:32:01.505231 IP client.gbif.org > server.gbif.org: icmp 556: host 
> client.gbif.org unreachable - admin prohibited
> 18:32:01.505236 IP server.gbif.org.9103 > client.gbif.org.32776: . 
> 1480:2928(1448) ack 1 win 181 <nop,nop,timestamp 4070418386 22509939>
> 18:32:01.505249 IP client.gbif.org > server.gbif.org: icmp 556: host 
> client.gbif.org unreachable - admin prohibited
> 
> To me it looks like the essence of the problem is the fact that the 
> restore session has a long "network idle" period and somehow the RELATED 
> mechanism of the firewall no longer works. WHY would this happen? And 
> more important, isn't this what HeartBeat was supposed to prevent in the 
> first place? One more detail: if the client is RHEL5 everything works 
> perfectly.
> 
> Has anyone seen something like this before? Any ideas will be 
> appreciated! :-|
> 


Not 100% sure, but this looks a bit like a TCP idle timeout (TTL) on the
firewall. I don't think the firewall will wait that long, and it has nothing
to do with the heartbeat. My guess is that your firewall treats the session
as closed or timed out because of the idle time.

Check whether you can adjust the TCP timeout (TTL) on the firewall.
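
To make that check a bit more concrete, here is a minimal Python 3 sketch
(not anything Bacula ships). It assumes the firewall in question is the Linux
netfilter connection tracker, and that its idle timeout for established TCP
connections is exposed under one of the two sysctl paths listed: the
ip_conntrack one on older kernels such as RHEL4's, the nf_conntrack one on
newer kernels. The function names and the sample timestamps are made up for
illustration only.

```python
# Candidate sysctl files holding the conntrack idle timeout (in seconds) for
# established TCP connections. Assumption: older ip_conntrack kernels use the
# first path, newer nf_conntrack kernels use the second.
CANDIDATES = (
    "/proc/sys/net/ipv4/netfilter/ip_conntrack_tcp_timeout_established",
    "/proc/sys/net/netfilter/nf_conntrack_tcp_timeout_established",
)


def read_established_timeout(paths=CANDIDATES):
    """Return the first readable timeout in seconds, or None if none is found."""
    for path in paths:
        try:
            with open(path) as handle:
                return int(handle.read().strip())
        except (OSError, ValueError):
            continue
    return None


def longest_idle_gap(packet_times):
    """Return the largest gap (seconds) between consecutive packet timestamps."""
    times = sorted(packet_times)
    return max((later - earlier for earlier, later in zip(times, times[1:])),
               default=0)


def would_expire(packet_times, timeout):
    """True if some quiet period is longer than the tracker's idle timeout."""
    return longest_idle_gap(packet_times) > timeout


if __name__ == "__main__":
    timeout = read_established_timeout()
    # Hypothetical timestamps: traffic at 0s, 60s and 120s, then 30 minutes of silence.
    sample = [0, 60, 120, 1920]
    if timeout is None:
        print("conntrack timeout not readable on this host")
    else:
        print("timeout:", timeout, "would expire:", would_expire(sample, timeout))
```

If the longest quiet period seen in tcpdump is well below the value this
reports, the idle timeout is probably not the culprit and something else is
dropping the connection state.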



-- 
bEsT rEgArDs            |       "Confidence is what you have before you
tomasz dereszynski      |       understand the problem." -- Woody Allen
                        |       
Spes confisa Deo        |       "In theory, theory and practice are much
numquam confusa recedit |       the same. In practice they are very
                        |       different." -- Albert Einstein


_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
