Re: [Bacula-users] Bacula via NATed connection and Bacula docs - partly solved. Launchd!

2010-01-28 Thread Dirk H. Schulz
The more I look into it, the weirder it gets.

Gavin McCullagh wrote:
> On Wed, 27 Jan 2010, Dirk H. Schulz wrote:
>
>> Telnetting from external-fd to server-sd using the above-mentioned FQDN
>> and the port of the storage daemon (telnet storage.server.sd 9103)
>> outputs exactly the same as telnetting internally to that port. AFAIK,
>> that means bacula-fd on the external client should be able to connect to
>> bacula-sd on the internal server.
>>
>> But it does not. When I run a backup job for this client, the director
>> spends quite a long time waiting for Client ... to connect to Storage ...
>> and eventually gives up.
>
> In this instance, I would be inclined to start a tcpdump like the one below
> on both the -fd and the -sd, start your backup and see where exactly the -fd
> tries to connect to.
>
>     tcpdump -ni ethX tcp port 9103
>
> The first question, I suppose, is what IP address the -fd is actually
> using to connect. The second is whether the TCP handshake happens correctly
> and, if so, what happens then. Perhaps the -fd is connecting to the wrong
> IP, or it could be a firewall issue, or something else...?
   
First: I made the test with all firewalls on the way shut down (except 
the one doing NAT) to avoid any issues from there.
Then I made a similar test with a different client-fd in the same public 
subnet, and it worked.
I have thoroughly compared the configuration of these two clients (both 
bacula-fd.conf and bacula-dir.conf).

Still nothing works. And here is what tcpdump and bacula-dir output:

 external-fd:~ root# tcpdump -ni en1 portrange 9101-9103
 tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
 listening on en1, link-type EN10MB (Ethernet), capture size 96 bytes
 08:01:39.346580 IP 1.2.3.4.32930 > 40.50.60.70.9102: S 1415949915:1415949915(0) win 5840 <mss 1452,sackOK,timestamp 939681199 0,nop,wscale 7>
 08:01:39.346647 IP 40.50.60.70.9102 > 1.2.3.4.32930: S 221080258:221080258(0) ack 1415949916 win 65535 <mss 1460,nop,wscale 3,nop,nop,timestamp 213499348 939681199,sackOK,eol>
 08:01:39.387055 IP 1.2.3.4.32930 > 40.50.60.70.9102: . ack 1 win 46 <nop,nop,timestamp 939681241 213499348>
 08:01:39.387073 IP 40.50.60.70.9102 > 1.2.3.4.32930: . ack 1 win 65535 <nop,nop,timestamp 213499348 939681241>
 08:01:39.391051 IP 1.2.3.4.32930 > 40.50.60.70.9102: P 1:51(50) ack 1 win 46 <nop,nop,timestamp 939681244 213499348>
 08:01:39.391065 IP 40.50.60.70.9102 > 1.2.3.4.32930: . ack 51 win 65535 <nop,nop,timestamp 213499348 939681244>
 08:06:22.221818 IP 40.50.60.70.9102 > 1.2.3.4.32930: . ack 51 win 0
 08:06:22.221853 IP 40.50.60.70.9102 > 1.2.3.4.32930: . ack 51 win 65535 <nop,nop,timestamp 213502176 939681244>
 08:06:22.262232 IP 1.2.3.4.32930 > 40.50.60.70.9102: . ack 1 win 46 <nop,nop,timestamp 939964161 213499348>
 08:11:07.236737 IP 40.50.60.70.9102 > 1.2.3.4.32930: . ack 51 win 0
 08:11:07.236780 IP 40.50.60.70.9102 > 1.2.3.4.32930: . ack 51 win 65535 <nop,nop,timestamp 213505026 939964161>
 08:11:07.279418 IP 1.2.3.4.32930 > 40.50.60.70.9102: . ack 1 win 46 <nop,nop,timestamp 940249226 213502176>
 08:11:44.501513 IP 1.2.3.4.32930 > 40.50.60.70.9102: F 51:51(0) ack 1 win 46 <nop,nop,timestamp 940286454 213505026>
 08:11:44.501542 IP 40.50.60.70.9102 > 1.2.3.4.32930: . ack 52 win 65535 <nop,nop,timestamp 213505399 940286454>
All the while bacula-dir claims to be waiting for Client external-fd to 
connect to Storage LTO2, there is not a single attempt from this client 
to connect to the SD!
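
One way to double-check that observation against a saved capture is to tally the destination ports of all packets. The following is only a sketch in Python: it assumes tcpdump's classic one-line text output with IPv4 addresses and the `>` direction marker that tcpdump prints between source and destination, and the helper name `destination_ports` is made up for this illustration.

```python
import re
from collections import Counter

# Matches "IP <src>.<sport> > <dst>.<dport>:" in tcpdump's one-line output
# (IPv4 only; this is an assumption of the sketch).
PACKET_RE = re.compile(
    r"IP (\d+\.\d+\.\d+\.\d+)\.(\d+) > (\d+\.\d+\.\d+\.\d+)\.(\d+):"
)


def destination_ports(lines):
    """Count packets per (destination address, destination port)."""
    counts = Counter()
    for line in lines:
        match = PACKET_RE.search(line)
        if match:
            dst_ip, dst_port = match.group(3), int(match.group(4))
            counts[(dst_ip, dst_port)] += 1
    return counts


# The first two packets of the capture above (TCP options omitted).
sample = [
    "08:01:39.346580 IP 1.2.3.4.32930 > 40.50.60.70.9102: S 1415949915:1415949915(0) win 5840",
    "08:01:39.346647 IP 40.50.60.70.9102 > 1.2.3.4.32930: S 221080258:221080258(0) ack 1415949916 win 65535",
]
counts = destination_ports(sample)

# Packets addressed to the storage daemon port (9103); none in this sample.
sd_packets = sum(n for (_, port), n in counts.items() if port == 9103)
print(counts)
print("packets to port 9103:", sd_packets)
```

Fed with the whole capture, a count of zero for port 9103 would match what is described here: only the connection to port 9102 shows up.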

And in the end the error message from bacula-dir is something different:

 28-Jan 08:11 bacula-dir JobId 33: Fatal error: Unable to authenticate 
 with File daemon at external-fd.domain.de:9102. Possible causes:
 Passwords or names not the same or
 Maximum Concurrent Jobs exceeded on the FD or
 FD networking messed up (restart daemon).
 Please see 
 http://www.bacula.org/en/rel-manual/Bacula_Freque_Asked_Questi.html#SECTION00376
 for help.
 28-Jan 08:11 bacula-dir JobId 33: Fatal error: Network error with FD 
 during Backup: ERR=Unterbrechung während des Betriebssystemaufrufs
 [in English: Interrupted system call]
 28-Jan 08:11 bacula-dir JobId 33: Fatal error: No Job status returned 
 from FD.
 28-Jan 08:11 bacula-dir JobId 33: Error: Bacula bacula-dir 3.0.3 
 (18Oct09): 28-J
I have even tried without any passwords, and I have copied and pasted the 
client name everywhere to make sure there is no typo in it.

And then, out of pure desperation, I started bacula-fd manually 
instead of via launchd (with the same parameters launchd is given), and 
now it works!

Somehow communication does not work correctly if bacula-fd is started 
via launchd (/sbin/bacula-fd -f -c /etc/bacula/bacula-fd.conf).

Anyone seen that before? Any workaround for that? It is MacOS X Client 
10.5.5 Intel (uname -a outputs Darwin external-fd.domain.de 9.5.0 
Darwin Kernel Version 9.5.0: Wed Sep  3 11:29:43 PDT 2008; 
root:xnu-1228.7.58~1/RELEASE_I386 i386).

Any help or hint would be greatly appreciated!

Dirk




Re: [Bacula-users] Bacula via NATed connection and Bacula docs - partly solved. Launchd!

2010-01-28 Thread Dan Langille
Dirk H. Schulz wrote:

> All the while bacula-dir claims to be waiting for Client external-fd to
> connect to Storage LTO2, there is not a single attempt from this client
> to connect to the SD!

If this is true, then something is wrong.

Either you're looking in the wrong place for the traffic, or the traffic 
is going somewhere else.  Consider the possibility that what you are 
expecting does not match up with what the Bacula components have been 
told to do.  Verify and double check all data related to that FD and SD. 
Go through the .conf files and see what hostnames are being used. 
Verify on both the SD and the FD that those hostnames resolve to the 
correct IP addresses.  Verify that each can talk to the other (telnet 
IPaddress PORT).
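
The two checks above (does the hostname resolve, and does the port accept a TCP connection) can also be scripted. Below is a minimal sketch in Python, meant to be run on the FD host against the SD address and on the SD host against the FD address. The function name `check_endpoint` and the 5-second timeout are illustrative choices and not part of Bacula.

```python
import socket


def check_endpoint(host, port, timeout=5.0):
    """Resolve host, then try to open a TCP connection to host:port.

    Returns (addresses, reachable): the sorted list of IP addresses the
    name resolves to, and whether a TCP connection could be opened.
    """
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return [], False  # the name does not resolve at all
    addresses = sorted({info[4][0] for info in infos})
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return addresses, True
    except OSError:
        return addresses, False


# Example call, using the SD hostname and port mentioned in this thread:
#   check_endpoint("storage.server.sd", 9103)
```

Like the telnet test, a successful connection only shows that the port is reachable from that host. It says nothing about whether Bacula authentication will succeed.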


--
The Planet: dedicated and managed hosting, cloud storage, colocation
Stay online with enterprise data centers and the best network in the business
Choose flexible plans and management services without long-term contracts
Personal 24x7 support from experience hosting pros just a phone call away.
http://p.sf.net/sfu/theplanet-com
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Bacula via NATed connection and Bacula docs - partly solved. Launchd!

2010-01-28 Thread Chris Shelton
Dirk,

On Thu, Jan 28, 2010 at 4:35 AM, Dirk H. Schulz
dirk.sch...@kinzesberg.de wrote:
>> And then - just from pure desperation - I started bacula-fd manually
>> instead of via launchd (with the same parameters launchd is given) - and
>> now it works!
>>
>> Somehow communication does not work correctly if bacula-fd is started
>> via launchd (/sbin/bacula-fd -f -c /etc/bacula/bacula-fd.conf).
>>
>> Anyone seen that before? Any workaround for that? It is MacOS X Client
>> 10.5.5 Intel (uname -a outputs Darwin external-fd.domain.de 9.5.0
>> Darwin Kernel Version 9.5.0: Wed Sep  3 11:29:43 PDT 2008;
>> root:xnu-1228.7.58~1/RELEASE_I386 i386).
>
> Okay, after cooling down again I started searching.
>
> In the original bacula plist file for launchd from bacula sources there
> is this entry:
>     <key>Sockets</key>
>     <dict>
>         <key>Listeners</key>
>         <array>
>             <dict>
>                 <key>SockServiceName</key>
>                 <string>bacula-fd</string>
>             </dict>
>         </array>
>     </dict>
> That tells launchd to start the daemon on demand, as inetd does on
> traditional Unix systems. I did not find anything in Apple's documentation
> on why this should prevent communication from bacula-fd outside to
> bacula-sd, but it does.
> Without this entry (and with added KeepAlive and RunAtLoad entries) it
> works fine.
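
For illustration, a job definition of the kind described in the quoted text (no Sockets entry, KeepAlive and RunAtLoad added) could be generated with Python's standard `plistlib` module. This is a sketch under assumptions and not the plist shipped with the Bacula sources: the label `org.bacula.bacula-fd` is assumed, and the program path and arguments are the ones quoted above.

```python
import plistlib

# launchd job that keeps bacula-fd running permanently instead of
# starting it on demand through a Sockets entry.
job = {
    "Label": "org.bacula.bacula-fd",  # assumed label, choose your own
    "ProgramArguments": [
        "/sbin/bacula-fd", "-f", "-c", "/etc/bacula/bacula-fd.conf",
    ],
    "RunAtLoad": True,  # start the daemon as soon as the job is loaded
    "KeepAlive": True,  # have launchd restart the daemon if it exits
}

xml_bytes = plistlib.dumps(job)  # XML is plistlib's default output format
print(xml_bytes.decode("utf-8"))
```

The printed XML would typically be saved as a file under /Library/LaunchDaemons/ and loaded with `launchctl load`. The exact file name is up to the administrator.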

I've been using a simpler plist file for bacula backups of a couple of
OS X systems for a few years, without any issue.  The plist file is
simple:

sh-3.2# more /Library/LaunchDaemons/bacula-fd.plist
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
"http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>org.bacula.bacula-fd</string>
<key>ProgramArguments</key>
<array>
<string>/usr/local/sbin/bacula-fd</string>
<string>-f</string>
<string>-c</string>
<string>/usr/local/etc/bacula-fd.conf</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>UserName</key>
<string>root</string>
</dict>
</plist>

The bacula-fd stays running all the time, but that seems to be the
standard setup, rather than having it started when needed via the
launchd equivalent of xinetd.
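
To tell which of the two setups an installed job file uses, one can look for a top-level Sockets key. A minimal sketch in Python follows; the helper name `uses_socket_activation` is hypothetical, and the two job definitions are built in memory purely for demonstration.

```python
import plistlib


def uses_socket_activation(plist_bytes):
    """Return True if the launchd job definition has a Sockets entry."""
    job = plistlib.loads(plist_bytes)
    return "Sockets" in job


# Hypothetical job definitions, created in memory for the demonstration.
on_demand = plistlib.dumps({
    "Label": "org.bacula.bacula-fd",
    "Sockets": {"Listeners": [{"SockServiceName": "bacula-fd"}]},
})
always_running = plistlib.dumps({
    "Label": "org.bacula.bacula-fd",
    "RunAtLoad": True,
})

print(uses_socket_activation(on_demand))       # True
print(uses_socket_activation(always_running))  # False
```

For a real file, the bytes can be read with `open(path, "rb").read()` and passed to the same function.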

chris



Re: [Bacula-users] Bacula via NATed connection and Bacula docs - partly solved. Launchd!

2010-01-28 Thread Dirk H. Schulz
Chris Shelton wrote:
> The bacula-fd stays running all the time, but that seems to be the
> standard setup, rather than having it started when needed via the
> launchd equivalent of xinetd.
   
That is what I thought and what I have configured now, too, but I wonder 
why the developers ship a configuration that cannot work without 
correction.
I am just looking for the point of view from which this makes sense, in 
case there is something to learn. :-)

Thank you,

Dirk




Re: [Bacula-users] Bacula via NATed connection and Bacula docs

2010-01-27 Thread Gavin McCullagh
On Wed, 27 Jan 2010, Dirk H. Schulz wrote:

> Telnetting from external-fd to server-sd using the above-mentioned FQDN
> and the port of the storage daemon (telnet storage.server.sd 9103)
> outputs exactly the same as telnetting internally to that port. AFAIK,
> that means bacula-fd on the external client should be able to connect to
> bacula-sd on the internal server.
>
> But it does not. When I run a backup job for this client, the director
> spends quite a long time waiting for Client ... to connect to Storage ...
> and eventually gives up.

In this instance, I would be inclined to start a tcpdump like the one below
on both the -fd and the -sd, start your backup and see where exactly the -fd
tries to connect to.

    tcpdump -ni ethX tcp port 9103

The first question, I suppose, is what IP address the -fd is actually
using to connect.  The second is whether the TCP handshake happens correctly
and, if so, what happens then.  Perhaps the -fd is connecting to the wrong
IP, or it could be a firewall issue, or something else...?

Gavin

