Re: how to debug 'starting network' hangs

2009-06-18 Thread David Newman
On 6/18/09 4:36 AM, Tom wrote:
>> # start openvpn
>> #
>> if [ -x /usr/local/sbin/openvpn ]; then
>> /usr/local/sbin/openvpn --config /opt/openvpn-2.0/server.conf &&
>> echo 'opening openvpn server...' &
>> else
>> echo 'ERROR: cannot start openvpn; file /usr/local/sbin/openvpn is missing.'
>> fi
> 
> Don't start openvpn there. Stick it in your /etc/hostname.tunX file like so:
> 
> up
> !/usr/local/sbin/openvpn --daemon openvpn --config /opt/openvpn-2.0/server.conf
> 
> 
> The reason is that when OpenVPN starts, it destroys and then recreates
> the tun interface, which makes pf throw a wobbler.
> 
> I ran into this problem too. It also made other weird stuff happen,
> like pfctl -vsq showing an invalid file descriptor. Sticking it in the
> hostname.tun* file sorts that problem out.

Thanks much. This never worked from rc.local, even though the package
install says to do it that way. I always had to start OpenVPN manually
after bootup.

There is a race condition with the hostname.tun0 method when boxes also
use carp and pfsync. After a reboot, a box initially comes up with carp
interfaces in BACKUP state. The carp interfaces will quickly transition
to MASTER state if they have the lowest advskew value, but before then
hostname.tun0 has already tried and failed to get OpenVPN running.

Here is the error log from OpenVPN:

Thu Jun 18 13:44:34 2009 OpenVPN 2.1_rc15 i386-unknown-openbsd4.5 [SSL]
[LZO1] built on Mar  1 2009
openvpn: writing to routing socket: No such process
Thu Jun 18 13:44:34 2009 NOTE: OpenVPN 2.1 requires '--script-security
2' or higher to call user-defined scripts or executables
Thu Jun 18 13:44:34 2009 Diffie-Hellman initialized with 2048 bit key
Thu Jun 18 13:44:34 2009 TLS-Auth MTU parms [ L:1558 D:138 EF:38 EB:0
ET:0 EL:0 ]
Thu Jun 18 13:44:34 2009 TCP/UDP: Socket bind failed on local address
10.0.0.1:53962: Can't assign requested address
Thu Jun 18 13:44:34 2009 Exiting

The (obfuscated) address of 10.0.0.1 is a problem -- that's a carp
address, and at the time this is run the other carp/pfsync box owns it
because it's in MASTER state.

Not sure about the routing socket error on the second line.

If I manually run 'sh /etc/netstart tun0' after the carp interfaces come
up as MASTER all is good, but that's no better than starting manually as
before.

Thanks in advance for any clues on getting OpenVPN and carp/pfsync to
play nice together.


> 
> (I think this should be documented somewhere, maybe in the OpenBSD FAQ)

It certainly belongs in the package documentation, which currently
suggests adding startup lines into rc.local. I can write this up once I
get it working with carp/pfsync.

dn





> 
> Tom



Re: how to debug 'starting network' hangs

2009-06-18 Thread Tom
> # start openvpn
> #
> if [ -x /usr/local/sbin/openvpn ]; then
> /usr/local/sbin/openvpn --config /opt/openvpn-2.0/server.conf &&
> echo 'opening openvpn server...' &
> else
> echo 'ERROR: cannot start openvpn; file /usr/local/sbin/openvpn is missing.'
> fi

Don't start openvpn there. Stick it in your /etc/hostname.tunX file like so:

up
!/usr/local/sbin/openvpn --daemon openvpn --config /opt/openvpn-2.0/server.conf


The reason is that when OpenVPN starts, it destroys and then recreates
the tun interface, which makes pf throw a wobbler.

I ran into this problem too. It also made other weird stuff happen,
like pfctl -vsq showing an invalid file descriptor. Sticking it in the
hostname.tun* file sorts that problem out.

(I think this should be documented somewhere, maybe in the OpenBSD FAQ)

Tom



Re: how to debug 'starting network' hangs

2009-06-17 Thread Jason Dixon
On Wed, Jun 17, 2009 at 11:25:51AM -0700, David Newman wrote:
> On 6/16/09 10:07 PM, Jason Dixon wrote:
> 
> > I would suggest booting into single-user and using netstart for each of
> > the physical and carp interfaces until you find out where your
> > misconfiguration is.  Set it all up manually, document it, then use
> > hostname.* to properly bring up your interfaces and routes.  Get rid of
> > that junk in rc.local.
> 
> Sweet! With proper hostname.* files there are no more hangs. Thanks for
> the pointer on what to fix.

Cool beans.
 
> One other question, not covered in the FAQ: Is rc.local the proper place
> for adding a static route and dhcrelay commands? If not, where do these
> belong?

Add your static routes in your hostname.if files.  Use the "!command-line"
syntax as described in hostname.if(5).
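
As a rough illustration of that syntax, the sketch below drafts a hostname.vlan11 file that carries the static route with it. The addresses come from the rc.local quoted elsewhere in this thread, the `vlan`/`vlandev` keywords are assumed to be valid on this release because that rc.local uses them, and the helper names and draft file are hypothetical. Nothing is written under /etc.

```sh
#!/bin/sh
# Sketch: draft a hostname.vlan11 file with an address line and a route.
# Helper names and the draft location are illustrative only.

# Print the hostname.if(5) address line for a VLAN interface.
# $1 = address, $2 = netmask, $3 = VLAN id, $4 = parent device
vlan_line() {
    printf 'inet %s %s NONE vlan %s vlandev %s\n' "$1" "$2" "$3" "$4"
}

# Print a "!" line; the text after the "!" is run as a shell command
# when the interface is brought up.
# $1 = destination network, $2 = gateway
route_line() {
    printf '!route add -net %s %s\n' "$1" "$2"
}

draft=$(mktemp)
{
    vlan_line 10.0.1.2 255.255.255.0 11 bge1
    route_line 10.0.0.0/16 10.0.1.158
} > "$draft"

# Show the draft for review before copying it anywhere.
cat "$draft"
```

The route goes in the vlan11 file because its gateway, 10.0.1.158, sits on that interface's subnet.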

The dhcrelay stuff is probably fine in rc.local.  Typically you enable
it in rc.conf.local, but I think that only works for a single
invocation.
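
If the relays stay in rc.local, the five near-identical lines could be folded into a loop. This is a sketch only: the function name and the DHCRELAY override are hypothetical, added so the commands can be printed in a dry run rather than executed.

```sh
#!/bin/sh
# Sketch: start one dhcrelay per VLAN interface.
# DHCRELAY is an illustrative override; unset, the real binary is used.

# $1 = DHCP server address, remaining arguments = interfaces to relay from
start_relays() {
    _server=$1
    shift
    for _if in "$@"; do
        ${DHCRELAY:-/usr/sbin/dhcrelay} -i "$_if" "$_server"
    done
}

# Dry run: print the commands instead of running them.
DHCRELAY="echo dhcrelay"
start_relays 10.0.0.103 vlan10 vlan11 vlan12 vlan13 vlan14
```

Leaving DHCRELAY unset makes the same call start the real daemons.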

-- 
Jason Dixon
DixonGroup Consulting
http://www.dixongroup.net/



Re: how to debug 'starting network' hangs

2009-06-17 Thread David Newman
On 6/16/09 10:07 PM, Jason Dixon wrote:

> I would suggest booting into single-user and using netstart for each of
> the physical and carp interfaces until you find out where your
> misconfiguration is.  Set it all up manually, document it, then use
> hostname.* to properly bring up your interfaces and routes.  Get rid of
> that junk in rc.local.

Sweet! With proper hostname.* files there are no more hangs. Thanks for
the pointer on what to fix.

One other question, not covered in the FAQ: Is rc.local the proper place
for adding a static route and dhcrelay commands? If not, where do these
belong?

thanks again

dn



Re: how to debug 'starting network' hangs

2009-06-16 Thread Jason Dixon
On Tue, Jun 16, 2009 at 09:42:06PM -0700, David Newman wrote:
> On 6/16/09 4:36 PM, Jason Dixon wrote:
> > 
> > Why are you starting your network interfaces and adding routes in
> > rc.local?  
> 
> I maintain these systems, but did not do the initial setup or
> configuration.
> 
> > Have you read the FAQ to learn how OpenBSD networking is
> > configured?
> 
> Yes, and read the ifconfig and rc and pf.conf manpages and searched the
> misc mailing list on marc.info. I saw info on pf and carp and pfsync and
>  VLANs, but not on how they work together.
> 
> dn
> 
> hostname.bge0 -- unprotected physical interface
> inet 666.1.2.188 255.255.255.192 NONE
> 
> hostname.bge1 -- protected physical interface
> inet 10.0.127.1 255.255.255.0 NONE
> 
> hostname.carp1 -- unprotected logical interface
> inet 666.1.2.130 255.255.255.192 666.1.2.191 vhid 202 carpdev bge0
> advskew 1 pass sekret123
> 
> hostname.em0 -- pfsync physical interface
> inet 192.18.0.1 255.255.255.0 NONE media autoselect
> 
> hostname.pfsync0 -- pfsync logical interface
> up syncdev em0

Honestly, I don't trust much of what you've pasted.  You're using
invalid IPv4 addresses and have hostname.carp1 on 2 lines (is that
wrapped?).  You also don't list a carp interface for bge1.

I would suggest booting into single-user and using netstart for each of
the physical and carp interfaces until you find out where your
misconfiguration is.  Set it all up manually, document it, then use
hostname.* to properly bring up your interfaces and routes.  Get rid of
that junk in rc.local.

Example:
# sh /etc/netstart bge0
# sh /etc/netstart bge1
# sh /etc/netstart carp1
# sh /etc/netstart em0
# sh /etc/netstart pfsync0
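
The same walk-through can be wrapped in a small loop that announces each interface and runs netstart under shell tracing, so that the last command echoed before a hang should be visible on the console. This is a sketch under assumptions: the function name and the NETSTART override are hypothetical, and tracing may not reach every function inside netstart.

```sh
#!/bin/sh
# Sketch: bring interfaces up one at a time with "sh -x" tracing.
# NETSTART is an illustrative override for trying the loop on a test script.

# Arguments: interface names, in the order they should come up
bring_up_each() {
    for _if in "$@"; do
        echo "=== netstart $_if ==="
        sh -x "${NETSTART:-/etc/netstart}" "$_if" ||
            echo "netstart $_if exited with status $?"
    done
}

# Possible use from single-user mode:
#   bring_up_each bge0 bge1 carp1 em0 pfsync0
```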

Thanks,

-- 
Jason Dixon
DixonGroup Consulting
http://www.dixongroup.net/



Re: how to debug 'starting network' hangs

2009-06-16 Thread David Newman
On 6/16/09 4:36 PM, Jason Dixon wrote:
> On Tue, Jun 16, 2009 at 03:47:47PM -0700, David Newman wrote:
>> Running 4.5/i386 on a pair of firewalls using pf and carp and pfsync
>> (and also multiple VLANs).
>>
>> After a reboot, either system will hang at 'starting network' until
>> pressing Ctrl-C at the console. (By 'hang' I mean no action for at
>> least 60 minutes; I have not waited longer than that.)
>>
>> Initially I thought this was because of a hostname resolution problem,
>> but pf.conf and resolv.conf contain only IP addresses, not hostnames.
>>
>> Also, 'pfctl -f /etc/pf.conf' runs OK from the console. Same deal with
>> 'sh /etc/netstart' and the OpenVPN stuff in rc.local, pasted below.
>>
>> Presumably something is broken after /etc/rc says 'starting network',
>> but what? I've read on this list one should never edit /etc/rc.
> 
> You've given us no information about your hostname.* files.  How could
> we possibly help diagnose problems starting your network?

With addresses and passwords obfuscated, these are pasted below.

>  
>> ps. FWIW I've pasted the contents of /etc/rc.local below. Addresses and
>> passwords have been obfuscated.
> 
> Why are you starting your network interfaces and adding routes in
> rc.local?  

I maintain these systems, but did not do the initial setup or
configuration.

> Have you read the FAQ to learn how OpenBSD networking is
> configured?

Yes, and read the ifconfig and rc and pf.conf manpages and searched the
misc mailing list on marc.info. I saw info on pf and carp and pfsync and
 VLANs, but not on how they work together.

dn

hostname.bge0 -- unprotected physical interface
inet 666.1.2.188 255.255.255.192 NONE

hostname.bge1 -- protected physical interface
inet 10.0.127.1 255.255.255.0 NONE

hostname.carp1 -- unprotected logical interface
inet 666.1.2.130 255.255.255.192 666.1.2.191 vhid 202 carpdev bge0
advskew 1 pass sekret123

hostname.em0 -- pfsync physical interface
inet 192.18.0.1 255.255.255.0 NONE media autoselect

hostname.pfsync0 -- pfsync logical interface
up syncdev em0

and here is /etc/rc.local again. I do not know why the consultant who
set up these machines put some carp interfaces here rather than in
hostname files.

echo -n 'starting local daemons:'

# Add your local startup actions here.

echo '.'

# VLAN config
ifconfig vlan10 10.0.0.2 netmask 255.255.255.0 vlan 10 vlandev bge1
ifconfig vlan11 10.0.1.2 netmask 255.255.255.0 vlan 11 vlandev bge1
ifconfig vlan12 10.0.2.2 netmask 255.255.255.0 vlan 12 vlandev bge1
ifconfig vlan13 10.0.3.2 netmask 255.255.255.0 vlan 13 vlandev bge1
ifconfig vlan14 10.0.128.2 netmask 255.255.255.0 vlan 14 vlandev bge1

# 07/16/06 CARP config is here to avoid hacking netstart
ifconfig carp2 inet 10.0.0.1 netmask 255.255.255.0 broadcast 10.0.0.255 vhid 203 advskew 1 pass seekret123 carpdev vlan10
ifconfig carp3 inet 10.0.1.1 netmask 255.255.255.0 broadcast 10.0.1.255 vhid 204 advskew 1 pass seekret123 carpdev vlan11
ifconfig carp4 inet 10.0.2.1 netmask 255.255.255.0 broadcast 10.0.2.255 vhid 205 advskew 1 pass seekret123 carpdev vlan12
ifconfig carp5 inet 10.0.3.1 netmask 255.255.255.0 broadcast 10.0.3.255 vhid 206 advskew 1 pass seekret123 carpdev vlan13
ifconfig carp6 inet 10.0.128.1 netmask 255.255.255.0 broadcast 10.0.128.255 vhid 207 advskew 1 pass seekret123 carpdev vlan14

# sample static routes
/sbin/route add -net 10.0.0.0/16 10.0.1.158
# to do -- add other static routes

# DHCP helper addresses
dhcrelay -i vlan10 10.0.0.103
dhcrelay -i vlan11 10.0.0.103
dhcrelay -i vlan12 10.0.0.103
dhcrelay -i vlan13 10.0.0.103
dhcrelay -i vlan14 10.0.0.103

# start openvpn
#
if [ -x /usr/local/sbin/openvpn ]; then
/usr/local/sbin/openvpn --config /opt/openvpn-2.0/server.conf &&
echo 'opening openvpn server...' &
else
echo 'ERROR: cannot start openvpn; file /usr/local/sbin/openvpn is missing.'
fi

# start bacula
if [ -x /usr/local/libexec/bacula/bacula-ctl-fd ]; then
  /usr/local/libexec/bacula/bacula-ctl-fd start
  echo -n ' bacula-fd'
fi

# start net-snmp
if [ -x /usr/local/sbin/snmpd ]; then
echo -n ' snmpd';   /usr/local/sbin/snmpd
fi

# start apcupsd
# Start the UPS daemon. Do not remove the 'TAG_APCUPSD' text
if [ -x /etc/rc.apcupsd ]; then   # TAG_APCUPSD
  /etc/rc.apcupsd start   # TAG_APCUPSD
fi   # TAG_APCUPSD



Re: how to debug 'starting network' hangs

2009-06-16 Thread Jason Dixon
On Tue, Jun 16, 2009 at 03:47:47PM -0700, David Newman wrote:
> Running 4.5/i386 on a pair of firewalls using pf and carp and pfsync
> (and also multiple VLANs).
> 
> After a reboot, either system will hang at 'starting network' until
> pressing Ctrl-C at the console. (By 'hang' I mean no action for at
> least 60 minutes; I have not waited longer than that.)
> 
> Initially I thought this was because of a hostname resolution problem,
> but pf.conf and resolv.conf contain only IP addresses, not hostnames.
> 
> Also, 'pfctl -f /etc/pf.conf' runs OK from the console. Same deal with
> 'sh /etc/netstart' and the OpenVPN stuff in rc.local, pasted below.
> 
> Presumably something is broken after /etc/rc says 'starting network',
> but what? I've read on this list one should never edit /etc/rc.

You've given us no information about your hostname.* files.  How could
we possibly help diagnose problems starting your network?
 
> ps. FWIW I've pasted the contents of /etc/rc.local below. Addresses and
> passwords have been obfuscated.

Why are you starting your network interfaces and adding routes in
rc.local?  Have you read the FAQ to learn how OpenBSD networking is
configured?

-- 
Jason Dixon
DixonGroup Consulting
http://www.dixongroup.net/



how to debug 'starting network' hangs

2009-06-16 Thread David Newman
Running 4.5/i386 on a pair of firewalls using pf and carp and pfsync
(and also multiple VLANs).

After a reboot, either system will hang at 'starting network' until
pressing Ctrl-C at the console. (By 'hang' I mean no action for at
least 60 minutes; I have not waited longer than that.)

Initially I thought this was because of a hostname resolution problem,
but pf.conf and resolv.conf contain only IP addresses, not hostnames.

Also, 'pfctl -f /etc/pf.conf' runs OK from the console. Same deal with
'sh /etc/netstart' and the OpenVPN stuff in rc.local, pasted below.

Presumably something is broken after /etc/rc says 'starting network',
but what? I've read on this list one should never edit /etc/rc.

Thanks in advance for suggested techniques for debugging and fixing the
hang behavior.

dn

ps. FWIW I've pasted the contents of /etc/rc.local below. Addresses and
passwords have been obfuscated.

echo -n 'starting local daemons:'

# Add your local startup actions here.

echo '.'

# VLAN config
ifconfig vlan10 10.0.0.2 netmask 255.255.255.0 vlan 10 vlandev bge1
ifconfig vlan11 10.0.1.2 netmask 255.255.255.0 vlan 11 vlandev bge1
ifconfig vlan12 10.0.2.2 netmask 255.255.255.0 vlan 12 vlandev bge1
ifconfig vlan13 10.0.3.2 netmask 255.255.255.0 vlan 13 vlandev bge1
ifconfig vlan14 10.0.128.2 netmask 255.255.255.0 vlan 14 vlandev bge1

# 07/16/06 CARP config is here to avoid hacking netstart
ifconfig carp2 inet 10.0.0.1 netmask 255.255.255.0 broadcast 10.0.0.255 vhid 203 advskew 1 pass seekret123 carpdev vlan10
ifconfig carp3 inet 10.0.1.1 netmask 255.255.255.0 broadcast 10.0.1.255 vhid 204 advskew 1 pass seekret123 carpdev vlan11
ifconfig carp4 inet 10.0.2.1 netmask 255.255.255.0 broadcast 10.0.2.255 vhid 205 advskew 1 pass seekret123 carpdev vlan12
ifconfig carp5 inet 10.0.3.1 netmask 255.255.255.0 broadcast 10.0.3.255 vhid 206 advskew 1 pass seekret123 carpdev vlan13
ifconfig carp6 inet 10.0.128.1 netmask 255.255.255.0 broadcast 10.0.128.255 vhid 207 advskew 1 pass seekret123 carpdev vlan14

# sample static routes
/sbin/route add -net 10.0.0.0/16 10.0.1.158
# to do -- add other static routes

# DHCP helper addresses
dhcrelay -i vlan10 10.0.0.103
dhcrelay -i vlan11 10.0.0.103
dhcrelay -i vlan12 10.0.0.103
dhcrelay -i vlan13 10.0.0.103
dhcrelay -i vlan14 10.0.0.103

# start openvpn
#
if [ -x /usr/local/sbin/openvpn ]; then
/usr/local/sbin/openvpn --config /opt/openvpn-2.0/server.conf &&
echo 'opening openvpn server...' &
else
echo 'ERROR: cannot start openvpn; file /usr/local/sbin/openvpn is missing.'
fi

# start bacula
if [ -x /usr/local/libexec/bacula/bacula-ctl-fd ]; then
  /usr/local/libexec/bacula/bacula-ctl-fd start
  echo -n ' bacula-fd'
fi

# start net-snmp
if [ -x /usr/local/sbin/snmpd ]; then
echo -n ' snmpd';   /usr/local/sbin/snmpd
fi

# start apcupsd
# Start the UPS daemon. Do not remove the 'TAG_APCUPSD' text
if [ -x /etc/rc.apcupsd ]; then   # TAG_APCUPSD
  /etc/rc.apcupsd start   # TAG_APCUPSD
fi   # TAG_APCUPSD