Re: carp master <-> backup problem

2009-10-29 Thread Georg Kahest
Hello, I noticed that my netstat -s -p carp shows 1068 discarded for bad 
authentication.
My carp works okay otherwise, but should I worry about it? How do I debug 
it?




Bryan Irvine wrote:
> VVV
> >   372 discarded for unknown vhid
>
> I know someone else already pointed it out but this is worth drawing
> your attention to as well.
>
> -B




Re: carp master <-> backup problem

2009-10-28 Thread Bryan Irvine
VVV
>   372 discarded for unknown vhid


I know someone else already pointed it out but this is worth drawing
your attention to as well.

-B
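
A quick way to spot this pattern is to compare the received total with the unknown-vhid discards. When every received advertisement is discarded that way, the peer is most likely announcing a different vhid. A minimal sketch, assuming a hypothetical helper named carp_vhid_check (not an OpenBSD tool) that reads the netstat output on stdin:

```sh
#!/bin/sh
# Hypothetical helper (not part of OpenBSD): reads `netstat -s -p carp`
# output on stdin and compares the IPv4 packets received with the
# "discarded for unknown vhid" counter.
carp_vhid_check() {
    awk '
        /packets received \(IPv4\)/  { rx = $1 }
        /discarded for unknown vhid/ { bad = $1 }
        END {
            if (rx > 0 && bad == rx)
                print "all " rx " received advertisements had an unknown vhid"
            else
                print (bad + 0) " of " (rx + 0) " received advertisements had an unknown vhid"
        }
    '
}

# Usage on a carp node (left commented out so the sketch runs anywhere):
# netstat -s -p carp | carp_vhid_check
```

Fed the primary's counters from this thread (226 received, 226 discarded for unknown vhid), it reports that all 226 advertisements had an unknown vhid.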



Re: carp master <-> backup problem

2009-10-28 Thread Scott McEachern

Bryan Irvine wrote:
> I do believe preempt should be 1 on both servers. Let the advskew
> handle which one is primary.
>
> What do you see for output of 'netstat -s -p carp' and 'netstat -s -p pfsync'
>
> -B

I tried it with both servers set to preempt=1, with the same results, 
but to double check I did it again.  The results are identical to 
everything I posted previously, except (on the secondary server):


$ sysctl net.inet.carp
net.inet.carp.allow=1
net.inet.carp.preempt=1
net.inet.carp.log=2

Per your request:

(on the primary:)
$  netstat -s -p carp
carp:
   226 packets received (IPv4)
   0 packets received (IPv6)
   0 packets discarded for bad interface
   0 packets discarded for wrong TTL
   0 packets shorter than header
   0 discarded for bad checksums
   0 discarded packets with a bad version
   0 discarded because packet too short
   0 discarded for bad authentication
   226 discarded for unknown vhid
   0 discarded because of a bad address list
   387 packets sent (IPv4)
   0 packets sent (IPv6)
   0 send failed due to mbuf memory error
   1 transition to master

(on the secondary:)
$  netstat -s -p carp
carp:
   335 packets received (IPv4)
   0 packets received (IPv6)
   0 packets discarded for bad interface
   0 packets discarded for wrong TTL
   0 packets shorter than header
   0 discarded for bad checksums
   0 discarded packets with a bad version
   0 discarded because packet too short
   0 discarded for bad authentication
   335 discarded for unknown vhid
   0 discarded because of a bad address list
   236 packets sent (IPv4)
   0 packets sent (IPv6)
   0 send failed due to mbuf memory error
   1 transition to master

This was done after a clean reboot of both servers, and after I accessed 
the site from an external shell account I have (using lynx).  The 
secondary still responds first, and when it is taken offline (halt -p), 
the primary does not take over (no answer).  The primary only takes over 
normal duties when the hostname.carp0 file has been renamed on the 
secondary, the secondary has actually been rebooted and sh /etc/netstart 
has been run on the primary.  After the secondary was taken offline, and 
sh /etc/netstart run on the primary, I accessed the site again (the 
primary is then the only carp node), and ran this on the primary:


$ netstat -s -p carp
carp:
   372 packets received (IPv4)
   0 packets received (IPv6)
   0 packets discarded for bad interface
   0 packets discarded for wrong TTL
   0 packets shorter than header
   0 discarded for bad checksums
   0 discarded packets with a bad version
   0 discarded because packet too short
   0 discarded for bad authentication
   372 discarded for unknown vhid
   0 discarded because of a bad address list
   704 packets sent (IPv4)
   0 packets sent (IPv6)
   0 send failed due to mbuf memory error
   1 transition to master

As for output regarding pfsync, all values are zero because I do not use 
pfsync.  It is a single firewall with two web servers internally, not a 
redundant firewall situation.  No changes have been made to the firewall 
at all.


I'm at my wit's end as to why this doesn't work.  It *must* be something 
wrong with my config, as I just don't believe it's a "bug" in carp.  
This config is practically straight out of the FAQ so I'm at a total 
loss. :(


FWIW, the pf.conf on the firewall uses these values (which normally work 
fine):

(...)
gw_ext=$ext_ip4 <-- my external IP addy for that web site, I have 5 IPs
gw_int="192.168.0.9" <-- the carp node, or when not using carp, the 
primary web server
#gw_int="192.168.0.19"  <-- for when I manually switch to the secondary 
server

gw_ports="{ 80, 443 }"
int0_if="xl0"
tcp_flags="flags S/SA modulate state"
(...)
not_private="{ \
   !0.0.0.0/8, \
   !10.0.0.0/8, \
   !127.0.0.0/8, \
   !169.254.0.0/16, \
   !172.16.0.0/12, \
   !192.8.2.0/24, \
   !192.168.0.0/16, \
   !240.0.0.0/4, \
   !255.255.255.255/32 \
}"
(...)
rdr on $ext_if proto tcp from $not_private to $gw_ext port \
   $gw_ports -> $gw_int
(...)
pass in log quick on $ext_if inet proto tcp from $not_private to $gw_int \
   port $gw_ports flags S/SA synproxy state
(...)
pass out quick on $int0_if proto tcp from $not_private to $gw_int \
   port $gw_ports $tcp_flags

The firewall config has worked fine and hasn't been changed in ages, but 
I can't help wondering if something there is screwing up carp.  Redoing and 
simplifying the fw rules (using tags) is next on my todo list, but I 
figured I'd get carp working first before changing a "known good" fw 
config and adding another change to the mix.


--

-RSM

http://www.erratic.ca



Re: carp master <-> backup problem

2009-10-28 Thread Scott McEachern

Peter Hessler wrote:
> On 2009 Oct 28 (Wed) at 01:55:40 -0400 (-0400), Scott wrote:
> :$ cat /etc/hostname.carp0:
> :inet 192.168.0.9 255.255.255.0 192.168.0.255 vhid 1 carpdev fxp0
> -snip-
> :$ cat /etc/hostname.carp0
> :inet 192.168.0.9 255.255.255.0 192.168.0.255 vhid 2 advbase 1 advskew
> :100 carpdev xl0
>
> The vhids need to be identical.

And therein lies the solution.  I misunderstood the documents and 
thought that each carp node had a unique vhid.


I've since tested with both online, the master offline, then put back, 
etc. and all works *perfectly* fine now!  I knew it was my bad.


Thank you very much for pointing out my error, and to the others that 
helped out.  I'm sorry for the noise.


BTW: I forgot to mention this, but thanks to all the folks involved with 
4.6.  The CDs arrived just outside of Toronto on 19 Oct (Monday last 
week.)  :) :)


--

-RSM

http://www.erratic.ca



Re: carp master <-> backup problem

2009-10-28 Thread Peter Hessler
On 2009 Oct 28 (Wed) at 01:55:40 -0400 (-0400), Scott wrote:
:$ cat /etc/hostname.carp0:
:inet 192.168.0.9 255.255.255.0 192.168.0.255 vhid 1 carpdev fxp0
-snip-
:$ cat /etc/hostname.carp0
:inet 192.168.0.9 255.255.255.0 192.168.0.255 vhid 2 advbase 1 advskew
:100 carpdev xl0

The vhids need to be identical.
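
As a sketch of what that means for the two files quoted above: both nodes share one vhid, and advskew alone marks the secondary as the less preferred node. The vhid_of helper below is hypothetical and only pulls the vhid out of a hostname.carp0 line so the two can be compared. The secondary's entry is shown joined onto one line, with vhid 1 assumed as the shared value.

```sh
#!/bin/sh
# Hypothetical helper: print the first vhid found in hostname.carp0-style
# text on stdin.
vhid_of() {
    sed -n 's/.*vhid \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# The two configurations as they might look after the fix: same vhid on
# both nodes, advskew 100 only on the secondary.
primary='inet 192.168.0.9 255.255.255.0 192.168.0.255 vhid 1 carpdev fxp0'
secondary='inet 192.168.0.9 255.255.255.0 192.168.0.255 vhid 1 advbase 1 advskew 100 carpdev xl0'

if [ "$(echo "$primary" | vhid_of)" = "$(echo "$secondary" | vhid_of)" ]; then
    echo "vhids match"
else
    echo "vhids differ"
fi
```

Run against the original files (vhid 1 and vhid 2), the same comparison would print "vhids differ".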


-- 
Legalize free-enterprise murder: why should governments have all the
fun?



Re: carp master <-> backup problem

2009-10-28 Thread Michiel van Baak
On 01:55, Wed 28 Oct 09, Scott wrote:
> I must be missing something in my config, and I'd appreciate it if my
> blunder could be pointed out to me.
>

[snip]

Do you have pf enabled ?
If so, make sure you allow carp traffic on the physical interface that
runs carp.
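
For illustration only, a rule along these lines in pf.conf on a carp node would let the advertisements through. fxp0 is the primary's carpdev from this thread; substitute the real interface, and treat this as a sketch rather than a tested ruleset:

```
# pf.conf fragment (sketch): allow CARP advertisements on the carpdev
pass quick on fxp0 proto carp keep state
```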
-- 

Michiel van Baak
mich...@vanbaak.eu
http://michiel.vanbaak.eu
GnuPG key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x71C946BD

"Why is it drug addicts and computer aficionados are both called users?"



Re: carp master <-> backup problem

2009-10-28 Thread Bryan Irvine
On Tue, Oct 27, 2009 at 10:55 PM, Scott  wrote:
> I must be missing something in my config, and I'd appreciate it if my
> blunder could be pointed out to me.
>
> I have two web servers behind a firewall (all machines are running
> 4.6-stable, generic kernel).  The firewall has rdr & pass rules to both web
> servers, with one commented out at a time.  I change it manually when I
> want to switch them.  This same setup has been working fine since 4.4.
> Generally, pf routes web traffic to the primary web server (192.168.0.9)
> but sometimes I use its twin at 192.168.0.19.
>
> Today I decided to try using carp to *not* load balance, but use the
> primary and have the secondary kick in when I have the primary offline
> for maintenance instead of me changing the pf rule by hand.  Simple
> enough.  I read the man pages for carp and ifconfig, and read the
> example in the FAQ.  (This will eventually be load balanced in the
> future if I can get MySQL clustering to work on OpenBSD... haven't tried
> that yet.)
>
> The problem is that when I access my site from an external account, my
> primary never gets used, the secondary takes all connections, and to make
> it worse, if the secondary (which is being used first) is taken offline, the
> primary doesn't even get touched.  I have to delete the carp i/f on the
> secondary and reboot the primary for web access to go back to normal.
>
> On the primary web server:
>
> $ sysctl net.inet.carp
> net.inet.carp.allow=1
> net.inet.carp.preempt=1
> net.inet.carp.log=2
>
> $ cat /etc/hostname.carp0:
> inet 192.168.0.9 255.255.255.0 192.168.0.255 vhid 1 carpdev fxp0
>
> $ cat /etc/hostname.fxp0
> inet 192.168.0.2 255.255.255.0 NONE media 100baseTX mediaopt full-duplex
> inet alias 192.168.0.9 255.255.255.0
> inet alias 192.168.0.10 255.255.255.0
> inet alias 192.168.0.11 255.255.255.0
> inet alias 192.168.0.12 255.255.255.0
> inet alias 192.168.0.13 255.255.255.0
>
> $ ifconfig carp0
> carp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>   lladdr 00:00:5e:00:01:01
>   priority: 0
>   carp: MASTER carpdev fxp0 vhid 1 advbase 1 advskew 0
>   groups: carp
>   inet6 fe80::200:5eff:fe00:101%carp0 prefixlen 64 scopeid 0x5
>   inet 192.168.0.9 netmask 0xff00 broadcast 192.168.0.255
>
>
> On the secondary web server:
>
> $ sysctl net.inet.carp
> net.inet.carp.allow=1
> net.inet.carp.preempt=0
> net.inet.carp.log=2
>
> $ cat /etc/hostname.carp0
> inet 192.168.0.9 255.255.255.0 192.168.0.255 vhid 2 advbase 1 advskew
> 100 carpdev xl0
>
> $ cat /etc/hostname.xl0
> inet 192.168.0.3 255.255.255.0 NONE media 100baseTX mediaopt full-duplex
> inet alias 192.168.0.20 255.255.255.0
> inet alias 192.168.0.21 255.255.255.0
> inet alias 192.168.0.22 255.255.255.0
> inet alias 192.168.0.23 255.255.255.0
>
> $ ifconfig carp0
> carp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>   lladdr 00:00:5e:00:01:02
>   priority: 0
>   carp: MASTER carpdev xl0 vhid 2 advbase 1 advskew 100
>   groups: carp
>   inet6 fe80::200:5eff:fe00:102%carp0 prefixlen 64 scopeid 0x5
>   inet 192.168.0.9 netmask 0xff00 broadcast 192.168.0.255
>
>
> I have tried making slight changes to the hostname files, such as
> including "advbase 1 advskew 1" to the primary, adding and removing the
> alias for .9 on the master, changing preempt=1 on the secondary, and none
> of it makes any difference.  I continually see what (I think) should be the
> backup on the secondary server shown as a master (above), and it takes all
> the web server connections.  Other than my carp experiments, everything
> works perfectly fine.  I must be missing
> something, somewhere, but I'm out of clues.  Any pointers in the right
> direction would be appreciated.
> Thanks.
>
> --
>
> -RSM
>
>

I do believe preempt should be 1 on both servers. Let the advskew
handle which one is primary.

What do you see for output of 'netstat -s -p carp' and 'netstat -s -p pfsync'

-B
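
To make that setting persist across reboots, the usual place is /etc/sysctl.conf on both servers. A sketch, using the same value shown in the sysctl output quoted above:

```
# /etc/sysctl.conf on both carp nodes
net.inet.carp.preempt=1
```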



Re: carp master <-> backup problem

2009-10-28 Thread Scott

Marco Pfatschbacher wrote:
> Hi,
>
> I actually didn't read your entire mail..
> but:
>
> Having 192.168.0.9 on both the physical and the carp interface
> cannot really work.

Thanks for trying!  Unfortunately, I tried that as well (and double 
checked it again after your reply) where the carp IP is not assigned 
anywhere else.  Still the problem remains: the backup (secondary server) 
insists on being the master, and it is given priority when the firewall 
sends web traffic to the 192.168.0.9 address.


Unfortunately, the ifconfig output, with both machines reading "MASTER", 
remains 100% identical to that in my original message, so I've ruled 
out that it's somehow a problem with the addresses being aliases.  I 
still have to mv the /etc/hostname.carp0 file to anything else and 
reboot for web traffic to flow to the primary server.  Grr.


--

-RSM

http://www.erratic.ca



carp master <-> backup problem

2009-10-27 Thread Scott

I must be missing something in my config, and I'd appreciate it if my
blunder could be pointed out to me.

I have two web servers behind a firewall (all machines are running
4.6-stable, generic kernel).  The firewall has rdr & pass rules to both 
web servers, with one commented out at a time.  I change it manually 
when I want to switch them.  This same setup has been working fine since 
4.4.  Generally, pf routes web traffic to the primary web server 
(192.168.0.9) but sometimes I use its twin at 192.168.0.19.


Today I decided to try using carp to *not* load balance, but use the
primary and have the secondary kick in when I have the primary offline
for maintenance instead of me changing the pf rule by hand.  Simple
enough.  I read the man pages for carp and ifconfig, and read the
example in the FAQ.  (This will eventually be load balanced in the
future if I can get MySQL clustering to work on OpenBSD... haven't tried
that yet.)

The problem is that when I access my site from an external account, my
primary never gets used, the secondary takes all connections, and to 
make it worse, if the secondary (which is being used first) is taken 
offline, the primary doesn't even get touched.  I have to delete the 
carp i/f on the secondary and reboot the primary for web access to go 
back to normal.


On the primary web server:

$ sysctl net.inet.carp
net.inet.carp.allow=1
net.inet.carp.preempt=1
net.inet.carp.log=2

$ cat /etc/hostname.carp0:
inet 192.168.0.9 255.255.255.0 192.168.0.255 vhid 1 carpdev fxp0

$ cat /etc/hostname.fxp0
inet 192.168.0.2 255.255.255.0 NONE media 100baseTX mediaopt full-duplex
inet alias 192.168.0.9 255.255.255.0
inet alias 192.168.0.10 255.255.255.0
inet alias 192.168.0.11 255.255.255.0
inet alias 192.168.0.12 255.255.255.0
inet alias 192.168.0.13 255.255.255.0

$ ifconfig carp0
carp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
   lladdr 00:00:5e:00:01:01
   priority: 0
   carp: MASTER carpdev fxp0 vhid 1 advbase 1 advskew 0
   groups: carp
   inet6 fe80::200:5eff:fe00:101%carp0 prefixlen 64 scopeid 0x5
   inet 192.168.0.9 netmask 0xff00 broadcast 192.168.0.255


On the secondary web server:

$ sysctl net.inet.carp
net.inet.carp.allow=1
net.inet.carp.preempt=0
net.inet.carp.log=2

$ cat /etc/hostname.carp0
inet 192.168.0.9 255.255.255.0 192.168.0.255 vhid 2 advbase 1 advskew
100 carpdev xl0

$ cat /etc/hostname.xl0
inet 192.168.0.3 255.255.255.0 NONE media 100baseTX mediaopt full-duplex
inet alias 192.168.0.20 255.255.255.0
inet alias 192.168.0.21 255.255.255.0
inet alias 192.168.0.22 255.255.255.0
inet alias 192.168.0.23 255.255.255.0

$ ifconfig carp0
carp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
   lladdr 00:00:5e:00:01:02
   priority: 0
   carp: MASTER carpdev xl0 vhid 2 advbase 1 advskew 100
   groups: carp
   inet6 fe80::200:5eff:fe00:102%carp0 prefixlen 64 scopeid 0x5
   inet 192.168.0.9 netmask 0xff00 broadcast 192.168.0.255


I have tried making slight changes to the hostname files, such as
including "advbase 1 advskew 1" to the primary, adding and removing the
alias for .9 on the master, changing preempt=1 on the secondary, and 
none of it makes any difference.  I continually see what (I think) 
should be the backup on the secondary server shown as a master (above), 
and it takes all the web server connections.  Other than my carp 
experiments, everything works perfectly fine.  I must be missing
something, somewhere, but I'm out of clues.  Any pointers in the right
direction would be appreciated.
Thanks.

--

-RSM
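
One detail in the two ifconfig outputs above: the last byte of the carp lladdr follows the vhid (00:00:5e:00:01:01 for vhid 1, 00:00:5e:00:01:02 for vhid 2). Two nodes showing different virtual MACs are therefore not in the same carp group. A small sketch of that mapping follows. carp_lladdr is a hypothetical helper name, and the fixed prefix is taken from the outputs in this message.

```sh
#!/bin/sh
# Hypothetical helper: print the virtual link-layer address a carp
# interface would show for a given vhid, using the 00:00:5e:00:01 prefix
# seen in the ifconfig outputs above.
carp_lladdr() {
    printf '00:00:5e:00:01:%02x\n' "$1"
}

carp_lladdr 1   # prints 00:00:5e:00:01:01 (the primary's carp0)
carp_lladdr 2   # prints 00:00:5e:00:01:02 (the secondary's carp0)
```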