Hi,

I think this is state related, as that's when I've seen this symptom
before: traffic being dropped even though there are rules to permit it. In
this case, though, I can't see why.

I have two OpenBSD hosts, both doing BGP. Let's call them gw1 and gw2 here.
I've replaced my real loopback IP with 172.16.0.x. When running rpki-client
on them, I found one had double the ROAs of the other. After some
investigation, I found that RIPE is missing from gw2 because it can't talk
to rpki.ripe.net.

I'm using "!route sourceaddr -ifp lo1" on both hosts, so outgoing traffic
from the hosts themselves originates from the loopback address.

If I ping with a source address of the loopback, it works from gw1, but not
from gw2:

ichilton@gw1:~$ ping -I 172.16.0.90 -v rpki.ripe.net
PING rpki.ripe.net (172.16.0.90 --> 193.0.6.138): 56 data bytes
64 bytes from 193.0.6.138: icmp_seq=0 ttl=252 time=7.671 ms
64 bytes from 193.0.6.138: icmp_seq=1 ttl=252 time=7.580 ms

ichilton@gw2:~$ ping -I 172.16.0.91 -v rpki.ripe.net
PING rpki.ripe.net (172.16.0.91 --> 193.0.6.138): 56 data bytes
^C
--- rpki.ripe.net ping statistics ---
2 packets transmitted, 0 packets received, 100.0% packet loss

The outbound path for that is out of a connected transit interface on gw2,
and the inbound (return) path is via a transit interface on gw1.

When pinging from gw2, I see the echo request go out the correct interface
on gw2:

root@gw2:~# tcpdump -i vlan367 -n host 193.0.6.138
tcpdump: listening on vlan367, link-type EN10MB
09:32:18.837929 172.16.0.91 > 193.0.6.138: icmp: echo request
09:32:19.837919 172.16.0.91 > 193.0.6.138: icmp: echo request

I see the reply come in on the transit interface on gw1:

root@gw1:~# tcpdump -i vlan313 -n host 193.0.6.138
tcpdump: listening on vlan313, link-type EN10MB
09:33:00.835072 193.0.6.138 > 172.16.0.91: icmp: echo reply (DF)
09:33:01.835271 193.0.6.138 > 172.16.0.91: icmp: echo reply (DF)

From there it should be routed to gw2 over the linknet interface, vlan409.
However, the replies never appear there. They are dropped somewhere between
the two interfaces on gw1.

This is where it gets interesting.

The relevant parts of my ruleset are:

set skip on lo
block all
pass out quick on linknet from (self)
pass out quick on { admin, external, linknet } proto { tcp, udp }
pass quick proto { icmp, icmp6 }

So ICMP is allowed unconditionally, as is outgoing traffic from the host
itself on the linknet.
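
To spell out my reading of those rules, here is the same excerpt annotated
with how I understand pf to evaluate it (the last matching rule wins unless
a rule is marked 'quick', which ends evaluation on match). The comments are
my interpretation, not verified behaviour, and the admin, external and
linknet macros are defined elsewhere in my pf.conf:

# no filtering at all on the loopback interfaces
set skip on lo
# default deny, both directions; not 'quick', so a later match can override
block all
# traffic originated by this host, leaving via the linknet
pass out quick on linknet from (self)
# any TCP/UDP leaving via these interfaces, forwarded or local
pass out quick on { admin, external, linknet } proto { tcp, udp }
# ICMP and ICMPv6 on any interface, in or out; 'quick' ends evaluation
pass quick proto { icmp, icmp6 }

On that reading, a forwarded echo reply should match the last rule both
when it arrives on the transit interface and when it leaves on vlan409.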

If I disable pf or comment out the 'block all', then it instantly starts
working: I see the echo replies start flowing on the linknet (vlan409)
interface and pings succeed. As soon as I reinstate 'block all', it stops
again. If I put 'pass all' above or below 'block all', it doesn't help, and
neither does 'pass quick all', 'pass quick on vlan409' or anything else
that should otherwise pass the traffic (which is unsurprising, as I've
already got 'pass quick proto icmp').

If I add 'log' to the block all, I can see on pflog0 that it's the 'block
all' rule which is blocking it.
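
For reference, a minimal sketch of that logging setup, assuming the default
pflog0 interface. The tcpdump flags are the usual ones for reading pflog
rather than anything specific to my setup:

# pf.conf: same default deny, but log whatever it drops
block log all
# then watch the log interface, filtered to the RIPE host:
#   tcpdump -n -e -ttt -i pflog0 host 193.0.6.138

With -e, each logged packet is printed with the rule number, action,
direction and interface that matched.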

I've seen similar behaviour before, prior to setting up pfsync, but in this
case pfsync seems to be working fine: both hosts have a state entry, which
is created when I start the pings:

root@gw2:~# pfctl -ss |grep 193.0.6.138
all icmp 172.16.0.91:8123 -> 193.0.6.138:8       0:0

root@gw1:~# pfctl -ss |grep 193.0.6.138
all icmp 172.16.0.91:8123 -> 193.0.6.138:8       0:0

Is anyone able to shed any light on what's going on?

Thanks,

Ian
