Re: Incorrect NAT translation for sip traffic ?

Stuart Henderson Thu, 23 Jun 2011 02:56:37 -0700

On 2011-06-23, Magnus Rixtorp <[email protected]> wrote:
> pass out quick log on $ext_if inet from 192.168.0.0/24 nat-to $ext_if
> pass out quick log on $ext_if inet from 192.168.230.0/24 nat-to $ext_if
> pass out quick log on $ext_if inet from 192.168.231.0/24 nat-to $ext_if
> pass out quick log on $ext_if inet from 192.168.239.0/24 nat-to $ext_if
> pass out quick log on $ext_if inet from 192.168.240.0/24 nat-to $ext_if
> pass out quick log on $ext_if inet from 192.168.241.0/24 nat-to $ext_if
> pass out quick log on $ext_if inet from 192.168.242.0/24 nat-to $ext_if


This probably isn't your problem, but that is quite a lot of networks
to NAT behind a single IP, especially with the default port range
(50001:65535). If you have a lot of active NATed states, finding a
free port can involve a bunch of state lookups (pick a random port,
look up state to see if it's in use, then search sequentially for a
free port, looking up state each time).
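
To get a feel for how that search cost grows, here is a toy model in
Python. It is only a sketch of the "random start, then sequential
search" idea described above, not PF's actual code. It assumes every
state goes to the same destination (so a used port really is
unavailable), and the function names are made up for illustration.

```python
import random


def lookups_to_find_port(used, low=50001, high=65535, rng=random):
    """Toy model: pick a random start port, then walk sequentially
    (wrapping around) until a port not in `used` is found.
    Returns (port, number_of_lookups); port is None if all are taken."""
    size = high - low + 1
    start = rng.randint(low, high)
    lookups = 0
    for offset in range(size):
        port = low + (start - low + offset) % size
        lookups += 1  # one simulated state lookup per candidate port
        if port not in used:
            return port, lookups
    return None, lookups


def average_lookups(fill, trials=2000, low=50001, high=65535, seed=1):
    """Average lookups per search when a fraction `fill` of the range
    is already occupied by randomly placed states."""
    rng = random.Random(seed)
    ports = list(range(low, high + 1))
    used = set(rng.sample(ports, int(len(ports) * fill)))
    total = sum(lookups_to_find_port(used, low, high, rng)[1]
                for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    for fill in (0.10, 0.50, 0.90, 0.99):
        avg = average_lookups(fill)
        print(f"{fill:.0%} of ports in use: ~{avg:.1f} lookups per search")
```

In this model the cost stays small until the range is nearly full and
then climbs quickly, which is why a wider range or more addresses helps.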

So if you do have a lot of states you might want to either add more IPs
or increase the port range available (e.g. pass...nat-to $ext_if \
port 20000:65535) and adjust the net.inet.ip.port* sysctls for
connections coming from the firewall itself (to make sure you have
some free ports which don't conflict with the range used by that
PF rule).
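
For scale, a quick bit of arithmetic in Python on the two ranges
mentioned above. The "local" range in the overlap check is a
hypothetical value chosen for illustration, not a claim about what your
sysctls are currently set to; check your own net.inet.ip.port* values.

```python
def parse_range(port_range):
    """Parse a 'low:high' port range string into a (low, high) tuple."""
    low, high = (int(p) for p in port_range.split(":"))
    if not 0 < low <= high <= 65535:
        raise ValueError(f"invalid port range: {port_range}")
    return low, high


def port_count(port_range):
    """Number of ports in an inclusive 'low:high' range."""
    low, high = parse_range(port_range)
    return high - low + 1


def ranges_overlap(a, b):
    """True if two inclusive 'low:high' ranges share any port."""
    a_low, a_high = parse_range(a)
    b_low, b_high = parse_range(b)
    return a_low <= b_high and b_low <= a_high


default_nat = "50001:65535"   # default nat-to range mentioned above
wider_nat = "20000:65535"     # example range from the rule above
local_ports = "1024:19999"    # hypothetical range for the firewall itself

print(port_count(default_nat), "ports in", default_nat)
print(port_count(wider_nat), "ports in", wider_nat)
print("overlaps local range:", ranges_overlap(wider_nat, local_ports))
```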

> Jun 15 09:41:21 pbxfw /bsd: pf: state key linking mismatch! dir=OUT, 
> if=re0, stored af=2, a0: 130.244.190.46:5060, a1: 192.168.230.101:5060, 
> proto=17, found af=2, a0: 192.168.230.101:5060, a1: 
> 187.170.255.239:5060, proto=17
> Jun 17 12:02:55 pbxfw /bsd: pf: state key linking mismatch! dir=OUT, 
> if=re0, stored af=2, a0: 130.244.190.46:5060, a1: 192.168.230.101:5060, 
> proto=17, found af=2, a0: 192.168.230.101:5060, a1: 
> 187.170.255.239:5060, proto=17
>
> This is the only error output I've found related to the problem.
>
> So the problem, has to do with the ip 187.170.255.239,
> 239.255.170.187.in-addr.arpa domain name pointer 
> dsl-187-170-255-239-dyn.prod-infinitum.com.mx.
> Our system has no relation at all with this ip.
> But somehow our NAT translation, at random intervals, decides to 
> redirect traffic to that IP instead of the intended destination.
> So far we have primarily noted the problem towards 130.244.190.46 and 
> 130.244.190.42, which are our provider's SIP gateways, since the only
> thing being used on the connection is a PBX solution.
>
> A Google search on that particular IP gives a similar dmesg error output in 
> this post:
> http://www.mail-archive.com/[email protected]/msg95116.html
> But in his case, the system hangs, our system keeps on going.
> And instead interferes with the connection of phonecalls.
>
> Since the problem was discovered I've set up pf to log the first packet 
> of every new state, and that log is run through
> tcpdump -n -e -ttt -s 1600 -vvv -XX into an ASCII log using the
> http://www.openbsd.org/faq/pf/logging.html syslog method.
>
> Jun 22 15:40:06.212694 rule 26/(match) [uid 0, pid 20284] pass in on 
> bge0: 130.244.190.46.5060 > 212.247.80.66.5060: udp 442 (DF) [tos 0xb8] 
> (ttl 56, id 0, len 470)
>    0000: 45b8 01d6 0000 4000 3811 da02 82f4 be2e 
> E\M-8.\M-V..@.8.\M-Z..\M-t\M->.
>    0010: d4f7 5042 13c4 13c4 01c2 f6b9 4259 4520 
> \M-T\M-wPB.\M-D.\M-D.\M-B\M-v\M-9BYE
>    0020: 7369 703a 3835 3933 4032 3132 2e32 3437  sip:8593@212.247
>    0030: 2e38 302e 3636 2053 4950 2f32            .80.66 SIP/2
>
> Jun 22 15:40:06.307515 rule 60/(match) [uid 0, pid 20284] pass in on 
> re0: 192.168.230.101.5060 > 187.170.255.239.5060: udp 550 (ttl 64, id 
> 33961, len 578)
>    0000: 4500 0242 84a9 0000 4011 9159 c0a8 e665 
> E..B.\M-)..@..Y\M-@\M-(\M-fe
>    0010: bbaa ffef 13c4 13c4 022e 9dc3 5349 502f 
> \M-;\M-*\M^?\M-o.\M-D.\M-D...\M-CSIP/
>    0020: 322e 3020 3230 3020 4f4b 0d0a 5669 613a  2.0 200 OK..Via:
>    0030: 2053 4950 2f32 2e30 2f55 4450             SIP/2.0/UDP

Considering this snippet alone, there's no indication of a problem
with PF. The packet is logged coming in on re0 already addressed to
187.170.255.239, so it looks to me like 192.168.230.101 is itself
sending packets there; maybe your PBX software is confused.
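
If you want to double-check the addresses without trusting tcpdump's
decoding, the hex in the quoted dump can be parsed by hand. A minimal
Python sketch, assuming a plain 20-byte IPv4 header followed by UDP (the
hex strings are copied from the two "pass in" packets quoted above; the
function name is just for illustration):

```python
import ipaddress


def parse_ipv4_udp(hex_dump):
    """Decode TTL, protocol, addresses and UDP ports from the first
    24 bytes of a hex dump (assumes IHL = 5, i.e. no IP options)."""
    raw = bytes.fromhex("".join(hex_dump.split()))
    return {
        "ttl": raw[8],
        "proto": raw[9],
        "src": str(ipaddress.IPv4Address(raw[12:16])),
        "dst": str(ipaddress.IPv4Address(raw[16:20])),
        "sport": int.from_bytes(raw[20:22], "big"),
        "dport": int.from_bytes(raw[22:24], "big"),
    }


# Packet logged "pass in on bge0" (from the provider's gateway).
from_provider = "45b8 01d6 0000 4000 3811 da02 82f4 be2e d4f7 5042 13c4 13c4"
# Packet logged "pass in on re0" (from the PBX).
from_pbx = "4500 0242 84a9 0000 4011 9159 c0a8 e665 bbaa ffef 13c4 13c4"

print(parse_ipv4_udp(from_provider))
print(parse_ipv4_udp(from_pbx))
```

The second packet already carries 187.170.255.239 as its destination
when it arrives from the PBX on the internal interface.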

I would look at packets on the inbound/outbound interfaces rather
than pflog and see what addresses show up there. ("tcpdump -Xs1500
-nire0 port 5060" or something, and same for bge0).

The xxx.255.239 makes me wonder if the PBX is trying to do some
multicast thing and getting the byte-order wrong (239.255.xxx would
be a multicast address).
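
The byte-order guess is easy to check with Python's ipaddress module.
This only shows that the reversed address falls in multicast space; it
doesn't prove that this is what the PBX is actually doing.

```python
import ipaddress


def reverse_octets(addr):
    """Return the IPv4 address with its four octets in reverse order."""
    packed = ipaddress.IPv4Address(addr).packed
    return ipaddress.IPv4Address(packed[::-1])


seen = ipaddress.IPv4Address("187.170.255.239")
swapped = reverse_octets("187.170.255.239")

print(seen, "multicast?", seen.is_multicast)
print(swapped, "multicast?", swapped.is_multicast)
```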

> Jun 22 15:40:06.307526 rule 0/(match) [uid 0, pid 20284] pass out on 
> bge0: 192.168.230.101.5060 > 187.170.255.239.5060: udp 550 (ttl 63, id 
> 33961, len 578, bad cksum 9159! differs by 100)
>    0000: 4500 0242 84a9 0000 3f11 9159 c0a8 e665 
> E..B.\M-)..?..Y\M-@\M-(\M-fe
>    0010: bbaa ffef 13c4 13c4 022e 9dc3 5349 502f 
> \M-;\M-*\M^?\M-o.\M-D.\M-D...\M-CSIP/
>    0020: 322e 3020 3230 3020 4f4b 0d0a 5669 613a  2.0 200 OK..Via:
>    0030: 2053 4950 2f32 2e30 2f55 4450             SIP/2.0/UDP
>
> And on a side note, if anyone has a suggestion for how to get the 
> complete packet logged, and not just the first snap, it would be nice;
> OpenBSD tcpdump seems not to support -s 0 as snaplen to get the whole 
> thing.

See tcpdump(8) about -s (or ngrep has fairly clear formatting for
reading inside SIP packets, "ngrep -d re0 -W byline port 5060",
though it displays less information from the IP/UDP headers).

> Anyway, that log snippet is 130.244.190.46 asking us to set up a SIP 
> connection with them on 5060,
> but our response to that IP goes to 187.170.255.239, and the connection 
> fails.
>
> Another side note is the rampant amount of bad cksum on UDP 
> traffic, if anyone would care to chime in about that:
> about 98% of all UDP packets get a bad cksum.

See tcpdump(8) about IP Checksum Offload.
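
As a worked example, the IP header checksum of the quoted re0/bge0
packet pair can be recomputed by hand. This Python sketch uses the hex
from the dumps above and assumes a 20-byte header without options. The
inbound header checks out, while the outbound copy has the TTL already
decremented (0x40 to 0x3f) but still carries the old checksum, which
would be consistent with the checksum being filled in after the point
where the packet was captured.

```python
def ipv4_header_checksum(hex_dump):
    """Compute the IPv4 header checksum (ones' complement sum of the
    16-bit header words, with the checksum field treated as zero)."""
    raw = bytearray(bytes.fromhex("".join(hex_dump.split())))
    header_len = (raw[0] & 0x0F) * 4     # IHL field, in bytes
    header = raw[:header_len]
    header[10:12] = b"\x00\x00"          # zero the stored checksum
    total = sum(int.from_bytes(header[i:i + 2], "big")
                for i in range(0, header_len, 2))
    while total >> 16:                   # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def stored_checksum(hex_dump):
    """Read the checksum field as it appears in the dump."""
    raw = bytes.fromhex("".join(hex_dump.split()))
    return int.from_bytes(raw[10:12], "big")


in_on_re0 = "4500 0242 84a9 0000 4011 9159 c0a8 e665 bbaa ffef"
out_on_bge0 = "4500 0242 84a9 0000 3f11 9159 c0a8 e665 bbaa ffef"

for name, hdr in (("in on re0", in_on_re0), ("out on bge0", out_on_bge0)):
    print(f"{name}: stored {stored_checksum(hdr):04x}, "
          f"computed {ipv4_header_checksum(hdr):04x}")
```

The stored and computed values for the outbound header differ by 0x100,
matching the "differs by 100" that tcpdump printed.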


>
> but my main problem and concern is this 187.170.255.239, and why they 
> should get my phonecalls.
>
> Regards
>
> Magnus
