Strange packets lost

2013-09-25 Thread Loïc BLOT
Hello all,
I have looked into many options but I have run out of ideas.

I have 4 OpenBSD routers (2 on each site). Each router creates a GRE
tunnel with its peer.

Here is the configuration:

                     | S1R1 --- gre + ospf --- S2R1 |
LAN S1 (OSPF <> RIP) |                              | LAN S2 (OSPF <> RIP)
                     | S1R2 --- gre + ospf --- S2R2 |

The routing rules are correct: ssh, http(s), smtp, ntp, ldap and many
other protocols work as expected between the two sites.

But I have a problem with my Avaya phones on S2, which need to contact
the S1 gatekeeper. Some packets are lost and, even after sniffing every
interface, I can't find where the packets go.

If I capture on the LAN S1 link, I get this:

10:06:24.003479 192.168.238.121.56641 > 192.168.106.38.411: S 2621611805:2621611805(0) win 5840 <mss 1460,sackOK,timestamp 4294948803 0,nop,wscale 4> (DF)
10:06:24.003607 192.168.106.38.411 > 192.168.238.121.56641: S 3090220105:3090220105(0) ack 2621611806 win 5840 <mss 1460,nop,wscale 7> (DF)
10:06:24.018842 192.168.238.121.56641 > 192.168.106.38.411: . ack 1 win 365 (DF)
10:06:24.023582 192.168.238.121.56641 > 192.168.106.38.411: P 1:74(73) ack 1 win 365 (DF)
10:06:24.023710 192.168.106.38.411 > 192.168.238.121.56641: . ack 74 win 46 (DF)
10:06:24.024086 192.168.106.38.411 > 192.168.238.121.56641: . 1:1461(1460) ack 74 win 46 (DF)
10:06:24.024329 192.168.106.38.411 > 192.168.238.121.56641: . 1461:2921(1460) ack 74 win 46 (DF)
10:06:27.017704 192.168.106.38.411 > 192.168.238.121.56641: . 1:1461(1460) ack 74 win 46 (DF)
10:06:33.017772 192.168.106.38.411 > 192.168.238.121.56641: . 1:1461(1460) ack 74 win 46 (DF)
10:06:45.017907 192.168.106.38.411 > 192.168.238.121.56641: . 1:1461(1460) ack 74 win 46 (DF)
10:07:09.018198 192.168.106.38.411 > 192.168.238.121.56641: . 1:1461(1460) ack 74 win 46 (DF)
10:07:57.018732 192.168.106.38.411 > 192.168.238.121.56641: . 1:1461(1460) ack 74 win 46 (DF)
10:08:24.019074 192.168.106.38.411 > 192.168.238.121.56641: FP 2921:4273(1352) ack 74 win 46 (DF)
10:08:24.034803 192.168.238.121.56641 > 192.168.106.38.411: . ack 1 win 365 (DF)

If I capture on the GRE tunnel, I get this:

10:06:23.987975 192.168.238.121.56641 > 192.168.106.38.411: S 2621611805:2621611805(0) win 5840 <mss 1460,sackOK,timestamp 4294948803 0,nop,wscale 4> (DF)
10:06:24.003614 192.168.106.38.411 > 192.168.238.121.56641: S 3090220105:3090220105(0) ack 2621611806 win 5840 <mss 1460,nop,wscale 7> (DF)
10:06:24.018833 192.168.238.121.56641 > 192.168.106.38.411: . ack 1 win 365 (DF)
10:06:24.023573 192.168.238.121.56641 > 192.168.106.38.411: P 1:74(73) ack 1 win 365 (DF)
10:06:24.023716 192.168.106.38.411 > 192.168.238.121.56641: . ack 74 win 46 (DF)
10:08:24.019083 192.168.106.38.411 > 192.168.238.121.56641: FP 2921:4273(1352) ack 74 win 46 (DF)
10:08:24.034793 192.168.238.121.56641 > 192.168.106.38.411: . ack 1 win 365 (DF)

Part of the TCP exchange disappears and I don't know why.
Do you have any ideas?

-- 
Best regards,
Loïc BLOT,
UNIX systems, security and network expert
http://www.unix-experience.fr



Re: Strange packets lost

2013-09-25 Thread Mike Belopuhov
On 25 September 2013 11:03, Loïc BLOT <loic.b...@unix-experience.fr> wrote:
> Part of the TCP exchange disappears and I don't know why.
> Do you have any ideas?


this looks like a classical mtu problem.  gre tunnel lowers the mtu
and your tcp traffic uses mss of 1460 bytes and sets DF.  therefore
it gets dropped once the router figures out it can't send that much
data over the gre link.
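
The arithmetic behind this can be sketched roughly as follows. The header sizes are assumptions that the thread does not state: a 1500-byte underlying link, plain 20-byte IPv4 and TCP headers without options, and a basic 4-byte GRE header with no key or checksum. The function names are made up for illustration.

```python
# Assumed sizes (not stated in the thread): Ethernet-style 1500-byte link,
# IPv4 and TCP headers without options, basic GRE header without key/checksum.
LINK_MTU = 1500
OUTER_IP_HEADER = 20
GRE_HEADER = 4
INNER_IP_HEADER = 20
TCP_HEADER = 20


def tunnel_mtu(link_mtu=LINK_MTU):
    """Largest inner IP packet that fits in one GRE-encapsulated packet."""
    return link_mtu - OUTER_IP_HEADER - GRE_HEADER


def max_mss(link_mtu=LINK_MTU):
    """Largest TCP payload that crosses the tunnel without fragmentation."""
    return tunnel_mtu(link_mtu) - INNER_IP_HEADER - TCP_HEADER


def fits_in_tunnel(payload_len, link_mtu=LINK_MTU):
    """True if a TCP segment with this payload fits in the tunnel MTU."""
    return payload_len + INNER_IP_HEADER + TCP_HEADER <= tunnel_mtu(link_mtu)


if __name__ == "__main__":
    print("tunnel mtu:", tunnel_mtu())
    print("largest safe mss:", max_mss())
    # Payload sizes taken from the captures in the original message.
    for payload in (73, 1460, 1352):
        verdict = "fits" if fits_in_tunnel(payload) else "too big (dropped if DF is set)"
        print(payload, verdict)
```

Under these assumptions the tunnel MTU is 1476 and the largest safe mss is 1436. That matches the captures: the 73-byte and 1352-byte segments show up on the gre interface, while the 1460-byte segments do not.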

possible solutions are using path mtu discovery on clients or making
sure their mtu is less than 1500 or doing forced fragmentation and
defragmentation on the router or configuring the application to use
smaller mss value (setsockopt TCP_MAXSEG).
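
For the last option, a minimal sketch of what the setsockopt call looks like from an application, assuming you control the source of one end of the connection (which may not be the case for a phone or a gatekeeper appliance). The value 1436 is a hypothetical figure derived from an assumed 1476-byte tunnel mtu, and `open_clamped_socket` is an illustrative name, not part of any library.

```python
import socket

# Hypothetical value: 1476 (assumed GRE tunnel MTU) - 20 (IP) - 20 (TCP).
ASSUMED_SAFE_MSS = 1436


def open_clamped_socket(mss=ASSUMED_SAFE_MSS):
    """Return an unconnected TCP socket that asks the kernel to cap its MSS."""
    if not 0 < mss <= 65495:
        raise ValueError("mss out of range")
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Set before connect()/listen() so the cap can apply from the handshake on.
        # The kernel may still reject values outside its own accepted range.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG, mss)
    except OSError:
        sock.close()
        raise
    return sock


if __name__ == "__main__":
    s = open_clamped_socket()
    # What getsockopt reports before the socket is connected is platform-dependent.
    print("TCP_MAXSEG reports:", s.getsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG))
    s.close()
```

How the option affects the mss advertised in the SYN and the size of outgoing segments differs between operating systems, so check with tcpdump after changing it.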



Re: Strange packets lost

2013-09-25 Thread Loïc BLOT
Hello,
you are totally right! I hadn't thought about layer 2 problems.
But the problem is only partially resolved; I see strange behaviour with DF.
Port 80 is no-df but port 411 is not (Avaya config).

Here is a fragment of my pf config:

set skip on lo

set block-policy drop
set limit { states 10, src-nodes 8, table-entries 60 }

match in scrub (no-df)

block in log all
pass out all

...

pass in quick inet from <toip_area_v4> to <toip_area_v4> scrub (no-df) no state


Is something wrong?

-- 
Best regards,
Loïc BLOT,
UNIX systems, security and network expert
http://www.unix-experience.fr



On Wednesday 25 September 2013 at 14:23 +0200, Mike Belopuhov wrote:
> this looks like a classical mtu problem.  gre tunnel lowers the mtu
> and your tcp traffic uses mss of 1460 bytes and sets DF.  therefore
> it gets dropped once the router figures out it can't send that much
> data over the gre link.
>
> possible solutions are using path mtu discovery on clients or making
> sure their mtu is less than 1500 or doing forced fragmentation and
> defragmentation on the router or configuring the application to use
> smaller mss value (setsockopt TCP_MAXSEG).