On 08/09/2018 01:38 AM, Vieri Di Paola via Shorewall-users wrote:
> Hi,
> 
> I've encountered a weird issue.
> 
> I have 3 ISP links (WAN) connected to a shorewall gateway, each on their own 
> NIC.
> 
> After about 24 hours working with apparently no issues, I start to get 
> network issues on only one of the three.
> 
> A simple test from the shorewall gateway shows the following packet loss when 
> pinging from the NIC that's connected to the failing ISP:
> 
> # shorewall reset ; ping -n -I enp9s6 8.8.8.8 ; shorewall dump > /home/vieri/swdump
> Shorewall Counters Reset
> PING 8.8.8.8 (8.8.8.8) from 192.168.101.2 enp9s6: 56(84) bytes of data.
> 64 bytes from 8.8.8.8: icmp_seq=12 ttl=120 time=11.1 ms
> 64 bytes from 8.8.8.8: icmp_seq=13 ttl=120 time=11.1 ms
> 64 bytes from 8.8.8.8: icmp_seq=14 ttl=120 time=11.1 ms
> 64 bytes from 8.8.8.8: icmp_seq=15 ttl=120 time=10.9 ms
> 64 bytes from 8.8.8.8: icmp_seq=16 ttl=120 time=11.0 ms
> 64 bytes from 8.8.8.8: icmp_seq=17 ttl=120 time=11.0 ms
> 64 bytes from 8.8.8.8: icmp_seq=18 ttl=120 time=11.2 ms
> 64 bytes from 8.8.8.8: icmp_seq=19 ttl=120 time=11.1 ms
> 64 bytes from 8.8.8.8: icmp_seq=20 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=21 ttl=120 time=11.1 ms
> 64 bytes from 8.8.8.8: icmp_seq=22 ttl=120 time=11.1 ms
> 64 bytes from 8.8.8.8: icmp_seq=23 ttl=120 time=11.2 ms
> 64 bytes from 8.8.8.8: icmp_seq=24 ttl=120 time=11.2 ms
> 64 bytes from 8.8.8.8: icmp_seq=25 ttl=120 time=11.2 ms
> 64 bytes from 8.8.8.8: icmp_seq=26 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=27 ttl=120 time=11.2 ms
> 64 bytes from 8.8.8.8: icmp_seq=28 ttl=120 time=11.2 ms
> 64 bytes from 8.8.8.8: icmp_seq=29 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=30 ttl=120 time=11.4 ms
> 64 bytes from 8.8.8.8: icmp_seq=31 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=32 ttl=120 time=11.2 ms
> 64 bytes from 8.8.8.8: icmp_seq=33 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=34 ttl=120 time=11.5 ms
> 64 bytes from 8.8.8.8: icmp_seq=35 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=36 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=37 ttl=120 time=11.4 ms
> 64 bytes from 8.8.8.8: icmp_seq=38 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=39 ttl=120 time=11.3 ms
> 64 bytes from 8.8.8.8: icmp_seq=40 ttl=120 time=11.8 ms
> 64 bytes from 8.8.8.8: icmp_seq=41 ttl=120 time=11.6 ms
> 64 bytes from 8.8.8.8: icmp_seq=42 ttl=120 time=11.4 ms
> ^C
> --- 8.8.8.8 ping statistics ---
> 42 packets transmitted, 31 received, 26% packet loss, time 41698ms
> rtt min/avg/max/mdev = 10.981/11.303/11.890/0.212 ms
> 
> The same test on each of the other 2 ISP links is OK.
> 
> Hence, if ISP3 is the failing link and ISP1, ISP2 are OK, I try to move some 
> traffic from ISP3 to ISP2 like so in the mangle file: 
> 
> MARK(2):P       ${HMAN_EXTRA_CORP_NETWORKS}
> 
> (2: ISP2, 3: ISP3, 
> HMAN_EXTRA_CORP_NETWORKS="192.168.210.0/23,192.168.212.0/24")
> 
> Now, the same ping test from the NIC that's connected to ISP2 starts showing 
> the same packet loss stats while the test on the NIC connected to ISP3 has 0% 
> packet loss.
> 
> Wherever I move the traffic with this line in the mangle file, I get ICMP 
> packet loss, i.e., moving it back to MARK(3) (ISP3) shows packet loss again 
> only on that line.
> 
> The shorewall dump taken during the test above is here:
> 
> https://drive.google.com/open?id=1a6RlQhi2w_JJF9ZuFt6aI9G-JAQbFC9n
> 
> Finally, to top it all off, if I reboot the modem/router on the ISP3 link, 
> all's well again (no packet loss whatsoever, no matter which rule I use in 
> the mangle file). Until the next day...
> 
> So, how can I go about this to determine what's causing this issue? My 
> Internet Provider has already passed the buck and thinks that it's an issue 
> with my shorewall gateway...
> 
> Help appreciated.
> 

I don't see anything in the dump that explains this behavior. I do,
however, notice this conntrack table entry:

icmp     1 29 src=192.168.101.2 dst=8.8.8.8 type=8 code=0 id=3380
packets=42 bytes=3528 src=8.8.8.8 dst=192.168.101.2 type=0 code=0
id=3380 packets=31 bytes=2604 mark=3 use=1

'mark=3' indicates that the flow is using the correct interface (enp9s6).
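
The packet counters in that entry can also be compared with the ping
summary. The following is a minimal sketch, assuming the first packets=
field counts the original direction (echo-requests) and the second counts
the reply direction; the function name conntrack_loss is hypothetical and
only for illustration:

```shell
# The conntrack entry quoted above, joined onto one line.
entry='icmp     1 29 src=192.168.101.2 dst=8.8.8.8 type=8 code=0 id=3380 packets=42 bytes=3528 src=8.8.8.8 dst=192.168.101.2 type=0 code=0 id=3380 packets=31 bytes=2604 mark=3 use=1'

# Extract both packets= counters and print the implied loss percentage
# (truncated to a whole number).
conntrack_loss() {
    printf '%s\n' "$1" | awk '{
        n = 0
        for (i = 1; i <= NF; i++)
            if ($i ~ /^packets=/) { split($i, kv, "="); count[++n] = kv[2] }
        if (n < 2 || count[1] == 0) exit 1
        printf "sent=%d replied=%d loss=%d%%\n", count[1], count[2], (count[1] - count[2]) * 100 / count[1]
    }'
}

conntrack_loss "$entry"
```

The 42 and 31 agree with the "42 packets transmitted, 31 received" line in
the ping output. Note that these counters only show what netfilter saw on
the gateway, not what actually reached the wire.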

My suggestion for debugging this further is to use a packet sniffer to
see what is happening on the wire during the period of loss:

a) Are the echo-request packets being sent?
b) If not, is there unsuccessful ARPing occurring?
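
One possible way to check both points is with tcpdump, run as root in a
second terminal while the ping test is in progress. This is only a sketch:
the helper names sniff_filter and sniff are hypothetical, and the interface
and target are the ones from the test above.

```shell
# Build a capture filter: ICMP to/from the ping target, plus all ARP.
sniff_filter() {
    printf '(icmp and host %s) or arp' "$1"
}

# Capture on the given interface (needs root).
# -n: no name resolution, -e: print link-level (MAC) headers.
sniff() {
    tcpdump -n -e -i "$1" "$(sniff_filter "$2")"
}

# Usage, while "ping -n -I enp9s6 8.8.8.8" runs elsewhere:
#   sniff enp9s6 8.8.8.8
```

If echo-requests appear on the wire with no replies, the loss is upstream
of the gateway; if the requests are missing and ARP requests for the
next hop go unanswered, that points at the link to the modem/router.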

-Tom
-- 
Tom Eastep        \   Q: What do you get when you cross a mobster with
Shoreline,         \     an international standard?
Washington, USA     \ A: Someone who makes you an offer you can't
http://shorewall.org \   understand
                      \_______________________________________________
