Re: Routing issues

2014-02-17 Thread Alex Mathiasen
Thank you! This solved my problem.

The limit was reached several times within a few seconds.

Give this man a medal. 

Best regards
 
Alex Mathiasen

-Original message-
From: owner-t...@openbsd.org [mailto:owner-t...@openbsd.org] On behalf of Philipp
Sent: 16 February 2014 19:19
To: 'tech@openbsd.org'
Subject: Re: Routing issues

On 16.02.2014 14:08, Stuart Henderson wrote:
 Some ideas:
Check that the pf state table (full or src-con) is not overflowing.
Lately I had 'no route' errors where it was just sporadically peaking over
the limit of 10,000 states. It drove me crazy.

pfctl -sm ; pfctl -si -vv
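
To spot an overflow, it helps to take two snapshots of the counters a few seconds apart and see which ones grew. A minimal Python sketch of that comparison follows. It assumes each counter is printed as an indented "name total rate/s" line; the sample snapshots are made up for illustration and are not real output from this thread, and the function names are hypothetical.

```python
import re

# Matches lines such as "  memory      37      0.1/s".
# The exact layout is an assumption, not taken from the thread.
COUNTER_RE = re.compile(r"^\s+(\S+)\s+(\d+)\s+[\d.]+/s\s*$")


def parse_counters(text):
    """Return {counter_name: total} for every single-word counter line."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER_RE.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def grown(before, after):
    """Sorted names of counters whose total increased between snapshots."""
    return sorted(name for name, total in after.items()
                  if total > before.get(name, 0))


# Hypothetical snapshots, invented for this example.
SAMPLE_BEFORE = """Counters
  match                          1000            1.0/s
  memory                            0            0.0/s
  state-limit                       0            0.0/s
"""
SAMPLE_AFTER = """Counters
  match                          1500            1.2/s
  memory                           37            0.1/s
  state-limit                       0            0.0/s
"""

if __name__ == "__main__":
    print(grown(parse_counters(SAMPLE_BEFORE), parse_counters(SAMPLE_AFTER)))
```

In real use the two texts would come from running the status command twice; a growing "memory" counter is the one to watch for in the situation described here.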




Re: Routing issues

2014-02-17 Thread Philipp

On 17.02.2014 09:22, Alex Mathiasen wrote:

Thank you! This solved my problem.


Cheers. I found that out the hard way the other day.

There should really be some dmesg when state-tables overflow.
This silent dropping is wasting time in debugging such situations.

Sorry for talk instead of diff :-}



Re: Routing issues

2014-02-17 Thread Stuart Henderson
On 2014/02/17 09:35, Philipp wrote:
 On 17.02.2014 09:22, Alex Mathiasen wrote:
 Thank you! This solved my problem.
 
 Cheers. I found that out the hard way the other day.
 
 There should really be some dmesg when state-tables overflow.
 This silent dropping is wasting time in debugging such situations.
 
 Sorry for talk instead of diff :-}
 

Writing messages that show up in dmesg is not cheap, particularly
on systems with serial console.



Re: Routing issues

2014-02-17 Thread Stuart Henderson
On 2014/02/17 08:22, Alex Mathiasen wrote:
 Thank you! This solved my problem.
 
 The limit was reached several times within a few seconds.
 
 Give this man a medal. 
 
 Best regards
  
 Alex Mathiasen

Sounds like you are keeping state for all connections through your bgp
router then - if you just have the one router this is ok, but if you
have more ways for the packets to enter your network such that this
router doesn't see every packet associated with a connection, you may
need to take extra steps to cope with this.



Re: Routing issues

2014-02-17 Thread Philipp

On 17.02.2014 12:22, Stuart Henderson wrote:


Writing messages that show up in dmesg is not cheap, particularly
on systems with serial console.


Well, ok. How about pflog?



Re: Routing issues

2014-02-17 Thread Henning Brauer
* Philipp e1c1bac6253dc54a1e89ddc046585...@posteo.net [2014-02-17 13:04]:
 On 17.02.2014 12:22, Stuart Henderson wrote:
 Writing messages that show up in dmesg is not cheap, particularly
 on systems with serial console.
 Well, ok. How about pflog?

you made my day.

forgot the pflog format?

how do you emit such a message in pcap? as payload with a dummy
packet header? (N!!)

-- 
Henning Brauer, h...@bsws.de, henn...@openbsd.org
BS Web Services GmbH, http://bsws.de, Full-Service ISP
Secure Hosting, Mail and DNS Services. Dedicated Servers, Root to Fully Managed
Henning Brauer Consulting, http://henningbrauer.com/



Re: Routing issues

2014-02-17 Thread Philipp

On 17.02.2014 13:11, Henning Brauer wrote:

how do you emit such a message in pcap? as payload with a dummy
packet header? (N!!)


pf is taking action without telling anyone - and that's not nice.

There *are* other log() entries in pf.c already, so I wonder how the initial
comment about 'slow via serial console' would qualify.

How about some 'blocked because of resource exhaustion' reason for pflog_packet?

Just sayin...



Re: Routing issues

2014-02-17 Thread Mark Kettenis
 Date: Mon, 17 Feb 2014 11:22:27 +
 From: Stuart Henderson st...@openbsd.org
 
 On 2014/02/17 09:35, Philipp wrote:
  On 17.02.2014 09:22, Alex Mathiasen wrote:
  Thank you! This solved my problem.
  
  Cheers. I found that out the hard way the other day.
  
  There should really be some dmesg when state-tables overflow.
  This silent dropping is wasting time in debugging such situations.
  
  Sorry for talk instead of diff :-}
  
 
 Writing messages that show up in dmesg is not cheap, particularly
 on systems with serial console.

You'd obviously have to rate limit these somehow.  See ratecheck(9).
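
A minimal userland sketch of the idea behind ratecheck(9), written in Python rather than kernel C: an event is allowed only if at least a minimum interval has passed since the last allowed event. The class and function names are hypothetical and this is only a model of the concept, not the kernel interface.

```python
import time


class RateCheck:
    """Allow an event at most once per min_interval seconds."""

    def __init__(self, min_interval, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock
        self.last = None  # time of the last permitted event

    def check(self):
        """Return True if the event may happen now, and remember the time."""
        now = self.clock()
        if self.last is None or now - self.last >= self.min_interval:
            self.last = now
            return True
        return False


def count_logged(event_times, min_interval):
    """Replay events at the given timestamps; count how many would be logged."""
    current = {"now": 0.0}
    limiter = RateCheck(min_interval, clock=lambda: current["now"])
    logged = 0
    for timestamp in event_times:
        current["now"] = timestamp
        if limiter.check():
            logged += 1
    return logged


if __name__ == "__main__":
    # A burst of overflow events, limited to one message per 5 seconds.
    print(count_logged([0.0, 0.1, 0.2, 5.0, 5.5, 10.0], 5.0))
```

With a limiter like this, a burst of state-table overflows would produce one message per interval instead of one per dropped packet, which addresses the cost concern on a serial console.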



Re: Routing issues

2014-02-17 Thread Stuart Henderson
On 2014/02/17 13:36, Philipp wrote:
 On 17.02.2014 13:11, Henning Brauer wrote:
 how do you emit such a message in pcap? as payload with a dummy
 packet header? (N!!)
 
 pf is taking action without telling anyone - and that's not nice.
 
 There *are* other log() entries in pf.c already so I wonder how the initial
 comment about 'slow via serial console' would qualify.

But it is telling people, via the counters.

The log entries which are at risk of being printed frequently are
hidden by default, i.e. put behind LOG_DEBUG or similar. It seems to
me that increasing the state-limit counter is just as useful as adding
a new LOG_DEBUG for this..

As for pflog, surely somebody who knows how to look at pflog also knows
to look at pfctl -si?



Re: Routing issues

2014-02-17 Thread Henning Brauer
* Philipp e1c1bac6253dc54a1e89ddc046585...@posteo.net [2014-02-17 13:36]:
 On 17.02.2014 13:11, Henning Brauer wrote:
 how do you emit such a message in pcap? as payload with a dummy
 packet header? (N!!)
 pf is taking action without telling anyone - and that's not nice.

doesn't change a thing wrt pflog. pflog doesn't carry strings.

 There *are* other log() entries in pf.c already so I wonder how the initial
 comment about 'slow via serial console' would qualify.

logging to the console is generally bad and only for really critical
stuff.

and look at those log()s again, most aren't going to produce anything
with default settings.

right now, the memory counter gets increased when hitting the limit,
that isn't optimal imho.

 How about some 'blocked because of resource exhaustion' reason for pflog_packet?

logging packets blocked thru sth else than a block rule is generally
worthwhile, but then has to be done everywhere and not just that one
place.

you know the answer... where's your diff?




Re: Routing issues

2014-02-17 Thread Stuart Henderson
On 2014/02/17 12:56, Stuart Henderson wrote:
 The log entries which are at risk of being printed frequently are
 hidden by default, i.e. put behind LOG_DEBUG or similar. It seems to
 me that increasing the state-limit counter is just as useful as adding
 a new LOG_DEBUG for this..

Hmm. Well, I was assuming from the name and the pfctl(8) description that
it should be "state-limit", but actually it seems that is just used for
max-src-states, and this case just falls under "memory", which is not
too descriptive.

I don't see a specific "do we exceed max-states" check, just a
"pool_get failed" when trying to get memory for a new state.
I wonder about adding a separate check to give better logging,
though this is code that needs to run *fast*...

The current use of PFRES_MAXSTATES, particularly with pfctl's textual
form "state-limit", is definitely a bit confusing.
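
To make the distinction concrete, here is a small Python model of what a separate check could look like: the configured limit is tested explicitly before allocating, so hitting it bumps its own counter instead of being reported as a generic allocation failure. The class, the counter names and the pool model are all hypothetical; this is not pf's code.

```python
class StateTable:
    """Toy model of a state table with an explicit limit check."""

    def __init__(self, max_states, pool_capacity):
        self.max_states = max_states        # administrative limit
        self.pool_capacity = pool_capacity  # stands in for available memory
        self.states = set()
        self.counters = {"state-limit": 0, "memory": 0}

    def insert(self, key):
        """Try to create a state; return True on success."""
        # Explicit limit check: cheap, one integer comparison.
        if len(self.states) >= self.max_states:
            self.counters["state-limit"] += 1
            return False
        # Separate failure path for a genuine allocation failure.
        if len(self.states) >= self.pool_capacity:
            self.counters["memory"] += 1
            return False
        self.states.add(key)
        return True


if __name__ == "__main__":
    table = StateTable(max_states=2, pool_capacity=10)
    for key in ("a", "b", "c"):
        table.insert(key)
    print(table.counters)
```

The point of the sketch is only that the two failure reasons end up in different counters, so an administrator can tell a configured limit from real memory pressure.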



Re: Routing issues

2014-02-17 Thread Henning Brauer
* Stuart Henderson st...@openbsd.org [2014-02-17 14:45]:
 Hmm. Well, I was assuming from the name and the pfctl(8) description that
 it should be "state-limit", but actually it seems that is just used for
 max-src-states, and this case just falls under "memory", which is not
 too descriptive.

indeed.

 I don't see a specific "do we exceed max-states" check, just a
 "pool_get failed" when trying to get memory for a new state.

yes, that's how it works. the limit is set as pool limit.
fairy tale: that comes from the oold days when kernel memory
management wasn't what it is today, but rather a pile of static poo.
back then, running a pool out of memory would panic the machine.

 I wonder about adding a separate check to give better logging,
 though this is code that needs to run *fast*...

a simple check at state creation time is ok.

 The current use of PFRES_MAXSTATES, particularly with pfctl's textual
 form "state-limit", is definitely a bit confusing.

yup.

the default of 10000 might be a bit small today as well. it's not like
a higher one would cost anything these days. 100k?




Re: Routing issues

2014-02-17 Thread Claudio Jeker
On Mon, Feb 17, 2014 at 03:21:53PM +0100, Henning Brauer wrote:
 * Stuart Henderson st...@openbsd.org [2014-02-17 14:45]:
  Hmm. Well, I was assuming from the name and pfctl(8) description that
  it should be state-limit, but actually it seems that is just used for
  max-src-states and this case just falls under memory which is not
  too descriptive.
 
 indeed.
 
  I don't see a specific do we exceed max-states check, just a
  pool_get failed when trying to get memory for a new state.
 
 yes, that's how it works. the limit is set as pool limit.
 fairy tale: that comes from the oold days when kernel memory
 management wasn't what it is today, but rather a pile of static poo.
 back then, running a pool out of memory would panic the machine.
 
  I wonder about adding a separate check to give better logging,
  though this is code that needs to run *fast*...
 
 a simple check at state creation time is ok.
 
  The current use of PFRES_MAXSTATES particularly with pfctl's textual
  form state-limit is definitely a bit confusing.
 
 yup.
 
 the default of 10000 might be a bit small today as well. it's not like
 a higher one would cost anything these days. 100k?
 

How much memory are 10'000 states using these days?
Would it be possible to auto tune these values somehow?

One issue I have seen is that because of adaptive timeouts you can end up
with failing connections without hitting the hard state limit.
I think those connections will not show up in the stats (I could be
wrong).
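
For reference, a small Python sketch of how adaptive timeouts scale, as I read pf.conf(5): above adaptive.start, every timeout is multiplied by (adaptive.end - states) / (adaptive.end - adaptive.start), and at adaptive.end it reaches zero. The function name is hypothetical, and the inputs (start 6000 and end 12000, i.e. 60% and 120% of a 10000-state limit) are assumptions for the example.

```python
def adaptive_timeout(base_timeout, num_states, adaptive_start, adaptive_end):
    """Scale a timeout linearly between adaptive_start and adaptive_end."""
    if num_states <= adaptive_start:
        return float(base_timeout)  # no scaling yet
    if num_states >= adaptive_end:
        return 0.0                  # timeouts become zero
    factor = (adaptive_end - num_states) / (adaptive_end - adaptive_start)
    return base_timeout * factor


if __name__ == "__main__":
    # Assumed inputs: a 24 h timeout with 9000 states in the table.
    print(adaptive_timeout(86400, 9000, 6000, 12000))
```

Under these assumptions a table at 9000 of 10000 states already halves every timeout, so idle connections can lose their state well before the hard limit is reached.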

-- 
:wq Claudio



Re: Routing issues

2014-02-17 Thread Henning Brauer
* Claudio Jeker cje...@diehard.n-r-g.com [2014-02-17 16:27]:
 How much memory are 10'000 states using these days?

bit over 4MB with one state key per state.
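
As rough arithmetic only, here is what that figure implies per state and for a 100k limit. The sketch treats "a bit over 4MB" as exactly 4 MiB, which is an assumption and therefore a lower bound; the function names are hypothetical.

```python
MIB = 1024 * 1024


def bytes_per_state(total_bytes, num_states):
    """Average memory per state."""
    return total_bytes / num_states


def table_size_mib(num_states, per_state_bytes):
    """Total memory in MiB for a given number of states."""
    return num_states * per_state_bytes / MIB


if __name__ == "__main__":
    per_state = bytes_per_state(4 * MIB, 10_000)
    print(round(per_state), "bytes per state (lower bound)")
    print(round(table_size_mib(100_000, per_state)), "MiB for 100k states")
```

So a 100k limit would need on the order of 40 MiB when full, about 1% of the 4 GB in the machine that started this thread.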

 Would it be possible to auto tune these values somehow?

I keep thinking about it - maybe sth based on physmem.

 One issue I have seen is that because of adaptive timeouts you can end up
 with failing connections without hitting the hard state limit.
 I think those connections will not show up in the stats (I could be
 wrong).

failing connections because of adaptive timeouts? HUH?




Re: Routing issues

2014-02-16 Thread Stuart Henderson
On 2014/02/14 13:03, Alex Mathiasen wrote:
 Hello,
 
 First of all: I hope I am posting this to the correct mailing list; if not,
 then I'm sorry!
 
 I am having big issues with my OpenBSD 5.4 (I also had these issues prior to
 upgrading to 5.4). The server is a completely new installation - I have tried
 this setup with 3 different servers from different manufacturers, and 4
 different network cards (HP 100 Mbit, HP 4x1 Gbit, Intel 2x1 Gbit, Trend Net
 1 Gbit). The server is loaded with 4 GB of RAM, and has plenty of resources
 available. Current load is 0.10. The kernel has not been modified or altered.
 
 The setup is as follows: BGPD configured, routing enabled. BGPD works
 fine, I get all the prefixes loaded, as seen below.
 
 # bgpctl show
 Neighbor                   AS    MsgRcvd    MsgSent  OutQ Up/Down  State/PrfRcvd
 TDC                      3292      82071         16     0 00:12:47 476299
 
 This is my sysctl.conf (kern.bufcachepercent and net.inet.ip.ifq.maxlen were
 added while trying to resolve this issue, without result.)
 
 net.inet.ip.forwarding=1        # 1=Permit forwarding (routing) of IPv4 packets
 kern.bufcachepercent=50

Remove this bufcachepercent=50 line, it is useless on a router and
potentially harmful.

 net.inet.ip.ifq.maxlen=512
 
 The issue is: I am having big difficulties with routing my packets both to
 internal hosts and external hosts. Periodically when tracing/pinging from my
 OpenBSD, it just can't route successfully. This also affects my ingoing and
 outgoing traffic, resulting in lost packets.
 
 This is an example of attempting to ping:
 # ping 8.8.8.8
 PING 8.8.8.8 (8.8.8.8): 56 data bytes
 ping: sendto: No route to host
 ping: wrote 8.8.8.8 64 chars, ret=-1
 ping: sendto: No route to host
 ping: wrote 8.8.8.8 64 chars, ret=-1
 ping: sendto: No route to host
 ping: wrote 8.8.8.8 64 chars, ret=-1
 ping: sendto: No route to host
 ping: wrote 8.8.8.8 64 chars, ret=-1
 ping: sendto: No route to host
 ping: wrote 8.8.8.8 64 chars, ret=-1
 64 bytes from 8.8.8.8: icmp_seq=5 ttl=51 time=23.881 ms
 64 bytes from 8.8.8.8: icmp_seq=6 ttl=51 time=22.117 ms
 
 Second attempt:
 # ping 8.8.8.8
 PING 8.8.8.8 (8.8.8.8): 56 data bytes
 ping: sendto: No route to host
 ping: wrote 8.8.8.8 64 chars, ret=-1
 ping: sendto: No route to host
 ping: wrote 8.8.8.8 64 chars, ret=-1
 ping: sendto: No route to host
 ping: wrote 8.8.8.8 64 chars, ret=-1
 64 bytes from 8.8.8.8: icmp_seq=3 ttl=51 time=22.276 ms
 64 bytes from 8.8.8.8: icmp_seq=4 ttl=51 time=22.315 ms
 
 Third attempt:
 # ping 8.8.8.8
 PING 8.8.8.8 (8.8.8.8): 56 data bytes
 64 bytes from 8.8.8.8: icmp_seq=0 ttl=51 time=22.356 ms
 64 bytes from 8.8.8.8: icmp_seq=1 ttl=51 time=22.309 ms
 
 And this just keeps going on, sometimes 100% successfully, sometimes with 2-xx
 packets lost before routing is successful.
 
 Trace-routes to internal hosts:
 
 # traceroute 212.70.x.x
 traceroute to 212.70.x.x (212.70.x.x), 64 hops max, 40 byte packets
 1  firewall (212.70.x.x5)  0.260 ms  0.224 ms  0.111 ms
 2  php (212.70.x.x)  0.496 ms  0.484 ms  0.352 ms
 
 Second attempt:
 # traceroute 212.70.x.x
 traceroute to 212.70.x.x (212.70.x.x), 64 hops max, 40 byte packets
 1  firewall (212.70.x.x5)  0.176 ms  0.223 ms  0.235 ms
 2  php (212.70.x.x)  0.483 ms  0.474 ms  0.363 ms
 
 Third attempt:
 # traceroute 212.70.x.x
 traceroute to 212.70.x.x (212.70.x.x), 64 hops max, 40 byte packets
 sendto: No route to host
 1 traceroute: wrote 212.70.x.x 40 chars, ret=-1
 *sendto: No route to host
 traceroute: wrote 212.70.x.x 40 chars, ret=-1
 *sendto: No route to host
 traceroute: wrote 212.70.x.x 40 chars, ret=-1
 *sendto: No route to host
 2 traceroute: wrote 212.70.x.x 40 chars, ret=-1
 *sendto: No route to host
 traceroute: wrote 212.70.x.x 40 chars, ret=-1
 *sendto: No route to host
 traceroute: wrote 212.70.x.x 40 chars, ret=-1
 *
 3  php (212.70.x.x)  0.416 ms  0.482 ms  0.474 ms
 
 Any suggestions on how I can further debug this issue, or possibly resolve
 this once and for all? I can grant access to the server as well, if anyone
 feels like debugging.
 
 Looking forward to some replies. Thank you in advance.
 
 Best regards, Alex Mathiasen

Some ideas:

- Check bgpd log entries in /var/log/daemon

- Check kernel log entries in dmesg

- Examine things both when it's working and when it's not working
and compare to spot differences, for example:

bgpctl show nexthop
route -n get <bgp peer's address>
route -n get <failing destination>
ifconfig -A



Routing issues

2014-02-14 Thread Alex Mathiasen
Hello,

First of all: I hope I am posting this to the correct mailing list; if not, then
I'm sorry!

I am having big issues with my OpenBSD 5.4 (I also had these issues prior to
upgrading to 5.4). The server is a completely new installation - I have tried
this setup with 3 different servers from different manufacturers, and 4
different network cards (HP 100 Mbit, HP 4x1 Gbit, Intel 2x1 Gbit, Trend Net
1 Gbit). The server is loaded with 4 GB of RAM, and has plenty of resources
available. Current load is 0.10. The kernel has not been modified or altered.

The setup is as follows: BGPD configured, routing enabled. BGPD works
fine, I get all the prefixes loaded, as seen below.

# bgpctl show
Neighbor                   AS    MsgRcvd    MsgSent  OutQ Up/Down  State/PrfRcvd
TDC                      3292      82071         16     0 00:12:47 476299

This is my sysctl.conf (kern.bufcachepercent and net.inet.ip.ifq.maxlen were
added while trying to resolve this issue, without result.)

net.inet.ip.forwarding=1        # 1=Permit forwarding (routing) of IPv4 packets
kern.bufcachepercent=50
net.inet.ip.ifq.maxlen=512

The issue is: I am having big difficulties with routing my packets both to
internal hosts and external hosts. Periodically when tracing/pinging from my
OpenBSD, it just can't route successfully. This also affects my ingoing and
outgoing traffic, resulting in lost packets.

This is an example of attempting to ping:
# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
ping: sendto: No route to host
ping: wrote 8.8.8.8 64 chars, ret=-1
ping: sendto: No route to host
ping: wrote 8.8.8.8 64 chars, ret=-1
ping: sendto: No route to host
ping: wrote 8.8.8.8 64 chars, ret=-1
ping: sendto: No route to host
ping: wrote 8.8.8.8 64 chars, ret=-1
ping: sendto: No route to host
ping: wrote 8.8.8.8 64 chars, ret=-1
64 bytes from 8.8.8.8: icmp_seq=5 ttl=51 time=23.881 ms
64 bytes from 8.8.8.8: icmp_seq=6 ttl=51 time=22.117 ms

Second attempt:
# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
ping: sendto: No route to host
ping: wrote 8.8.8.8 64 chars, ret=-1
ping: sendto: No route to host
ping: wrote 8.8.8.8 64 chars, ret=-1
ping: sendto: No route to host
ping: wrote 8.8.8.8 64 chars, ret=-1
64 bytes from 8.8.8.8: icmp_seq=3 ttl=51 time=22.276 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=51 time=22.315 ms

Third attempt:
# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: icmp_seq=0 ttl=51 time=22.356 ms
64 bytes from 8.8.8.8: icmp_seq=1 ttl=51 time=22.309 ms

And this just keeps going on, sometimes 100% successfully, sometimes with 2-xx
packets lost before routing is successful.

Trace-routes to internal hosts:

# traceroute 212.70.x.x
traceroute to 212.70.x.x (212.70.x.x), 64 hops max, 40 byte packets
1  firewall (212.70.x.x5)  0.260 ms  0.224 ms  0.111 ms
2  php (212.70.x.x)  0.496 ms  0.484 ms  0.352 ms

Second attempt:
# traceroute 212.70.x.x
traceroute to 212.70.x.x (212.70.x.x), 64 hops max, 40 byte packets
1  firewall (212.70.x.x5)  0.176 ms  0.223 ms  0.235 ms
2  php (212.70.x.x)  0.483 ms  0.474 ms  0.363 ms

Third attempt:
# traceroute 212.70.x.x
traceroute to 212.70.x.x (212.70.x.x), 64 hops max, 40 byte packets
sendto: No route to host
1 traceroute: wrote 212.70.x.x 40 chars, ret=-1
*sendto: No route to host
traceroute: wrote 212.70.x.x 40 chars, ret=-1
*sendto: No route to host
traceroute: wrote 212.70.x.x 40 chars, ret=-1
*sendto: No route to host
2 traceroute: wrote 212.70.x.x 40 chars, ret=-1
*sendto: No route to host
traceroute: wrote 212.70.x.x 40 chars, ret=-1
*sendto: No route to host
traceroute: wrote 212.70.x.x 40 chars, ret=-1
*
3  php (212.70.x.x)  0.416 ms  0.482 ms  0.474 ms

Any suggestions on how I can further debug this issue, or possibly resolve this
once and for all? I can grant access to the server as well, if anyone feels like
debugging.

Looking forward to some replies. Thank you in advance.

Best regards, Alex Mathiasen