The problem exists even if I use the system's "/usr/bin/false" and
"/usr/bin/true" commands.
The problem exists even when PF is disabled or the only rule is "pass in".

That being said, the script itself is a simple host lookup against the
IP addresses to ensure the DNS server is actually resolving. Again,
just using "/usr/bin/false" or "/usr/bin/true" produces the same drop
in throughput.
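
For illustration only, a check script of the kind described above might
look like the sketch below. It is not the actual script from this report.
PROBE_NAME, LOOKUP and check_dns are hypothetical names, and the exit-status
convention (positive means "up", zero means "down") is an assumption based
on my reading of relayd.conf(5); verify it against the man page for your
release.

```shell
#!/bin/sh
# Hypothetical relayd check script sketch; NOT the script from this report.
# relayd runs the script once per host and passes the host as argument 1.

# Assumption: relayd treats a positive exit status as "up" and zero as
# "down" (my reading of relayd.conf(5)); verify for your release.
UP=1
DOWN=0

# Hypothetical name the DNS servers are expected to resolve.
PROBE_NAME="${PROBE_NAME:-example.com}"
# Lookup command; overridable so the logic can be tried without a network.
LOOKUP="${LOOKUP:-host}"

check_dns() {
    # usage: check_dns <server-ip>
    # Ask the given server to resolve PROBE_NAME and map the result to
    # the exit status relayd is assumed to expect.
    if $LOOKUP "$PROBE_NAME" "$1" >/dev/null 2>&1; then
        return "$UP"
    fi
    return "$DOWN"
}

# Only act when relayd (or a user) supplied a host argument.
if [ "$#" -ge 1 ]; then
    check_dns "$1"
    exit $?
fi
```

Under that assumed convention, "/usr/bin/false" (exit status 1) would mark
every host as up and "/usr/bin/true" (exit status 0) would mark every host
as down, without doing any lookup at all.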

An example of the drop looks like this:
9913Mbps
9913Mbps
7253Mbps <--- script interval point
9913Mbps
9913Mbps
...etc...

When the script is an actual shell script rather than /usr/bin/false,
the throughput drop spans the three seconds surrounding the time the
script runs.
9913Mbps
9913Mbps
4321Mbps
7253Mbps <--- script interval point
5162Mbps
9913Mbps
9913Mbps
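
To put the samples above in relative terms, the drop against the 9913Mbps
baseline can be worked out with a small helper. This is just arithmetic on
the figures already shown; pct_drop is a hypothetical name.

```shell
#!/bin/sh
# Percentage drop of an observed rate relative to a baseline rate.
pct_drop() {
    # usage: pct_drop <baseline_mbps> <observed_mbps>
    awk -v base="$1" -v obs="$2" \
        'BEGIN { printf "%.1f\n", (base - obs) * 100 / base }'
}

pct_drop 9913 7253   # prints 26.8 (the /usr/bin/false case)
pct_drop 9913 4321   # prints 56.4 (worst sample with a real shell script)
```

So the "around 30%" figure corresponds to roughly 26.8% for the one-second
dip, while the worst sample with a real shell script is down by more than
half.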

# relayd.conf (somewhat sterilized):
table <dns-servers> { 192.168.1.1, 192.168.1.2 }
redirect dns-udp {
  listen on 192.168.100.1 udp port 53
  forward to <dns-servers> port 53 \
  check script "/usr/bin/false" \
  timeout 4000 \
  interval 15 \
  mode roundrobin
}
redirect dns-tcp {
  listen on 192.168.100.1 port 53
  forward to <dns-servers> port 53 \
  check script "/usr/bin/false" \
  timeout 4000 \
  interval 15 \
  mode roundrobin
}

On Mon, Jul 30, 2012 at 8:59 AM, Gregory Edigarov <ediga...@cupid.com> wrote:
> On 07/30/2012 03:25 PM, Bennett Samowich wrote:
>>
>> I've uncovered a troubling performance symptom that I believe is
>> related to relayd's "check script" functionality.
>>
>> The system is a Dell R710 with 12GB RAM and 10Gb interfaces.  The
>> problem is that when relayd is running with redirects that use the
>> check script functionality, performance of the interface drops around
>> 30% while the check script is running.
>>
>> I ran the tests in an offline configuration so no other traffic could
>> be a factor ( test1 <--> OpenBSD <--> test2 ).  Tests were performed
>> using the nuttcp tool and both servers ( test1 & test2 ) pull
>> line-rate 9.912Gbps when connected back-to-back.  When run through the
>> OpenBSD firewall, regardless of PF rules, the rate drops to 7.25Gbps
>> when the script runs.
>>
>> At first I thought it was my script but I replaced my script with
>> 'true', 'false' and the problem still remained.  I've validated that
>> this exists in versions 4.8 through 5.1.   I've also tried looking at
>> the relayd code but it seemed like a reasonable exec call.  I can't
>> seem to understand why a running script would cause a network
>> performance drop.  I would also bet that this is only noticeable over
>> 10Gb interfaces.  Nevertheless, with check script running every 15
>> seconds we've succumbed to an overall drop in network performance.
>
> Sorry, you have not given full information. What's in your script? What's in
> your relayd.conf?
> What are your pf rules? dmesg is also welcome.
>
>> Any insight or direction would be greatly appreciated.
>>
>> Bennett
>>
> --
