Re: [pfSense] Network "freezes" on IBM x3550, Broadcom NICs

2012-07-13 Thread Chris Buechler
On Mon, Jul 9, 2012 at 8:09 AM, Paul Gear  wrote:
>
> I'm happy to consider running 2.1 in production.  Is there reason to
> believe that the Broadcom drivers are considerably improved in the 8.3
> kernel?
>

I haven't seen any issues with them. Granted, I haven't seen the
serious issues you have on 8.1 either; any issues I've seen on 8.1
have been resolved with the aforementioned tunables. 8.3 is definitely
worth a shot though, as there are many changes between the two.
___
List mailing list
List@lists.pfsense.org
http://lists.pfsense.org/mailman/listinfo/list


Re: [pfSense] Network "freezes" on IBM x3550, Broadcom NICs

2012-07-09 Thread Paul Gear
On 09/07/12 06:01, Chris Buechler wrote:
> ...
>> Can anyone comment on the quality of the Broadcom driver in post-8.1
>> releases?  Is there any way to run a more recent kernel in conjunction
>> with pfSense?
>>
> 
> Short of running 2.1, not easily. I do have some customer systems
> running 2.1 in production for several months because 8.1 didn't
> support some component (RAID card IIRC) that 8.3 does.

I'm happy to consider running 2.1 in production.  Is there reason to
believe that the Broadcom drivers are considerably improved in the 8.3
kernel?

Paul



Re: [pfSense] Network "freezes" on IBM x3550, Broadcom NICs

2012-07-08 Thread Chris Buechler
On Sat, Jul 7, 2012 at 3:26 AM, Paul Gear  wrote:
> On 07/07/12 14:33, Adam Van Ornum wrote:
>>>
>>> FreeBSD's driver apparently is much improved in later releases (remember,
>>> pfSense is based on 7.3, which is quite a few years old now), so it's just
>>> a matter of waiting until pfSense 2.1(???) comes out, based on FreeBSD 8,
>>> or 9, or whatever the next evolutionary step is.
>>
>> For what it's worth, pfSense 2.0.1 is based on FreeBSD 8.1, not 7.3, and
>> I believe that 2.1 is going to be on FreeBSD 8.3.
>
> Yes - my pfSense shows up as being FreeBSD 8.1-RELEASE-p6.
>
> Can anyone comment on the quality of the Broadcom driver in post-8.1
> releases?  Is there any way to run a more recent kernel in conjunction
> with pfSense?
>

Short of running 2.1, not easily. I do have some customer systems
running 2.1 in production for several months because 8.1 didn't
support some component (RAID card IIRC) that 8.3 does.


Re: [pfSense] Network "freezes" on IBM x3550, Broadcom NICs

2012-07-07 Thread Paul Gear
On 07/07/12 14:33, Adam Van Ornum wrote:
>>
>> FreeBSD's driver apparently is much improved in later releases (remember,
>> pfSense is based on 7.3, which is quite a few years old now), so it's just
>> a matter of waiting until pfSense 2.1(???) comes out, based on FreeBSD 8,
>> or 9, or whatever the next evolutionary step is.
> 
> For what it's worth, pfSense 2.0.1 is based on FreeBSD 8.1, not 7.3, and
> I believe that 2.1 is going to be on FreeBSD 8.3.

Yes - my pfSense shows up as being FreeBSD 8.1-RELEASE-p6.

Can anyone comment on the quality of the Broadcom driver in post-8.1
releases?  Is there any way to run a more recent kernel in conjunction
with pfSense?

Paul



Re: [pfSense] Network "freezes" on IBM x3550, Broadcom NICs

2012-07-06 Thread Adam Van Ornum

> 
> FreeBSD's driver apparently is much improved in later releases (remember, 
> pfSense is based on 7.3, which is quite a few years old now), so it's just 
> a matter of waiting until pfSense 2.1(???) comes out, based on FreeBSD 8, 
> or 9, or whatever the next evolutionary step is.

For what it's worth, pfSense 2.0.1 is based on FreeBSD 8.1, not 7.3, and I 
believe that 2.1 is going to be on FreeBSD 8.3. 


Re: [pfSense] Network "freezes" on IBM x3550, Broadcom NICs

2012-07-06 Thread Adam Thompson
> It's disappointing that FreeBSD's Broadcom drivers are so far behind
> VMware & Linux - is there anything we can do (as non-kernel hackers)
> to help get this very common chipset well-supported?  Would throwing
> some $$$ at a pfSense or FreeBSD developer be useful?

FreeBSD's driver apparently is much improved in later releases (remember, 
pfSense is based on 7.3, which is quite a few years old now), so it's just 
a matter of waiting until pfSense 2.1(???) comes out, based on FreeBSD 8, 
or 9, or whatever the next evolutionary step is.  I'm not sure it's 
feasible to back-port the newer drivers; there have been many changes 
between FreeBSD-7.3-RELEASE and FreeBSD-CURRENT.

FWIW, if you ran a modern Dell server under Red Hat Advanced Server 2.1, 
you'd have problems, too.

Feel free to support the pfSense project financially, however, as that will 
undoubtedly help things along in a general fashion.  (BTW: I have the same 
issues with Broadcom LOMs on a few servers - I now give up at the first 
sign of problems and install a dual-port Intel PCI/PCIe/PCI-X NIC, and I've 
not had any problems with those.)

-Adam Thompson
 athom...@athompso.net





Re: [pfSense] Network "freezes" on IBM x3550, Broadcom NICs

2012-07-06 Thread Paul Gear
On 29/06/12 11:07, Paul Gear wrote:
> ...
> Server hardware: IBM x3550, Xeon E5405 2 GHz, 2 GB RAM, 2 x 300 GB 10K
> RPM SAS HD in hardware RAID 1, 2 x Broadcom NetXtreme II BCM5708
> 1000Base-T (B2)
> ...
> Hope that all makes sense.  My gut/experience tells me this is a NIC
> driver bug/deficiency.  This hardware is 100% stable on Linux, but there
> really aren't any Linux distributions that will do what we want without
> some customisation, so the client would prefer to get pfSense working.
> Any suggestions on where to go next?

Just an update on this for anyone who cares (and anyone who can do
anything about the Broadcom driver): i reinstalled the same system on
ESXi 5 and ran the same version of pfSense as a VM with individual vNICs
for each VLAN, and it has been flawless.  Even pings to the LAN gateway
which were unreliable before are now solid.

It's disappointing that FreeBSD's Broadcom drivers are so far behind
VMware & Linux - is there anything we can do (as non-kernel hackers) to
help get this very common chipset well-supported?  Would throwing some
$$$ at a pfSense or FreeBSD developer be useful?

Thanks,
Paul



Re: [pfSense] Network "freezes" on IBM x3550, Broadcom NICs

2012-06-29 Thread Vick Khera
On Thu, Jun 28, 2012 at 9:07 PM, Paul Gear  wrote:

> Server hardware: IBM x3550, Xeon E5405 2 GHz, 2 GB RAM, 2 x 300 GB 10K
> RPM SAS HD in hardware RAID 1, 2 x Broadcom NetXtreme II BCM5708
> 1000Base-T (B2)
>

About two weeks ago I had to put a temporary, hacked-together server into
production as my primary firewall.  I used a spare Dell PE1750 (32-bit
Xeon processor) which had two Broadcom gig-e ports on board, and added a
cheap-o 100baseTX card to use as the WAN port.

This solution worked really well until the WAN was saturated at about
98Mbps.  At that point, one of the Broadcom NICs would lock up and get
reset on a watchdog timeout.  This conveniently caused failover to the
other pfSense box sync'd with it (which unfortunately could not handle the
load).  pfSense never auto-switched back -- I had to manually re-run one of
the rc scripts to reset everything.

After that, I splurged on an Intel gig-e NIC for the WAN, and everything
was stable again.  No more watchdogs on the bge NIC.

Both of these have since been replaced with a pair of Silicon Mechanics
R101 boxes with low-power-consumption Xeon CPUs.  These have been working
very nicely to push upwards of 170Mbps for sustained periods of a few hours
at a time.  CPU load < 8%, and sucking down very little power at the same
time.  They have 4x Intel NICs in them.


Re: [pfSense] Network "freezes" on IBM x3550, Broadcom NICs

2012-06-29 Thread jerome alet
Hi,

> 
> From: Adam Thompson 
>
> You're largely correct, pfSense has - sometimes - issues with Broadcom NICs.
> If you search the mailing list archives and the bug tracker you'll see a
> number of reports/complaints.
> Many of these issues have been fixed since the 1.x era, but there are still
> occasional compatibility issues.
> The NIC troubleshooting steps often resolve the issue (at least well enough
> for daily use), but not always.  IIRC, there are a couple modern Dell
> PowerEdge servers (R700, maybe?) that essentially can't be used with
> pfSense's NIC drivers at all.  It's possible your IBM is going to be another
> problematic platform until the project releases a FreeBSD-9-based version.

I can confirm that a brand new Dell R610 won't work with the stable release
because of a missing driver for the RAID controller. Development snapshots of
2.1 work with respect to the disk controller, but require some tweaks to
/boot/loader.conf.local to fix network issues with Broadcom NICs, as well as
4-port Intel NICs... Once you've put the fixes in, the network seems to work
fine and the machine doesn't behave erratically. We're still running tests and
making our setup more complex each day (more than 15 VLANs and 2 unrelated WAN
links for two sets of clients, all with CARP failover, squid and so on), but
we're confident it now works as expected with this hardware.

hth

-- 
Jerome Alet


Re: [pfSense] Network "freezes" on IBM x3550, Broadcom NICs

2012-06-29 Thread Adam Thompson
You're largely correct, pfSense has - sometimes - issues with Broadcom NICs.
If you search the mailing list archives and the bug tracker you'll see a number 
of reports/complaints.
Many of these issues have been fixed since the 1.x era, but there are still 
occasional compatibility issues.
The NIC troubleshooting steps often resolve the issue (at least well enough for 
daily use), but not always.  IIRC, there are a couple modern Dell PowerEdge 
servers (R700, maybe?) that essentially can't be used with pfSense's NIC 
drivers at all.  It's possible your IBM is going to be another problematic 
platform until the project releases a FreeBSD-9-based version.
I've only ever heard of these problems affecting LOMs (onboard ports) but that 
could be coincidence...
I know I've done two similar Dell servers where one works great and the other 
now has a dual-port Intel NIC card in it, as that works much more reliably.
Good luck,
-Adam


Re: [pfSense] Network "freezes" on IBM x3550, Broadcom NICs

2012-06-28 Thread David Burgess
On Thu, Jun 28, 2012 at 10:37 PM, Paul Gear  wrote:

> Would i be better off virtualising this system on VMware?  That way i
> could handle all the VLAN tagging in the hypervisor, and the NIC
> presented to the system would be an Intel E1000 instead of a Broadcom.
> The VMware ESXi 5 and Linux drivers for these NICs are rock-solid in my
> experience.

Might be worth a try.


> My netstat -m output is shown below.  It's not even close to the limits.
>  Keep in mind that this is just a test system.  There is no traffic
> going through it.  Except for my initial configuration on Tuesday,
> basically nothing has been done to the box except for ping & SNMP from
> our NMS.

Probably not the problem then, I would think.


> Are you saying that you regularly run out of MBUFs and are forced to
> reboot when you do?

Before setting them to 131072, yes. Since making the change, my
longest uptime is 72 days, 15:43 with netstat -m:

71214/2006/73220 mbufs in use (current/cache/total)
71107/1045/72152/131072 mbuf clusters in use (current/cache/total/max)
71107/701 mbuf+clusters out of packet secondary zone in use (current/cache)
0/90/90/65536 4k (page size) jumbo clusters in use (current/cache/total/max)
12/1324/1336/32768 9k jumbo clusters in use (current/cache/total/max)
0/0/0/16384 16k jumbo clusters in use (current/cache/total/max)
184411K/15369K/199780K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

I record the output of netstat -m daily, and even at 72 days uptime it
was still growing geometrically.

db
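
For anyone wanting to turn a daily netstat -m record like the one above into a
rough "days until the cluster pool tops out" figure, here is a minimal sketch.
It assumes growth is linear from zero at boot, which is my simplification and
not something measured on this box; if the growth really is geometric, the pool
would fill sooner than this suggests. The function names are hypothetical.

```python
def uptime_days(days: int, hours: int, minutes: int) -> float:
    """Convert an uptime such as '72 days, 15:43' into fractional days."""
    return days + hours / 24 + minutes / (24 * 60)


def days_until_exhaustion(total_clusters: int, max_clusters: int,
                          uptime_in_days: float) -> float:
    """Project the uptime at which total_clusters would reach max_clusters.

    Assumes a constant growth rate starting from zero at boot (a
    simplifying assumption, not a property of the mbuf allocator).
    """
    if total_clusters <= 0 or uptime_in_days <= 0:
        raise ValueError("need a positive cluster count and uptime")
    rate_per_day = total_clusters / uptime_in_days
    return max_clusters / rate_per_day


if __name__ == "__main__":
    # Figures taken from the netstat -m output above:
    # 72152 clusters in total, limit 131072, uptime 72 days, 15:43.
    projected = days_until_exhaustion(72152, 131072, uptime_days(72, 15, 43))
    print(f"projected exhaustion at about {projected:.0f} days of uptime")
```

With the figures above this lands at roughly 132 days, in line with the
"~130 days" mentioned earlier in the thread.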


Re: [pfSense] Network "freezes" on IBM x3550, Broadcom NICs

2012-06-28 Thread Paul Gear
On 29/06/12 14:26, David Burgess wrote:
> On Thu, Jun 28, 2012 at 10:11 PM, Paul Gear  wrote:
>>
>> What should be my next troubleshooting step?

Hi David,

> memtest? 

I ran a firmware/BIOS upgrade and followed that with a full (successful)
memtest before installing pfSense.

> Different NICs?

I don't really have any other suitable ones at my disposal.  I can
probably ask the client to buy one, but i have 5 of these servers and i
really would rather not buy extra NICs for all of them.

Would i be better off virtualising this system on VMware?  That way i
could handle all the VLAN tagging in the hypervisor, and the NIC
presented to the system would be an Intel E1000 instead of a Broadcom.
The VMware ESXi 5 and Linux drivers for these NICs are rock-solid in my
experience.

> Have you looked at your MBUF usage (netstat -m)?

My netstat -m output is shown below.  It's not even close to the limits.
Keep in mind that this is just a test system.  There is no traffic
going through it.  Except for my initial configuration on Tuesday,
basically nothing has been done to the box except for ping & SNMP from
our NMS.

> I get similar
> symptoms after running out of MBUFs, but if you followed the first
> step in the doc you linked then you should have plenty. I use
> "kern.ipc.nmbclusters="131072"" here and it takes me ~130 days to top
> out, but YMMV.

Are you saying that you regularly run out of MBUFs and are forced to
reboot when you do?

Regards,
Paul

- 8< -

$ netstat -m
16323/1730/18053 mbufs in use (current/cache/total)
16322/1476/17798/131072 mbuf clusters in use (current/cache/total/max)
16321/831 mbuf+clusters out of packet secondary zone in use (current/cache)
0/54/54/65536 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/32768 9k jumbo clusters in use (current/cache/total/max)
0/0/0/16384 16k jumbo clusters in use (current/cache/total/max)
40805K/4033K/44838K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines
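
To check "not even close to the limits" without reading the numbers by eye,
the cluster line can be pulled out of the output with a few lines of Python.
This is only a sketch: it assumes the line format shown above
(current/cache/total/max followed by "mbuf clusters in use"), and the function
names are hypothetical.

```python
import re

# Matches e.g. "16322/1476/17798/131072 mbuf clusters in use (...)".
CLUSTER_LINE = re.compile(
    r"^(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use", re.MULTILINE
)


def parse_mbuf_clusters(netstat_output: str) -> dict:
    """Return current/cache/total/max from the 'mbuf clusters in use' line."""
    match = CLUSTER_LINE.search(netstat_output)
    if match is None:
        raise ValueError("no 'mbuf clusters in use' line found")
    current, cache, total, maximum = (int(g) for g in match.groups())
    return {"current": current, "cache": cache, "total": total, "max": maximum}


def cluster_utilisation(stats: dict) -> float:
    """Fraction of the configured cluster limit currently allocated."""
    return stats["total"] / stats["max"]


# First lines of the output above, used as sample input.
SAMPLE = """16323/1730/18053 mbufs in use (current/cache/total)
16322/1476/17798/131072 mbuf clusters in use (current/cache/total/max)
16321/831 mbuf+clusters out of packet secondary zone in use (current/cache)
"""

if __name__ == "__main__":
    stats = parse_mbuf_clusters(SAMPLE)
    print(f"{stats['total']} of {stats['max']} clusters "
          f"({cluster_utilisation(stats):.1%})")
```

For the output above that works out to about 13.6% of the limit.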



Re: [pfSense] Network "freezes" on IBM x3550, Broadcom NICs

2012-06-28 Thread David Burgess
On Thu, Jun 28, 2012 at 10:11 PM, Paul Gear  wrote:
>
> What should be my next troubleshooting step?
>

memtest? Different NICs?

Have you looked at your MBUF usage (netstat -m)? I get similar
symptoms after running out of MBUFs, but if you followed the first
step in the doc you linked then you should have plenty. I use
"kern.ipc.nmbclusters="131072"" here and it takes me ~130 days to top
out, but YMMV.

db
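
As a sanity check before raising kern.ipc.nmbclusters, it can help to know how
much memory the cluster pool could claim if it ever filled completely. The
sketch below assumes the standard 2048-byte FreeBSD mbuf cluster; that size and
the function name are my assumptions, not something stated in this thread, and
the figure is an upper bound rather than what the box actually uses.

```python
MCLBYTES = 2048  # assumed standard mbuf cluster size in bytes


def cluster_pool_ceiling_mib(nmbclusters: int,
                             cluster_bytes: int = MCLBYTES) -> float:
    """Upper bound on memory used by standard mbuf clusters, in MiB."""
    return nmbclusters * cluster_bytes / (1024 * 1024)


if __name__ == "__main__":
    ceiling = cluster_pool_ceiling_mib(131072)
    print(f"131072 clusters can use at most {ceiling:.0f} MiB")
```

Under that assumption, 131072 clusters top out at 256 MiB, which is worth
keeping in mind on a machine with 2 GB of RAM.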


Re: [pfSense] Network "freezes" on IBM x3550, Broadcom NICs

2012-06-28 Thread Paul Gear
On 29/06/12 11:56, Chris Buechler wrote:
> On Thu, Jun 28, 2012 at 9:07 PM, Paul Gear  wrote:
>> ...
> In short, it sounds like you're on the right path with the NIC-related
> changes you've made, that's generally where issues may arise with the
> Broadcom cards.

Hi Chris,

As i mentioned in my follow-up post, none of the NIC-related changes has
had the desired effect.  What should be my next troubleshooting step?

Thanks,
Paul




Re: [pfSense] Network "freezes" on IBM x3550, Broadcom NICs

2012-06-28 Thread Paul Gear
On 29/06/12 11:07, Paul Gear wrote:
> ...
> Thanks to databeestje on the ##pfsense IRC channel, who pointed me to
> the wiki instructions for NIC troubleshooting [9].  I tried the first
> set of boot loader parameters last night.  The result of this was that
> ping & SNMP still stopped after about 4 hours, but HTTP was still OK
> this morning.  I've implemented the parameters in the "Packet loss with
> many (small) UDP packets" section this morning, and the system is still
> up, but we're only just getting up to the 5 hour mark, and one of the
> crashes was after about 12 hours.

Update: the workaround for UDP packet loss did not work, and the NIC has
hung again now after about 180 minutes of responsiveness.

Additional info that the guys in IRC said might be relevant: all of the
interfaces on the system (6 of them), both LAN & WAN, are tagged VLANs.

Paul



Re: [pfSense] Network "freezes" on IBM x3550, Broadcom NICs

2012-06-28 Thread Chris Buechler
On Thu, Jun 28, 2012 at 9:07 PM, Paul Gear  wrote:
> Hi all,
>
> I'm testing pfSense for a client, looking to put it into an existing
> production network some time in the next month or two.  (Some background
> at [1], if anyone cares...)  In terms of features and interface it is a
> win, but we're having massive problems with stability that seem to be
> related to the NIC driver for Broadcom bce cards.
>
> pfSense version: 2.0.1 amd64, one additional package installed: OpenOSPFd
>
> Server hardware: IBM x3550, Xeon E5405 2 GHz, 2 GB RAM, 2 x 300 GB 10K
> RPM SAS HD in hardware RAID 1, 2 x Broadcom NetXtreme II BCM5708
> 1000Base-T (B2)
>
> The basic symptom of the problem is that the box stops responding to
> ping and SNMP, and sometimes HTTP/S, after about 4 hours.  Some graphs
> from our NMS showing this can be found at [2] and [3].
>
> When this happens, the console is still fully operational, and i can log
> in and do normal shell stuff, including looking at the logs.  There's
> corruption at the end of each syslog file (see [4], [5], and [6]).

Not corruption, see:
http://doc.pfsense.org/index.php/Why_can't_I_view_view_log_files_with_cat/grep/etc%3F_(clog)


In short, it sounds like you're on the right path with the NIC-related
changes you've made, that's generally where issues may arise with the
Broadcom cards.


[pfSense] Network "freezes" on IBM x3550, Broadcom NICs

2012-06-28 Thread Paul Gear
Hi all,

I'm testing pfSense for a client, looking to put it into an existing
production network some time in the next month or two.  (Some background
at [1], if anyone cares...)  In terms of features and interface it is a
win, but we're having massive problems with stability that seem to be
related to the NIC driver for Broadcom bce cards.

pfSense version: 2.0.1 amd64, one additional package installed: OpenOSPFd

Server hardware: IBM x3550, Xeon E5405 2 GHz, 2 GB RAM, 2 x 300 GB 10K
RPM SAS HD in hardware RAID 1, 2 x Broadcom NetXtreme II BCM5708
1000Base-T (B2)

The basic symptom of the problem is that the box stops responding to
ping and SNMP, and sometimes HTTP/S, after about 4 hours.  Some graphs
from our NMS showing this can be found at [2] and [3].

When this happens, the console is still fully operational, and i can log
in and do normal shell stuff, including looking at the logs.  There's
corruption at the end of each syslog file (see [4], [5], and [6]).  The
firewall itself can ping out [7], but apinger thinks that the LAN
gateway is dead [5] even though our smokeping installation says that
it's fine [8].

Thanks to databeestje on the ##pfsense IRC channel, who pointed me to
the wiki instructions for NIC troubleshooting [9].  I tried the first
set of boot loader parameters last night.  The result of this was that
ping & SNMP still stopped after about 4 hours, but HTTP was still OK
this morning.  I've implemented the parameters in the "Packet loss with
many (small) UDP packets" section this morning, and the system is still
up, but we're only just getting up to the 5 hour mark, and one of the
crashes was after about 12 hours.

Hope that all makes sense.  My gut/experience tells me this is a NIC
driver bug/deficiency.  This hardware is 100% stable on Linux, but there
really aren't any Linux distributions that will do what we want without
some customisation, so the client would prefer to get pfSense working.
Any suggestions on where to go next?

Thanks in advance,
Paul

[1] http://libertysys.com.au/content/experimenting-with-pfsense
[2] https://picasaweb.google.com/113106441554518621156/StrangePfSenseNetworkHang#5759246924791982882
[3] https://picasaweb.google.com/113106441554518621156/StrangePfSenseNetworkHang#5759246877106452754
[4] https://picasaweb.google.com/113106441554518621156/StrangePfSenseNetworkHang#5758981314745894850
[5] https://picasaweb.google.com/113106441554518621156/StrangePfSenseNetworkHang#5758981689439502626
[6] https://picasaweb.google.com/113106441554518621156/StrangePfSenseNetworkHang#5758982141852764962
[7] https://picasaweb.google.com/113106441554518621156/StrangePfSenseNetworkHang#5758983473295303378
[8] https://picasaweb.google.com/113106441554518621156/StrangePfSenseNetworkHang#5758983828538086034
[9] http://doc.pfsense.org/index.php/Tuning_and_Troubleshooting_Network_Cards
