On Tue, 16 May 2006, [EMAIL PROTECTED] wrote:
Date: Tue, 16 May 2006 08:15:48 +0200
From: [EMAIL PROTECTED]
To: Jeff A. Earickson <[EMAIL PROTECTED]>
Cc: Toby Chappell <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
Subject: Re: Sun's support of IPF
I've run Ipfilter 3.x on Solaris 8 and S9 boxes for several years, and
now Ipfilter 4.x on S10 boxes. Systems occasionally panic; most often
in my opinion from bad hardware. However, Sun's first reaction when
they look at the traceback from the crash dump is to blame ipfilter;
ipfilter always shows up near the top of the traceback because it is
loaded into the kernel and active. So they point the finger there.
I've usually been able to convince them that it was bad hardware due
to other evidence, e.g. parity errors, lom output, etc. If ipfilter really
does cause a panic or hang, it is usually obvious. The system dies right
after ipfilter is loaded and there is discussion about the issue on the
ipfilter list. Sun need not get involved...
If the system dies with ipfilter in the stack trace, it's ipfilter
which is to blame; blaming it on bad hardware is ludicrous.
If it's in the stack, its code is directly or indirectly involved in the
panic. Even if it is "always active", the amount of time spent in
ipfilter code is but a small percentage of total system time.
Statistically speaking, random panics would then rarely happen with
ipfilter on the stack; they would happen in any of the other scores
of kernel threads.
I have 23 Sun systems, 13 S8 or S9 systems with ipfilter 3.4.31, and 10
boxes running Solaris 10 with pfil 2.1.10 and ipfilter 4.1.13. The
S8 and S9 systems are frozen at 3.4.31 (I've had problems with later
3.x versions). The S10 boxes were a mish-mash of various pfil/ipf 4.x
releases until last week, when I got everything at 2.1.10/4.1.13.
Less critical boxes were running newer releases of pfil/ipf while more
critical boxes were running older versions.
Like I said, there is generally other evidence of hardware malfunction.
I've just noticed that Sun engineers, at least in the Solaris 9 era,
have been quick to blame ipfilter. If I have 13 machines that have
been running version 3.4.31 for months and one of them suddenly starts
falling over, then I'm less inclined to blame ipfilter than Sun engineers
are. Other evidence usually sorts out the issue.
My track record with S10 and Ipfilter 4.x is pretty spotless so far.
I have one box where ipfilter seems to interfere with Sun Update Manager;
I've been investigating that. I've had a V1280 with obvious and serious
hardware problems (hopefully fixed) caused by a power spike. It was S9,
now running S10. S10 has not yet panicked on me for any reason, ipfilter
or otherwise.
But are you uncomfortable with your Sun hanging out there in the
breeze, waiting to be poked by every hacker on the planet? I sure
am. I need the protection of ipfilter more than I need management's
blessing. I can get away with this attitude due to the local politics
and the fact that IPfilter has been rock solid for many years. If
ipfilter-using machines fell over all the time I would scrap it.
Ipfilter has caused its share of panics on my systems but is generally
stable once you have a configuration which works.
While I understand that you require the protection of ipfilter,
what is it that you need from the bleeding edge version not
offered in Solaris 10?
My reasons are more cosmic than pragmatic. I don't need any of the new
features of the latest ipf, so I probably should run Sun's version. But
if nobody uses Darren's releases then he gets no feedback or practical
evaluation of his work. He then has no incentive to improve ipfilter.
I eagerly await the June 2006 release of S10, with ZFS. I have two machines
slated for installation of this release, an E220R test box running S9
now and the V1280. I may leave Sun's version of ipfilter in place
on one or both for comparison/testing. But I will continue to support
Darren's public-domain efforts by using his work.
Jeff Earickson
Colby College