Re: FreeBSD -STABLE servers repeatedly crashing.

2005-07-18 Thread Matt Juszczak
For me, 5 days up time after switching from IPF to PF. Before the switch a 
couple of hours of uptime was the maximum. Seems like the crashes are caused 
by ipfilter.



Still same for me :)  Uptime almost 20 days now after switching to PF.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: FreeBSD -STABLE servers repeatedly crashing.

2005-07-18 Thread Matt Juszczak

I find these messages kind of weird. Are you saying your servers only run long 
periods of uptime with pf and *not* with ipf? I run a server and almost never 
put it down. IPF performs very well, including a lot of natting for my home 
network.


Correct.  IPF is unstable (most of the time) on our SMP-based 5.x 
boxes.  VERY unstable.  VERY VERY unstable.


-Matt


Re: FreeBSD -STABLE servers repeatedly crashing.

2005-07-12 Thread Matt Juszczak
Yes, there is absolutely no difference. Disabled HTT in the BIOS and in 
FreeBSD, the box still crashes.


Matt again :)

So far a 13 day up time after switching from IPF to PF.  If that's not the 
problem, I hope I find it soon considering this is a production server ... 
but it seems to be more stable.


*Knock On Wood*

-Matt


Re: Possible exploit in 5.4-STABLE

2005-07-01 Thread Matt Juszczak
What are the chances of a base 5.4-RELEASE system with PF and securelevel 
2 and updated packages being cracked and rooted?  Is this something that 
occurs every day?  Or is it difficult?



Two Options: which to choose?

2005-06-30 Thread Matt Juszczak

Hi all,

Removing IPF for 5.4-STABLE seems to have made the boxes stable.  I 
switched all the firewalls to PF and they haven't crashed since; it's been 
about 3 days now... (before they were crashing every 12 hours).


Here are my worries:

1)  If I were to put this machine into production, it could crash at any 
time for another reason... or maybe the switch to PF hasn't actually 
stabilized it, and it's just playing games with me.


2)  If it crashes again, I might lose some responsibilities at work due to 
trust and/or inabilities.


So here's the thing: 5.4-STABLE is a great OS, I run it on other 
single processor machines... but obviously it doesn't seem to like the 
three servers I have it set up on at work that keep crashing (or at least 
kept crashing until recently).


Therefore, part of me is thinking of switching back to either 4.11 or to 
OBSD 3.7.  Problem is, this switch wouldn't be temporary, it would have to 
be permanent.  I couldn't set things up now and then move them again a 
month from now.  4.11-STABLE is stable, but it would have to be our 
solution for at least one or two years...


So my debate is whether to choose FreeBSD 4.11-STABLE which I know is 
stable, but isn't actively developed and/or patched (except for security 
patches), or choose something like OBSD 3.7 which I know is stable and is 
also actively developed ... but then again, maybe OpenBSD will have 
trouble with these three servers too, knowing my luck.


What's everyone's opinion?  I've had replies to previous posts telling me 
to go back to 4.11 temporarily, or to at least get something stable 
while you work on something new, but ... I'd like to do this all in one 
shot.


PS: I don't mind TESTING stability ... as long as the box isn't crashing, 
I can hold off a few weeks for stability testing before I do a switch 
over.  I'm not looking to do a psycho load-it, install-it, configure-it, 
switch-it in 24 hours thing.


Regards,

Matt


Re: Two Options: which to choose?

2005-06-30 Thread Matt Juszczak

After changing to PF I did not notice a single crash for a month (production
servers with, sometimes, heavy load).

I would try FreeBSD with PF anyway. Works perfectly.


You say it didn't crash for a month, but then you say to try FreeBSD with 
PF because it works perfectly.  To me, a month of uptime isn't "perfectly". 
Can you elaborate?  Is your machine still crashing, even though it's taking 
a month instead of a few days like it did previously?


Thanks,

Matt


Re: Two Options: which to choose?

2005-06-30 Thread Matt Juszczak

Could you not use pfsync to mitigate the problem (at least
partially)? As for your original question, I think it's less
work to change your hardware to something you know works than
changing operating systems. Why not use single CPU machines
for this?


My boss refuses :-(


Re: On recent crashes

2005-06-29 Thread Matt Juszczak

If you experience panics on FreeBSD then you need to follow the advice
in the developers' handbook chapter on kernel debugging and obtain the
necessary information so that a developer can begin to investigate
this problem.


I've tried :-( It locks up before it can do a dump.


Re: FreeBSD -STABLE servers repeatedly crashing.

2005-06-29 Thread Matt Juszczak



On Wed, 29 Jun 2005, Kris Kennaway wrote:


On Tue, Jun 28, 2005 at 11:26:06AM -0400, Matt Juszczak wrote:


OK, when it crashes next and is sat at the db prompt, type tr and
press enter to get a trace.  Copy this down (or have a serial console to
capture the output).  Also, try typing call doadump() and see if that
succeeds in generating a crash dump.  How were you trying to generate
one before?

Gavin




I can't type anything.  The machine locks up.

See: http://paste.atopia.net/126

After CPUID: 1, the machine locks cold and nothing else is printed to
the screen.


Try two things:

1) adding 'options KDB_STOP_NMI' to your kernel config.

2) If you still can't get it to break to DDB, then compile up a
debugging kernel, run kgdb on it (as described in the developers'
handbook), and list *(0xblah) where that address is the value of the
instruction pointer in the trap message (e.g. 0xc6644eff in your paste
above).  That might at least be a start.

Kris



OK :) I'll try this next time it crashes.  I actually disabled ipf a few 
nights ago and it hasn't crashed since... knock on wood.
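
The kernel-side steps suggested above can be sketched as a small shell helper. This is only an illustration: the function names are made up for this sketch, and apart from KDB_STOP_NMI (named in the thread) the option lines are the usual FreeBSD 5.x debugging options, not something the posters confirmed. The address is the example instruction pointer quoted above.

```sh
#!/bin/sh
# Hypothetical helpers; they only print text and change nothing on the system.

# Print the debugging lines to append to a copy of the kernel config file.
debug_kernel_options() {
    cat <<'EOF'
makeoptions     DEBUG=-g        # build kernel.debug with symbols
options         KDB             # kernel debugger framework
options         DDB             # interactive debugger at the db> prompt
options         KDB_STOP_NMI    # from the thread: stop the other CPUs via NMI
EOF
}

# Print the kgdb command that maps a trap's instruction pointer to source.
# $1 is the address from the trap message, e.g. 0xc6644eff.
kgdb_list_command() {
    printf 'list *(%s)\n' "$1"
}

debug_kernel_options
kgdb_list_command 0xc6644eff
```

The printed list line would be typed at the prompt after starting kgdb on the debugging kernel, as described in the developers' handbook. At the db> prompt itself, tr and call doadump() remain the commands to try first.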



Re: FreeBSD -STABLE servers repeatedly crashing.

2005-06-28 Thread Matt Juszczak

Gleb Smirnoff wrote:


On Mon, Jun 27, 2005 at 01:01:09AM -0400, Matt Juszczak wrote:
M About three weeks ago, I upgraded my 5.3-RELEASE boxes to 5.4-RELEASE.  
M I also turned on procmail globally on our mail server.  Here is our 
M current FreeBSD server setup:
M 
M URANUS  -  primary ldap

M CALIBAN -  secondary ldap
M ORION -  primary mail
M 
M Orion was the first one to crash, about three weeks ago.  Orion is 
M constantly talking to uranus, because uranus is our primary ldap server 
M (we have a planet scheme), and caliban is our secondary ldap server.  I 
M ran an email flood test on orion to see if I could crash it again.  This 
M time, the high requests on Uranus caused Uranus to crash. With two 
M different servers on two different hardware setups crashing, I had to 
M start thinking of what could be causing the problem.
M 
M Memory tests on both servers came back OK.  Orion had some ECC errors 
M which it was able to fix.  I wasn't able to catch orion's first crash, 
M but I was able to catch uranus's first crash:
M 
M http://paste.atopia.net/126


Can you please build kernel with debugging and obtain a crashdump?


 




Ever since I set up the debug kernel the machine is now crashing every 12 
hours.  I think I have to switch to OpenBSD or 4.11 FreeBSD because this 
box can't keep crashing.  It refuses to do a crash dump.


-Matt


Re: FreeBSD -STABLE servers repeatedly crashing.

2005-06-28 Thread Matt Juszczak

Gavin Atkinson wrote:


On Tue, 2005-06-28 at 10:49 -0400, Matt Juszczak wrote:
 


Gleb Smirnoff wrote:

   


On Mon, Jun 27, 2005 at 01:01:09AM -0400, Matt Juszczak wrote:
M About three weeks ago, I upgraded my 5.3-RELEASE boxes to 5.4-RELEASE.  
M I also turned on procmail globally on our mail server.  Here is our 
M current FreeBSD server setup:
M 
M URANUS  -  primary ldap

M CALIBAN -  secondary ldap
M ORION -  primary mail
M 
M Orion was the first one to crash, about three weeks ago.  Orion is 
M constantly talking to uranus, because uranus is our primary ldap server 
M (we have a planet scheme), and caliban is our secondary ldap server.  I 
M ran an email flood test on orion to see if I could crash it again.  This 
M time, the high requests on Uranus caused Uranus to crash. With two 
M different servers on two different hardware setups crashing, I had to 
M start thinking of what could be causing the problem.
M 
M http://paste.atopia.net/126


Can you please build kernel with debugging and obtain a crashdump?
 

Ever since I set up the debug kernel the machine is now crashing every 12 
hours.  I think I have to switch to OpenBSD or 4.11 FreeBSD because this 
box can't keep crashing.  It refuses to do a crash dump.
   



OK, when it crashes next and is sat at the db prompt, type tr and
press enter to get a trace.  Copy this down (or have a serial console to
capture the output).  Also, try typing call doadump() and see if that
succeeds in generating a crash dump.  How were you trying to generate
one before?

Gavin
 



I can't type anything.  The machine locks up.

See: http://paste.atopia.net/126

After CPUID: 1, the machine locks cold and nothing else is printed to 
the screen.


-Matt
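
Since the keyboard is dead after the trap, the serial console mentioned above may be the only way to keep the panic text. A minimal sketch follows, assuming the first serial port is wired to a second machine; the helper name, the /dev/cuaa0 device and the 9600 baud rate are assumptions, not details from this thread.

```sh
# Line for /boot/loader.conf on the crashing box (also a valid sh assignment).
console="comconsole"    # send the kernel console to the first serial port

# Hypothetical helper for the capturing machine: print the command that
# would attach to the serial line. Device and speed are assumed defaults.
serial_capture_command() {
    printf 'cu -l %s -s %s\n' "${1:-/dev/cuaa0}" "${2:-9600}"
}

serial_capture_command
```

Running the printed command inside script(1) on the second machine would leave a transcript of the trap message, and of any tr output if the db> prompt is ever reached.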


Re: FreeBSD -STABLE servers repeatedly crashing.

2005-06-28 Thread Matt Juszczak




fsck -y      # or fsck and read every question, if you're paranoid
mount -f /   # remounts root read/write
mount /var
savecore /var/crash
exit

Gary


Gary:

After it crashes, it locks up and hangs, no keyboard response, etc.  
When I reboot, I go into single user mode and do:


fsck -p
mount -a -t ufs
savecore /var/crash /dev/da0s1b (which is my swap)

It says no dump available.  These instructions are from the handbook.

I just got sent a patch a little while ago which apparently will help 
the system not lock up.  I'm going to try it later today and see where 
it gets me.


Thanks,

Matt
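
A "no dump available" result can also mean that no dump device was registered before the panic, or that the swap partition is smaller than RAM. Neither is confirmed in this thread; the sketch below only shows the checks, using the swap device named above. The dump_fits helper and the sizes passed to it are hypothetical.

```sh
# /etc/rc.conf fragment (rc.conf is read by /bin/sh, so these are assignments).
dumpdev="/dev/da0s1b"   # swap partition from the message above
dumpdir="/var/crash"    # where savecore writes its files at boot

# Hypothetical check: on 5.x a full dump needs swap at least as large as RAM.
# $1 = physical memory in MB, $2 = swap partition size in MB.
dump_fits() {
    [ "$2" -ge "$1" ]
}

# Example with made-up sizes: 4096 MB of RAM against a 2048 MB swap partition.
if dump_fits 4096 2048; then
    echo "swap can hold a full dump"
else
    echo "swap too small for a full dump"
fi
```

With the device registered (dumpon -v /dev/da0s1b does this by hand on a running system), savecore -v /var/crash /dev/da0s1b after the next panic should at least report what it finds on the device.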


Re: On recent crashes

2005-06-28 Thread Matt Juszczak

Vivek Khera wrote:



On Jun 28, 2005, at 5:20 AM, Alex Povolotsky wrote:


Can anyone enlighten me, if recent crashes are on STABLE only or
5.4-RELEASE is affected as well?



I have three boxes running 5.4-RELEASE.  one is a mediumly-loaded web  
server, and two are very heavily loaded database servers.  none of  
them ever crash.




Other people I've seen complain seem to be running SMP.  Not sure if 
that has anything to do with it, but it's the only similarity I can pull 
out from any responses I've gotten.


-Matt



Re: On recent crashes

2005-06-28 Thread Matt Juszczak

Chris Phillips wrote:


Vivek Khera wrote:



On Jun 28, 2005, at 5:20 AM, Alex Povolotsky wrote:


Can anyone enlighten me, if recent crashes are on STABLE only or
5.4-RELEASE is affected as well?



I have three boxes running 5.4-RELEASE.  one is a mediumly-loaded 
web server, and two are very heavily loaded database servers.  none 
of them ever crash.





Matt Juszczak wrote:

Other people I've seen complain seem to be running SMP.  Not sure 
if that has anything to do with it, but it's the only similarity I can 
pull out from any responses I've gotten.




I have 5 modestly powered i386 boxes on 5.4-RELEASE and the only time 
I have had any complaints regarding system stability, is when running 
an SMP kernel AND Nagios (which is a known problem - I think it's with 
Nagios rather than FreeBSD).  Otherwise, I'm almost completely happy.




Nagios remotely or locally?  I have nagios remotely that PINGS these 
machines constantly for uptime/downtime checks, but nagios isn't 
actually running on them as a process...




Re: On recent crashes

2005-06-28 Thread Matt Juszczak




RE: FreeBSD -STABLE servers repeatedly crashing

2005-06-28 Thread Matt Juszczak

Please try out this patch to aid the above problem with hang instead of
dump:

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/i386/i386/trap.c.diff?r1=1.275&r2=1.276


This box is now crashing once every 12 hours.  I can't apply this patch 
:-(.  Does anyone have any suggestions on how I can work around this? 
Some have said it's an SMP problem, some have said it's a 4 GB RAM 
problem, and some have said it's an IPF problem.  If I disabled all three 
of those things, would that help this box be stable until the code could be 
fixed?


Thanks,
Matt


Re: FreeBSD -STABLE servers repeatedly crashing

2005-06-28 Thread Matt Juszczak

Matt,

Sadly the FreeBSD guys will need more info before a fix is possible. I would 
suggest you revert back to FreeBSD 5.3, if you can. Even if you get a patch 
you'd want to do a whole lot of regression testing before putting it in 
production as it might break something else.


Gary,

Do you know what the chances are that this problem I'm experiencing is SMP 
related?  I don't mind turning off SMP, and I guess I could for now to see 
if that runs stable.  Otherwise, I think we're going to switch to OpenBSD, 
because these crashes are occurring so frequently (twice a day)... and as 
far as the patch and regression testing, if someone sent me a patch right 
now I would put it on the server, because the server already crashes 
daily, so a faulty patch wouldn't change much :-(.


I appreciate your response.  I'm going to do a little more research today 
before I make my decision on a platform switch.


-Matt


Re: FreeBSD -STABLE servers repeatedly crashing

2005-06-28 Thread Matt Juszczak

Only way to find out is to try. You could build and install the non-SMP
kernel and reboot when you can, or let it boot the new kernel next time
the system(s) crash.

A lot of the issues seem to be SMP-related. I really loaded up a GENERIC
5.4 kernel and wasn't able to get it to panic. What do you have to lose
at this point?

I would suggest that before committing to OpenBSD you verify that all
the hardware/software you have/use is supported under OpenBSD:

http://www.daemonnews.org/200104/bsd_family.html
http://www.monkey.org/openbsd/archive/misc/0311/msg01803.html

As an example: I'm fairly sure OpenBSD has recently dropped (or will drop) 
support for the Adaptec aac driver as Theo is not happy with Adaptec's 
response to his queries for interface specs.


From what I've heard (YMMV), OpenBSD SMP support is not very optimal, possibly 
because it was implemented extremely conservatively. 
OpenBSD MySQL with two CPUs can be slower than with one:


http://software.newsforge.com/article.pl?sid=04/12/27/1243207&from=rss

Gary

ps. it is a case of: cost, speed, reliability - choose any two.




Agreed, Theo just yelled at me cause I was having this discussion on the 
OpenBSD misc mailing list, which is my fault :-/ ... a lot of people were 
responding though and I think it just got out of hand.


As much as OpenBSD seems nice, my FreeBSD experience is a lot better.  I'm 
going to switch to Uniprocessor and see if that makes us more stable. 
Hopefully it will.


-Matt
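
For reference, the switch to a uniprocessor kernel could look roughly like this. The helper below is hypothetical and only prints the standard build steps; GENERIC is used as the config name on the assumption that it carries no "options SMP" line, which should be checked against the actual config file.

```sh
# Hypothetical helper: print (do not run) the kernel build and install steps.
# $1 is the kernel config name; GENERIC is assumed to be a non-SMP config.
kernel_build_steps() {
    conf=${1:-GENERIC}
    printf 'cd /usr/src\n'
    printf 'make buildkernel KERNCONF=%s\n' "$conf"
    printf 'make installkernel KERNCONF=%s\n' "$conf"
}

kernel_build_steps GENERIC
```

A quicker test, if the tunable is available on the running release, is kern.smp.disabled=1 in /boot/loader.conf, which leaves the SMP kernel installed but boots with a single CPU.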


Re: FreeBSD -STABLE servers repeatedly crashing

2005-06-28 Thread Matt Juszczak


Hi,

I have something like 20 boxes (Dell Power Edge 370, Fujitsu-Siemens PRIMERGY 
200 and a couple of dual AMD64 Fujitsu-Siemens servers) running 5.4-STABLE. So 
far, the only machine that froze without dropping into KDB or producing any 
sort of vmcore was a Dell Power Edge 1600SC (dual Xeon 2.4GHz with 4GB). It 
had been running squid-2.5 linked against pthread; when I switched to oops, 
compiled on 5.2.1 and linked against libc_r, the machine stopped crashing 
(HTT disabled, IPFILTER also disabled, GENERIC configuration).

However, I decided to experiment and upgraded to 6.0-CURRENT, and so far I 
haven't experienced any problems, except one panic caused by linux.ko while 
running edonkeyclc for Linux (just an experiment to see if it would work on 
6.0-CURRENT).

I suppose there might be some SMP-related problems on 5.4. I don't know what 
you are using the problematic servers for, and I don't know if it is smart to 
use 6.0-CURRENT, but so far I have had a positive experience with it on the 
problematic server, and I would rather stay with FBSD than switch to NetBSD or 
OpenBSD.



With what you're saying, maybe my problem is that I use IPFILTER and maybe 
it isn't an SMP problem?  Should I switch to PF?


-Matt


Re: FreeBSD -STABLE servers repeatedly crashing

2005-06-28 Thread Matt Juszczak

Some people suggested so - pf is supposed to be faster than IPFILTER.
However, if you are experiencing machine freezes like I did on 5.4-STABLE,
I'm not sure this will help - if nothing else helps, try 6.0-CURRENT. I've 
also noticed that it runs much faster with all debugging enabled than 
regular 5.4-STABLE on the same hardware...


I don't think it's a good idea to run 6.0-CURRENT in production.

I'm moving the main mail server to PF, keeping SMP on.  It's also running 
5.4-STABLE as of today.  We'll see if any of this fixes anything.


Regards,

Matt
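
The boot-time side of a move from IPF to PF can be sketched as an /etc/rc.conf fragment. This is a minimal sketch, not the configuration actually used on these servers: the ruleset path is the assumed default and the logging line is optional.

```sh
# /etc/rc.conf fragment (rc.conf is read by /bin/sh, so these are assignments).
ipfilter_enable="NO"        # stop loading IPFilter rules at boot
ipnat_enable="NO"           # and IPFilter NAT, if it was in use
pf_enable="YES"             # load pf and its ruleset at boot
pf_rules="/etc/pf.conf"     # assumed ruleset location
pflog_enable="YES"          # optional: start pflogd for the pflog0 interface
```

The IPF rules themselves have to be rewritten in pf.conf syntax; pfctl -nf /etc/pf.conf parses the file without loading it, which makes a reasonable check before the reboot.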


Re: FreeBSD -STABLE servers repeatedly crashing

2005-06-28 Thread Matt Juszczak


Yes, SMP is enabled, as is implied by the kernel config tag.

(Very busy compilation, web and database server)



Are you using PF?


Re: FreeBSD -STABLE servers repeatedly crashing.

2005-06-27 Thread Matt Juszczak

Can you please build kernel with debugging and obtain a crashdump?



High activity on the box today caused us to be able to crash it again 
within 9 hours.  I configured all steps per the developers' handbook, but 
when I went to do savecore, it said no dumps.


It appears the machine is completely locked up when it does a kernel trap. 
The keyboard is non-responsive, and the machine hangs and doesn't reboot.


Any other suggestions would be greatly appreciated.  For now I am going to 
take the box out of SMP mode which will hopefully keep it stable until I 
can find some further instructions.


Regards,

Matt


FreeBSD -STABLE servers repeatedly crashing.

2005-06-26 Thread Matt Juszczak

Hello all,

About three weeks ago, I upgraded my 5.3-RELEASE boxes to 5.4-RELEASE.  
I also turned on procmail globally on our mail server.  Here is our 
current FreeBSD server setup:


URANUS  -  primary ldap
CALIBAN -  secondary ldap
ORION -  primary mail

Orion was the first one to crash, about three weeks ago.  Orion is 
constantly talking to uranus, because uranus is our primary ldap server 
(we have a planet scheme), and caliban is our secondary ldap server.  I 
ran an email flood test on orion to see if I could crash it again.  This 
time, the high requests on Uranus caused Uranus to crash. With two 
different servers on two different hardware setups crashing, I had to 
start thinking of what could be causing the problem.


Memory tests on both servers came back OK.  Orion had some ECC errors 
which it was able to fix.  I wasn't able to catch orion's first crash, 
but I was able to catch uranus's first crash:


http://paste.atopia.net/126

I have the other crashes written down in pencil at my work.  They all 
say mostly the same thing.  I assume Caliban would also experience this 
behavior, but because it does not receive much load at all (only does 
anything when uranus dies), I am not able to confirm this.


The only thing similar between the boxes is that all three have two 
processors in them, and are running SMP.  Orion had hyperthreading 
turned on but I disabled this in the bios, to no avail.


Someone with similar experiences running SMP advised me to upgrade to 
-STABLE as of last week.  For almost a week, Orion ran fine.  This 
evening, however, Orion once again crashed, its fourth time in three 
weeks.  Uranus has been stable for a few days but I am expecting it to 
crash again any day now (they usually take between 4-6 days).


So now I am stuck.  I have two -STABLE machines which continue to cause 
kernel traps.  Tomorrow, I am going to compile a debugging kernel on 
orion and try to let it crash again to see what kind of errors it 
reports, but I was wondering if anyone else is experiencing these problems.


Thanks in advance,

Matt Juszczak