Re: random hangs/reboots with Dell servers

2007-04-21 Thread Dimitris Zilaskos


Thanks to everyone for your replies.

A colleague has provided me with his handwritten notes of an older crash 
screen. It showed the following (I cannot guarantee it is accurate, since 
these are handwritten notes):


Fatal trap 12: page fault while in kernel mode
cpuid=0; apicid=00
fault virtual address=0xac
fault code=supervisor write,page not present
instruction pointer=0x20:0x
current process 79962
trap numbers : 12
panic: pagefault
cpuid=1
uptime=6d7423m55
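
As a general rule of thumb, and not a diagnosis of this particular crash, a 
very small fault address such as 0xac on a supervisor write often points to 
a NULL pointer being dereferenced at a structure offset. Below is a minimal 
Python sketch that pulls the fields out of the transcribed text and flags 
such an address. The helper names and the 4 KB first-page threshold are 
assumptions made for illustration, and the input is only as reliable as the 
notes themselves.

```python
# Text as transcribed in the hand notes; fields may be inaccurate or truncated.
PANIC_NOTES = """\
Fatal trap 12: page fault while in kernel mode
cpuid=0; apicid=00
fault virtual address=0xac
fault code=supervisor write,page not present
"""

PAGE_SIZE = 4096  # assumption: 4 KB pages, as on i386


def parse_panic(text):
    """Collect 'key=value' pairs from the panic text into a dict."""
    fields = {}
    for line in text.splitlines():
        # A single line may hold several pairs separated by ';'.
        for part in line.split(";"):
            if "=" in part:
                key, _, value = part.partition("=")
                fields[key.strip()] = value.strip()
    return fields


def looks_like_null_deref(fields):
    """Heuristic: a fault address inside the first page suggests NULL + offset."""
    addr = fields.get("fault virtual address")
    if addr is None:
        return False
    try:
        return int(addr, 16) < PAGE_SIZE
    except ValueError:
        # Unreadable or truncated address: make no claim.
        return False


fields = parse_panic(PANIC_NOTES)
print(fields["fault virtual address"], looks_like_null_deref(fields))
```

The check is deliberately conservative: a missing or unparsable address 
yields False instead of a guess.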


	I do not believe the problems are related to environment or 
electricity, since during the period the problems occurred we moved to a 
different data center, and in addition to the Dell systems there are 150 
more nodes from various vendors (HP mostly, but also IBM, Supermicro, Sun, 
and various assembled towers), and none has shown similar behaviour. We 
don't run FreeBSD on them, though. We have a Dell 2850 with Windows 2003 
that has been running rock solid for at least 1 year. And the 1750 that 
under FreeBSD 5 would sometimes crash even under no load has, with RHEL 4, 
been pushing 60 Mbps of FTP data 24/7 with ease for the last year without 
any problems.


	Disabling everything in the BIOS was one of our first moves, though 
we haven't disabled USB since sometimes we need to connect a keyboard. And 
no IPMI is running on a public interface :)


	Apart from all the nodes being SMP and Dell, I cannot think of 
anything else in common. Some are SCSI, some are SATA. All have a number 
of jails. Memory size is 2 GB (the 1750), the others have 4 GB.


	I have also asked Dell for some help. Although they told me FreeBSD 
is not certified by Dell, they will try to look into it.



--


Dimitris Zilaskos

Department of Physics @ Aristotle University of Thessaloniki , Greece
PGP key : http://tassadar.physics.auth.gr/~dzila/pgp_public_key.asc
  http://egnatia.ee.auth.gr/~dzila/pgp_public_key.asc
MD5sum  : de2bd8f73d545f0e4caf3096894ad83f  pgp_public_key.asc


___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


random hangs/reboots with Dell servers

2007-04-19 Thread Dimitris Zilaskos


Dear all,

I am trying to understand some long-standing issues we have with FreeBSD 
and Dell servers.


Over the last 3 years we have installed FreeBSD 5.x and 6.x, with the 
currently deployed version being 6.1, on a variety of Dell rack-mounted 
systems.


The Dell systems used so far are PowerEdge 1750, 2950 (both SCSI), and 
SC1425 (SATA). All of them are dual-CPU Xeon systems.


All these systems serve as mail/web servers, with 2 to 15 jails.

Installation has always proceeded normally without problems. However, 
after a few months of operation, all of these systems, purchased at 
different times during the last 3 years, begin rebooting randomly or 
freezing completely.


These reboots/freezes at first occur once every 6 months, then gradually 
move to once per month, and normally stabilize around once per week, but 
in the case of the 1750 system it once even happened twice in a day.


Load does not seem to matter, since even after shutting down all services 
on the servers, random reboots still occurred.


So far we have tried various tricks dug up from the archives, like 
disabling ACPI and HT, but nothing changed.


We have migrated some systems that had these issues to a RHEL-compatible 
OS, and they run rock solid under heavy load.


Right now I have enabled kernel crash dumps and I am waiting for the next 
crash. But I understand a lot of people use FreeBSD with Dell servers, and 
I would like to hear how to tackle the situation we are facing.
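
For reference, crash dumps on FreeBSD are normally controlled by the 
`dumpdev` and `dumpdir` variables in /etc/rc.conf, with savecore(8) saving 
the dump at the next boot. Below is a minimal Python sketch that checks 
rc.conf-style text for those two settings. The helper names are 
hypothetical, the sample values are illustrative and are not the settings 
used on these servers, and treating a missing `dumpdev` as "NO" is an 
assumption about the defaults.

```python
# Illustrative rc.conf content only; "AUTO" is accepted on releases that
# support it, otherwise a swap device path would be given instead.
SAMPLE_RC_CONF = '''\
# crash dump settings (example values)
dumpdev="AUTO"
dumpdir="/var/crash"
'''


def read_rc_conf(text):
    """Parse simple name="value" lines; later assignments override earlier ones."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        settings[name.strip()] = value.strip().strip('"\'')
    return settings


def dumps_enabled(settings):
    """Treat dumps as enabled when dumpdev is set to something other than NO."""
    value = settings.get("dumpdev", "NO")  # assumption: unset means disabled
    return value != "" and value.upper() != "NO"


settings = read_rc_conf(SAMPLE_RC_CONF)
print(dumps_enabled(settings), settings.get("dumpdir", "/var/crash"))
```

This parser only handles plain assignments; rc.conf is a shell fragment, so 
anything more elaborate would need a real shell to evaluate.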


Best regards,

--


Dimitris Zilaskos

Department of Physics @ Aristotle University of Thessaloniki , Greece
PGP key : http://tassadar.physics.auth.gr/~dzila/pgp_public_key.asc
  http://egnatia.ee.auth.gr/~dzila/pgp_public_key.asc
MD5sum  : de2bd8f73d545f0e4caf3096894ad83f  pgp_public_key.asc



Re: (da0:ahc0:0:0:0): Unexpected busfree in Data-in phase

2002-12-21 Thread Dimitris Zilaskos

 I started getting the following message about two days ago:

 (da0:ahc0:0:0:0): Unexpected busfree in Data-in phase

 What does it mean? Should I worry about it? Relevant part of dmesg on a

 Wherever I encountered it, that message meant either bad cabling or
termination, or insufficient power output from the PSU to support all
the hard disks in the system. Generally, it indicates a hardware problem.
Is the message the only symptom? Can you access the filesystem on
the disk normally?


 Kind regards,
--
=

Dimitris Zilaskos

Department of Physics @ Aristotle University of Thessaloniki , Greece
PGP key : http://tassadar.physics.auth.gr/~dzila/pgp_public_key.asc
  http://egnatia.ee.auth.gr/~dzila/pgp_public_key.asc
MD5sum  : 4f84f3f53cb046008b4abcb2a092d28d  pgp_public_key.asc
=


