On Thu, 2 Jul 2020 at 01:34, Samuel Sieb <sam...@sieb.net> wrote:

> On 7/1/20 8:01 PM, Thomas Dineen wrote:
> >     I am developing an optimization application that uses the CPU quite
> > intensely
> > for long periods of time on a CentOS  6.9 machine. Some test runs can
> > run for hours or even days.
> > On a particular test the OS crashes to a black screen with the
> > message "North Bridge Disconnected"
> > printed in the upper left.
>
> This list is not for CentOS questions and why are you still on 6.9,
> which is quite old?
>
> >     The application is carefully designed to not simply consume too much
> > memory and run out, and observation
> > of the System Monitor confirms this.
> >
> >     What could be the cause? Is this a bug in my code, quite possible
> > given the fact that I am still testing and debugging?
> > Can a bug in a user application get into Kernel space and crash the
> > machine?
> > Or is it likely a hardware problem on the Motherboard?
>
> Unless you're running as root and poking at things you shouldn't,
> there's no way a user level program should be able to cause that.  It
> sounds most likely to be a hardware issue.  Maybe something's overheating?
>

Without any additional details, overheating is the prime suspect.  I would
open up the system to check for bulging capacitors and anything that might
affect cooling.

You don't mention the hardware details (Intel versus AMD) or whether the
system runs in a controlled environment (machine room).  The following Intel
document describes the cooling requirements for the North Bridge on one
Intel chipset:

https://www.intel.com/content/dam/doc/design-guide/e8500-chipset-north-bridge-external-memory-bridge-guide.pdf

There can be significant differences in how well cooling is engineered
between homebrew and high-end server class systems, and cooling deteriorates
over time as dust accumulates on cooling fins and fan blades and cables
block air pathways.  There are apps that monitor temperatures at a number of
sensors on a system board.
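
If you would rather log temperatures yourself during a long run, the Linux
kernel exposes sensor readings under /sys/class/hwmon as temp*_input files,
in millidegrees Celsius.  The following is only a rough sketch, not a tested
tool: the function name read_hwmon_temps is made up for illustration, it
assumes Python 3 (CentOS 6 ships an older Python by default), and it assumes
the right sensor driver is already loaded.  Older kernels put the files one
level down under hwmonN/device/, so the sketch looks in both places.

```python
from pathlib import Path


def read_hwmon_temps(base="/sys/class/hwmon"):
    """Return {(chip, label): degrees_C} for every readable temp*_input file."""
    temps = {}
    for hwmon in sorted(Path(base).glob("hwmon*")):
        # Newer kernels: hwmonN/temp1_input; older ones: hwmonN/device/temp1_input.
        sensors = sorted(
            list(hwmon.glob("temp*_input")) + list(hwmon.glob("device/temp*_input"))
        )
        for sensor in sensors:
            name_file = sensor.parent / "name"
            chip = name_file.read_text().strip() if name_file.exists() else hwmon.name
            label_file = sensor.with_name(sensor.name.replace("_input", "_label"))
            label = label_file.read_text().strip() if label_file.exists() else sensor.name
            try:
                # The kernel reports millidegrees Celsius.
                temps[(chip, label)] = int(sensor.read_text().strip()) / 1000.0
            except (OSError, ValueError):
                continue  # sensor file present but unreadable; skip it
    return temps


if __name__ == "__main__":
    for (chip, label), deg in read_hwmon_temps().items():
        print(f"{chip:12s} {label:20s} {deg:6.1f} C")
```

Running something like this from cron or in a loop while the optimization
job runs, and appending the output to a file, would show whether
temperatures climb before the crash.  The "sensors" command from the
lm_sensors package reports the same kind of readings without any scripting.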


-- 
George N. White III
_______________________________________________
users mailing list -- users@lists.fedoraproject.org
