On 05/23/2015 06:53 PM, Joseph wrote:
On 05/23/15 18:08, Zhu Sha Zang wrote:
On 05/23/2015 05:24 PM, Joseph wrote:
I have a box in a remote location (8-core CPU) and it turns itself off
while compiling.
The box is connected to a UPS. Is it the power supply?
Maybe. I had a problem like that when running heavy simulations with
nvidia-cuda: the power supply's protection could not keep the power at a
safe level, and the system shut off.
But if the failure happens during compilation, it could be a heat
problem. Install lm_sensors and use something like: "watch -n 1
sensors".
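On a remote box that powers off, whatever "watch" was showing is lost with the session. A minimal sketch that appends timestamped readings to a file instead, so the last values before a shutdown can be read after the reboot. The function name log_sensors, its arguments and the log path are made up for illustration; only date, sync, sleep and sensors are real commands.

```shell
# log_sensors LOGFILE COUNT [INTERVAL]
# Hypothetical helper: append COUNT timestamped "sensors" readings to LOGFILE,
# one every INTERVAL seconds (default 1).
log_sensors() {
    log=$1
    count=$2
    interval=${3:-1}
    i=0
    while [ "$i" -lt "$count" ]; do
        # One timestamp line, then whatever "sensors" prints (errors included).
        { date '+%Y-%m-%d %H:%M:%S'; sensors 2>&1; } >> "$log"
        sync    # push the data to disk before the box can switch off
        i=$((i + 1))
        sleep "$interval"
    done
}
```

Something like "log_sensors /var/tmp/sensors.log 3600 1 &" could be started right before the compile, and "tail -n 40 /var/tmp/sensors.log" run once the box is back up.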
If the temperature stays at safe levels, maybe you have RAM
corruption. In that case, you'll need to use memtest86+ to check.
Good Luck
I tried to read lm-sensors again and the computer crashed with
these readings:
fan1: 0 RPM (min = 10 RPM) ALARM
fan2: 0 RPM (min = 0 RPM)
fan3: 0 RPM (min = 0 RPM)
fan5: 0 RPM (min = 0 RPM)
temp1: +47.0°C (low = +127.0°C, high = +127.0°C) sensor = thermistor
temp2: +106.0°C (low = +127.0°C, high = +70.0°C) sensor = thermal diode
temp3: +106.0°C (low = +127.0°C, high = +127.0°C) sensor = thermistor
cpu0_vid: +1.250 V
I suspect it is the power supply.
Hey, did you run "sensors-detect" and "/etc/init.d/lm_sensors" as root
before using "sensors"?
As was said, maybe you're using the wrong kernel modules.
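One rough way to judge whether the readings can be trusted is to compare each temperature with its own limits. In the output quoted earlier in this thread, temp2 has a low limit above its high limit, which may point to unset limits or a misdetected chip rather than a real 106°C. A sketch, assuming each "tempN:" line carries its value and its "(low = X, high = Y)" limits on the same line, as in this thread; check_limits is a hypothetical helper name, not part of lm_sensors.

```shell
# check_limits: read "sensors"-style text on stdin and print one line for
# every temperature whose limits or reading look inconsistent.
check_limits() {
    awk '
    /^temp[0-9]+:/ && /low = / && /high = / {
        name = $1
        sub(/:$/, "", name)
        cur = $2 + 0                    # "+47.0°C" converts to 47
        line = $0
        sub(/.*low = /, "", line)
        low = line + 0                  # number right after "low = "
        line = $0
        sub(/.*high = /, "", line)
        high = line + 0                 # number right after "high = "
        if (low > high)
            print name ": low limit " low " is above high limit " high
        if (cur > high)
            print name ": reading " cur " exceeds high limit " high
    }'
}
```

It can be used as "sensors | check_limits". Lines it prints are only a hint to re-run the detection and check the loaded modules, not proof either way.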
Regards