Hi,
No, I haven't. Personally, I think the idea of burning out my fans to fix a
poor design really sucks. I'm going to try the "Update BIOS/NIC" route first
and see how that does for me. If not, we've been setting these via racadm:
    if len(thermal_profile) and thermal_profile != 'maximum performance':
        set_value('system.thermalsettings.ThermalProfile', '1')
    if len(base_algorithm) and base_algorithm != 'maximum performance':
        set_value('system.thermalsettings.BaseAlgorithm', '1')
    if len(fan_speed_offset) and fan_speed_offset != 'low fan speed':
        set_value('system.thermalsettings.FanSpeedOffset', '0')
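
For anyone reading along, here is a minimal sketch of how a helper like the
set_value calls above could wrap racadm. The real helper is not shown in this
thread, so everything here is an assumption: that local racadm is on the PATH,
that the "racadm set <attribute> <value>" form is what gets run, and the names
build_racadm_set and runner, which are hypothetical.

```python
import subprocess


def build_racadm_set(attribute, value):
    """Build the argument list for `racadm set <attribute> <value>`.

    Assumption: local racadm is on PATH and accepts this form.
    """
    return ['racadm', 'set', attribute, str(value)]


def set_value(attribute, value, runner=subprocess.run):
    """Apply one setting through racadm.

    `runner` is a hypothetical hook so the call can be dry-run or tested
    without touching a real iDRAC. With the default runner, a non-zero
    racadm exit status raises subprocess.CalledProcessError.
    """
    return runner(build_racadm_set(attribute, value), check=True)
```

Passing a stub as runner (for example one that only prints the argument list)
gives a dry run before anything is changed on the box.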
So you're saying that, according to
https://en.community.dell.com/techcenter/extras/m/white_papers/20441060/download
I should set system.thermalsettings.FanSpeedOffset to 2, then if that doesn't
work to 1, and as a last resort (if I don't want to be able to hear anything
in the datacenter besides fans) to 3?
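
To make that order concrete, a small sketch of the escalation as I understand
it (2, then 1, then 3). The order is only what is being asked about here, not
something confirmed, and the names ESCALATION_ORDER and next_offset are
hypothetical.

```python
# Order in which to try FanSpeedOffset values, per the question above.
# Assumption: this order is unconfirmed; any other current value
# (for example '0') starts from the first entry.
ESCALATION_ORDER = ['2', '1', '3']


def next_offset(current):
    """Return the next FanSpeedOffset to try, or None at the last resort."""
    if current not in ESCALATION_ORDER:
        return ESCALATION_ORDER[0]
    idx = ESCALATION_ORDER.index(current)
    if idx + 1 < len(ESCALATION_ORDER):
        return ESCALATION_ORDER[idx + 1]
    return None
```

Starting from the current setting of '0', this would step through '2', '1',
and '3', then stop.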
Thanks,
Tuc
On Fri, Jan 20, 2017 at 4:18 PM, Russell Kackley <[email protected]> wrote:
> Hi Tuc,
>
> We had what sounds like a similar problem to yours. Fortunately, the
> change to the fan speed offset solved the problem for us. Here are the
> details:
>
> We have 3 PE R720 servers that were purchased in mid-2013. All of them
> have Intel X540/2P I350 rNDC 10 Gb/s NICs in them. I'm guessing that is
> similar to what you have in your R720XD. Two of the servers have worked
> flawlessly for the past three years. However, one of them, even from the
> early days of use, would intermittently report the following error:
>
> The system board NDC PG voltage is outside of range.
>
> That would cause a reboot event, which was obviously a serious problem.
> The server was under warranty and the technician tried replacing both the
> NIC and the motherboard. We still got the same error and reboot problem.
> Eventually, the issue got elevated to an L3 technician at Dell and they
> advised us to set the "Fan speed offset" to the "Low Fan Speed Offset"
> setting. We made that change to the problem server and its performance has
> been perfectly fine since then. I'm guessing that the fan speed change
> solved the NIC overheating problem.
>
> I'm sorry to hear that it doesn't seem to have solved your problem. I
> think that the iDRAC Settings-Thermal GUI offers the "High Fan Speed
> Offset", which runs the fans faster than the "Low Fan Speed Offset"
> setting. Did you try the "High Fan Speed Offset" setting to see if that
> corrects the problem?
>
> On Fri, Jan 20, 2017 at 6:47 AM, Tuc at Beach House <[email protected]>
> wrote:
>
>> Hi,
>>
>> We have an older R720XD (Ship date: September 07, 2012) that has X540-AT2
>> NICs (10G). The system seems to shut down the NICs for overheating.
>> Apparently, Dell told them just to change the "Thermal Profile" to Max, and
>> the "Fan Speed Offset" to low. That worked for a while, but now it's
>> happening again. The unit isn't under warranty, so I can't call Dell
>> anymore.
>>
>> Has anyone else found a way around this? It's a Hadoop node that just
>> "disappears" on us and causes problems.
>>
>> Thanks, Tuc
>>
>> _______________________________________________
>> Linux-PowerEdge mailing list
>> [email protected]
>> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
>>
>>
>
>
> --
> Russell Kackley
> Subaru Telescope
> Hilo, Hawaii
>
>