Onno, I'm not sure how Dell allows you to configure the server at order
time, but Dell often has configuration limitations that are not
immediately obvious. As a concrete example, if you try to configure a
server with too much RAM or too many disks, you may receive a warning
that you have to upgrade the PSU to complete your order. I could imagine
that, as another possible case, Dell may allow you to install rear disks
but limit them to 7.2k rpm drives (or drives rated to run at higher
temperatures, draw less power, etc.). If you are installing disks
yourself, you may not be aware of some of these limitations, as they may
not be documented clearly or publicly.

Every disk manufacturer sets its own temperature thresholds, so while
one drive may support 60 C, another may top out at 50 C. This isn't to
say that either disk will be reliable if kept under those temperatures,
just that those are the manufacturer's recommended operating
temperatures and that they vary from model to model.

If you're having high failure rates on rear disks, and the only obvious
difference between front and rear disks is the operating temperature, I
think that's a strong indicator that temperature could be a factor.
Going forward, you might consider using SSDs (which often produce less
heat), lower-rpm disks (which also produce less heat), or disks rated for
higher temperature extremes to see if reliability improves.
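
To see how close a given drive runs to its own limit, something like the
following might help. It is only a rough sketch: temp_headroom is a name
I made up, and it assumes SAS/SCSI-style smartctl output that contains
both a "Current Drive Temperature" and a "Drive Trip Temperature" line.
SATA drives report temperature as SMART attributes instead and would
need different parsing.

```shell
# temp_headroom: read `smartctl -a` output on stdin and report how far the
# current temperature is below the drive's reported trip temperature.
# Assumption: SCSI/SAS-style output with lines such as
#   Current Drive Temperature:     58 C
#   Drive Trip Temperature:        60 C
temp_headroom() {
    awk '
        /Current Drive Temperature/ { cur  = $(NF-1) }   # value precedes "C"
        /Drive Trip Temperature/    { trip = $(NF-1) }
        END {
            if (cur == "" || trip == "") { print "unknown"; exit }
            print "current=" cur " trip=" trip " headroom=" (trip - cur)
        }'
}
```

It would be used along the lines of
"smartctl -a /dev/sdb -d megaraid,13 | temp_headroom". Note that the trip
temperature is the drive's own reported limit, not a reliability
guarantee.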
Onno Zweers wrote on 10/11/2019 2:43 AM:
Following up.
I checked two classes of servers:
R730xd - rear disk 58°C
R740xd2 - rear disk 33°C
That's a huge difference. The fan speeds were similar, between 10,300 and
11,160 rpm, so I don't think fan speed accounts for the difference in
temperature. Perhaps the airflow has been improved in the R740xd2. But there is
one significant difference: in the R740xd2 the rear disks are SSDs, whereas the
R730xd has spinning disks.
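
For anyone who wants to make the same fan comparison, here is a rough
sketch. It assumes the readings come from "ipmitool sdr type Fan" and
that each line ends in a pipe-separated field such as "10320 RPM"; the
exact column layout may differ per BMC, and fan_range is just a name I
picked.

```shell
# fan_range: read fan sensor lines on stdin and print the lowest and
# highest RPM reading. Assumed input format (pipe-separated, RPM last):
#   Fan1 RPM | 30h | ok | 7.1 | 10320 RPM
# Lines whose last field does not mention RPM are ignored.
fan_range() {
    awk -F'|' '
        $NF ~ /RPM/ {
            rpm = $NF + 0                       # numeric part of " 10320 RPM"
            if (n == 0 || rpm < min) min = rpm
            if (n == 0 || rpm > max) max = rpm
            n++
        }
        END {
            if (n) print "min=" min " max=" max " rpm"
            else   print "no fan readings"
        }'
}
```

Usage would be something like "ipmitool sdr type Fan | fan_range" on
each server class.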
Cheers,
Onno
On 10 Oct 2019, at 20:19, Onno Zweers <[email protected]> wrote:
Thanks everyone for the very useful answers. I had a quick look:
[root@shark5 ~]# for disk in $(smartctl --scan | grep -Eo 'megaraid,[0-9]+'); do
    echo -n "$disk - "
    smartctl -a /dev/sdb -d "$disk" | grep 'Current Drive Temperature'
done
megaraid,0 - Current Drive Temperature: 31 C
megaraid,1 - Current Drive Temperature: 32 C
megaraid,2 - Current Drive Temperature: 32 C
megaraid,3 - Current Drive Temperature: 31 C
megaraid,4 - Current Drive Temperature: 32 C
megaraid,5 - Current Drive Temperature: 30 C
megaraid,6 - Current Drive Temperature: 32 C
megaraid,7 - Current Drive Temperature: 32 C
megaraid,8 - Current Drive Temperature: 31 C
megaraid,9 - Current Drive Temperature: 32 C
megaraid,10 - Current Drive Temperature: 34 C
megaraid,11 - Current Drive Temperature: 32 C
megaraid,12 - Current Drive Temperature: 56 C
megaraid,13 - Current Drive Temperature: 58 C
megaraid,14 - Current Drive Temperature: 44 C
megaraid,15 - Current Drive Temperature: 45 C
megaraid,16 - Current Drive Temperature: 47 C
megaraid,17 - Current Drive Temperature: 51 C
58 degrees C seems very hot to me, and indeed disks 12 and 13 are in the back
of the machine. We have lots of these servers and we've noticed that these rear
disks fail rather often. The 2 disks in the rear have as many failures as the
12 disks in front. I guess the next step would be to check the speed at which
the fans are running.
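
To pick out the hot disks across many servers, the listing above could
be filtered with something like this. It is only a sketch: hot_disks is
a made-up name, the threshold is whatever you consider too hot, and it
assumes lines in exactly the "megaraid,N - Current Drive Temperature:
T C" form shown above.

```shell
# hot_disks THRESHOLD: read lines such as
#   megaraid,13 - Current Drive Temperature: 58 C
# on stdin and print "<disk> <temp>" for every disk at or above THRESHOLD
# degrees C. Blank or too-short lines are skipped.
hot_disks() {
    awk -v limit="$1" '
        NF >= 2 && $(NF-1) + 0 >= limit + 0 { print $1, $(NF-1) }'
}
```

For example, piping the loop's output through "hot_disks 50" would list
only the disks reporting 50 C or more.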
Cheers,
Onno
_______________________________________________
Linux-PowerEdge mailing list
[email protected]
https://lists.us.dell.com/mailman/listinfo/linux-poweredge