Stroller writes:
> On 21 Aug 2010, at 14:25, Alex Schuster wrote:
> > ...
> > I want to monitor the power status of my hard drives, so I wrote a
> > little
> > script that gives me this output:
> >
> > sda: standby
> > sdb: standby
> > sdc: active/idle 32°C
> > sdd: active/idle 37°C
> >
> > This script is called every minute via an fcron entry, output goes
> > into a log file, and I use the file monitor plasmoid to watch this log
> > file in KDE.
> >
> > It's working fine, but I also monitor my syslog in another file
> > monitor plasmoid, and now I get lots of these entries:
> >
> > Aug 21 14:21:06 [fcron] pam_unix(fcron:session): session opened for
> > user root by (uid=0)
> > Aug 21 14:21:06 [fcron] Job /usr/local/sbin/hdstate >> /var/log/
> > hdstate started for user root (pid 24483)
> > Aug 21 14:21:08 [fcron] Job /usr/local/sbin/hdstate >> /var/log/
> > hdstate completed
> > Aug 21 14:21:08 [fcron] pam_unix(fcron:session): session closed for
> > user root
>
> #!/bin/bash
> while true
> do
> for drive in a b c d
> do
> /usr/sbin/smartctl /dev/sd$drive --whatever >> /var/log/hdstate
> done
> sleep 60
> done
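I use hdparm and hddtemp:

    for hd in sda sdb sdc sdd
    do
        # hdparm -C prints "... drive state is: <state>"; keep only the state
        str=$( /sbin/hdparm -C /dev/$hd )
        state=${str##*is: }
        # ask for the temperature only on sdc, and only when it is spinning
        if [[ $state == active/idle ]] && [[ $hd =~ sd[c] ]]
        then
            temp=$( /usr/sbin/hddtemp -q /dev/$hd )
            temp=${temp% or *}    # drop a trailing " or ..." part
            temp=${temp##* }      # keep the last word, the temperature
        else
            temp=
        fi
        echo "$hd: $state $temp"
    done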
Unfortunately, reading the temperature makes a drive in standby spin up,
and it prevents the automatic spindown after a period of idle time. So now
I ask for the temperature only on my system drive; the others should sleep
most of the time anyway.
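
In case someone wants to try the string handling without touching a real
drive, here is a rough sketch of the two parsing steps as functions. The
names parse_state and parse_temp are made up for this example, and the
sample lines in the usage note are only what I assume hdparm -C and
hddtemp -q print; the spacing after "is:" may differ between versions, so
the sketch trims leading whitespace instead of relying on it.

```shell
#!/bin/bash
# Sketch only: parse_state and parse_temp are hypothetical helper names.

# Extract the power state from the text printed by "hdparm -C".
parse_state() {
    local str=$1
    local state=${str##*is:}                  # drop everything up to the last "is:"
    state=${state#"${state%%[![:space:]]*}"}  # trim leading whitespace
    printf '%s\n' "$state"
}

# Extract the temperature from the text printed by "hddtemp -q".
parse_temp() {
    local temp=$1
    temp=${temp% or *}    # drop a trailing " or ..." alternative, if present
    temp=${temp##* }      # keep the last space-separated word
    printf '%s\n' "$temp"
}
```

Usage, with assumed sample input:

    parse_state " drive state is:  standby"
    parse_temp "/dev/sdc: EXAMPLE DISK: 32°C"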
> I would personally update more often than this, and my concern would
> be that if the process fails then your plasmoid isn't showing the
> correct data.
>
> I presume this is the same with your current setup: if cron dies then
> the current temperature will not be read to file, and the plasmoid
> will continue reading the last lines in /var/log/hdstate - the drive
> can overheat without you knowing about it.
Nah, it's really not that important for me. I show the temperature just
for the fun of it, and for extreme temperatures I have smartd running, see
below.
I'm more interested in the active/standby state. I just added two old
IDE drives for additional backups, and I want them to be silent most of
the time. So I wrote a little script that shows the status, so that I see
when they spin up again (and they do this sometimes), and used fcron to
get the data into a log file that the plasmoid shows.
The problem with cron is that I get those cron log entries I do not like,
and that the update interval of 60 seconds is a little long. Running the
script in a loop, started in .kde4/Autostart, would be better, but as a
user I have no permission to call hdparm or hddtemp. I do not want to be
part of the disk group, and with sudo I would get sudo log entries
instead, which I wanted to avoid. So now I SUID'ed hdparm and hddtemp,
changed the group to wheel and disabled execution for others. The cron
problem is not solved, but worked around.
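
The loop variant could look roughly like the sketch below. It assumes
the status script is still /usr/local/sbin/hdstate and can be run as a
normal user; the names HDSTATE_CMD, HDSTATE_LOG and HDSTATE_INTERVAL, the
log location and the 10 second default are made up for this example. Each
run is preceded by a timestamp, so a stale display is at least visible.

```shell
#!/bin/bash
# Sketch of a polling loop meant to be started from ~/.kde4/Autostart.
# HDSTATE_CMD, HDSTATE_LOG and HDSTATE_INTERVAL are hypothetical names.

HDSTATE_CMD=${HDSTATE_CMD:-/usr/local/sbin/hdstate}
HDSTATE_LOG=${HDSTATE_LOG:-$HOME/.hdstate.log}
HDSTATE_INTERVAL=${HDSTATE_INTERVAL:-10}

# Run the status command once and append its output to the log,
# preceded by a timestamp. A failing command leaves a visible line.
log_once() {
    {
        date '+%Y-%m-%d %H:%M:%S'
        "$HDSTATE_CMD" 2>&1 || echo "hdstate failed with exit code $?"
    } >> "$HDSTATE_LOG"
}

# Repeat forever, sleeping between runs.
poll_forever() {
    while true
    do
        log_once
        sleep "$HDSTATE_INTERVAL"
    done
}

# Only start looping when asked to, so the functions can be tried alone.
if [ "${1:-}" = "--loop" ]
then
    poll_forever
fi
```

Started with --loop it keeps running until it is killed; without
arguments it only defines the functions.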
> So I would expect there to be a better "plasmid" for this task. I'm
> completely unfamiliar with plasmids, but what you really want is a
> plasmid that itself runs a script and displays the stdout on your
> screen. That way if there's no data, or an error, then _you see that
> in the plasmid_, instead of silently ignoring it (as you may be at
> present).
>
> The easiest (but dumb) way to handle this is to add the date to your
> plasmid's display so that at least you can see that something's wrong
> if it doesn't match the clock. A better way is not to have to watch a
> status monitor at all, and just have a script running that emails you
> if the temperature is above a specified range.
I have smartd running, which should send me mails about such things. For
each drive, I have a line like this in /etc/smartd.conf:
/dev/sdc -a -n standby -o on -S on -W 5,40,45 \
-s (S/../.././12|L/../../7/06) -m [email protected]
This does some regular health checks on the drive when it is not in
standby mode. Temperature changes of at least 5 degrees and temperatures
of 40 degrees or more are logged, as is any new minimum or maximum. I will
receive an email when the temperature reaches 45 degrees. The minimum and
maximum values are preserved across boot cycles if smartd keeps a state
file (its -s command-line option); the -S on directive only enables
attribute autosave on the drive. Every day at 12:00 a short self-test is
scheduled, and a long self-test every Sunday at 06:00.
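
To check what such a schedule expression matches, something like the
following sketch may help. As far as I understand it, smartd compares
the expression with strings of the form T/MM/DD/d/HH, where d is the day
of the week (1 = Monday, 7 = Sunday). The function names are made up, and
I am assuming that the whole string has to match.

```shell
#!/bin/bash
# Sketch: test a smartd "-s" schedule expression against sample timestamps.
# schedule_matches and current_candidate are hypothetical helper names.

# $1: extended regular expression, $2: candidate string T/MM/DD/d/HH.
# Succeeds when the whole candidate matches (assumed to be what smartd does).
schedule_matches() {
    printf '%s\n' "$2" | grep -Eqx -- "$1"
}

# Build the candidate string for test type $1 (S or L) at the current hour.
# date's %u prints the day of the week as 1 (Monday) to 7 (Sunday).
current_candidate() {
    printf '%s/%s\n' "$1" "$(date '+%m/%d/%u/%H')"
}
```

Usage:

    sched='(S/../.././12|L/../../7/06)'
    schedule_matches "$sched" "L/08/22/7/06" && echo "long test due"
    schedule_matches "$sched" "$(current_candidate S)" && echo "short test due"

A day-of-week field written with two digits, like 06, never matches,
because that field is a single digit in the string.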
Wonko