[SOLVED] Re: smartd
From: Andy Smith
Date: Sat, 22 Jan 2022 19:07:23 +
> If the drive is currently not in use then it may be simpler to just
> write over the entire drive with a simple
> # dd if=/dev/zero of=/dev/sda

When convenient I will get another drive and substitute it in the machine. Then the dodgy drive can be written over and I can decide whether to scrap it.

Meanwhile I have a nice laptop with Debian 11.1 installed. If the desktop system fails I move the SD card to the laptop and carry on working as if nothing happened. The desktop system can be resurrected when convenient.

> I hope none, because you use RAID.

Being ignorant about RAID I had to read here.
https://en.wikipedia.org/wiki/RAID
No doubt invaluable for a server with a large quantity of dynamic data. For my work the SD card and spare machine seem adequate. When the desktop system crashes I can lose a few hours of editing or an email message. It's tolerable.

Thanks,... P.
[SOLVED] Re: smartd
From: Dan Ritter
Date: Sat, 22 Jan 2022 13:41:17 -0500
> Then, you have a choice: if the number doesn't increase over,
> say, the next week, it's just a bad patch.

That's the case. The number isn't increasing.

> You should do a backup ASAP.

Backup function described here.
https://lists.debian.org/debian-user/2022/01/msg00863.html

Thanks for the information about smartd.
... P.
Re: smartd
On Sun, 23 Jan 22, 19:09:48, Linux-Fan wrote:
> pe...@easthope.ca writes:
> >
> > I knew nothing of RAID. Therefore read here.
> > https://en.wikipedia.org/wiki/RAID
> >
> > Reliability is more valuable to me than speed. RAID 0 won't help.
> > For reliability I need a mirrored 2nd drive in the host; RAID 1 or
> > higher.
> >
> > Google of "site:wiki.debian.org raid" returned ten pages, each quite
> > specialized and jargonified. A few tips to establish mirroring can
> > help.
>
> Here, it returns a few results, too. I think the most straight-forward is
> this one:
>
> https://wiki.debian.org/SoftwareRAID
>
> For most purposes, I recommend RAID1. If you have four HDDs of identical
> size, RAID10 might be tempting, too, but I'd still consider and possibly
> prefer just creating two independent RAID1 arrays.
>
> If you want to configure it from the installer, these step-by-step
> instructions show all the relevant installer screens:
>
> https://sleeplessbeastie.eu/2013/10/04/how-to-configure-software-raid1-during-installation-process/
>
> Also, keep in mind that establishing the mirroring is not all you need to
> do. To really profit from the enhanced reliability, you need to play through
> the recovery scenario, too. I recommend doing this in a VM unless you have
> some dedicated machine with at least two HDDs to play with.

Another thing to consider is that Linux Software RAID (also known as "md" or "mdadm" RAID) by itself doesn't have any integrity checking. In case one of the drives returns bad data[1] it may end up overwriting the good data on the other drives[2][3].

It's possible to add an integrity checking layer, but in my opinion at that point the whole setup becomes so complex one might as well be using btrfs or ZFS instead. Both have built-in integrity checking and can recover the data provided there is at least one good copy available[4], in addition to the many other features they bring (logical volume management, snapshots, copy-on-write, etc.).

For the avoidance of doubt, neither is a replacement for backups[5].

[1] Cosmic rays flipped a bit, bad drive, bad cable, bad controller, etc.
[2] https://unixsheikh.com/articles/battle-testing-zfs-btrfs-and-mdadm-dm.html
[3] A RAID 1 can have more than just 2 drives, it's just uncommon in home setups because of cost reasons.
[4] It's possible to use both btrfs and ZFS without redundancy. They will be able to tell your data is corrupted, but won't be able to recover it, of course.
[5] http://taobackup.com/

Kind regards,
Andrei
-- 
http://wiki.debian.org/FAQsFromDebianUser
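Although md does not verify data on ordinary reads, the kernel can be asked to compare the copies in an array on request. The following is only a sketch under assumptions: the array is called /dev/md0, the sysfs files sync_action and mismatch_cnt are the ones described in the kernel's md documentation, and the function names are made up for illustration. A non-zero result says the copies differ; it does not say which copy is the good one, which is the limitation described above.

```shell
#!/bin/sh
# Sketch: ask md to compare the copies in an array and report the result.
# MD_SYSFS defaults to the sysfs directory of /dev/md0 (an assumed array name).
MD_SYSFS=${MD_SYSFS:-/sys/block/md0/md}

start_check() {
    # Needs root. "check" reads all copies and counts differences;
    # it does not decide which copy is correct.
    echo check > "$MD_SYSFS/sync_action"
}

report_mismatches() {
    # Read the counter left behind by the last check.
    count=$(cat "$MD_SYSFS/mismatch_cnt")
    if [ "$count" -eq 0 ]; then
        echo "no mismatches found"
    else
        echo "mismatch count: $count"
    fi
}
```

Typical use would be start_check as root, waiting until /proc/mdstat no longer shows a check in progress, then report_mismatches.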
Re: smartd
On Sunday, January 23, 2022 10:57:53 AM to...@tuxteam.de wrote:
> On Sun, Jan 23, 2022 at 08:14:06AM -0700, Charles Curley wrote:
> > On Sun, 23 Jan 2022 11:09:47 +
> >
> > Andy Smith wrote:
> > > Yes. When a drive sector goes bad, the drive cannot read from it, so
> > > you get an error in Linux when a read is attempted.
> >
> > As I understand things, that isn't entirely correct. From what I
> > understand:
> >
> > If the drive can read a sector without error, it passes the data to the
> > OS and that's it.
> >
> > If it gets an error, it uses cyclical redundancy check (CRC) data to
> > reconstruct the data. If that fails, it reports an error to the OS. If
> > the CRC reconstruction is successful, the drive re-writes the sector
> > and passes the reconstructed data back to the OS.
>
> It is actually more complicated than this. As I understand this Wikipedia
> entry [1], some errors while reading a block are to be expected: it
> seems to be more profitable to push the density to the limit where error
> correction picks up some rest. Only when the error rate surpasses some
> threshold is the block remapped.
>
> I guess SMART counts the latter events, but actually I have no idea :)
>
> And the error correction codes are a bit more sophisticated than plain
> CRC: Reed-Solomon or, more modern, low-density parity-check codes.

I would guess that the actual details vary depending on the manufacturer and the revision level of the manufacturer's firmware on the drive.
Re: smartd
On Sun, 23 Jan 2022 18:41:36 +
Andy Smith wrote:

> If wanting to play around with mdraid you can do it with loop
> devices created from image files on your regular filesystem.

Nice, thank you.

One would probably have to install mdadm:

# apt install mdadm

> for i in a b; do sudo losetup -f fake_disk_{$i}.img; done

Typo: fake_disk_${i}.img
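The difference between the two spellings can be checked without touching any devices. A minimal sketch (the function names are made up for illustration): in `{$i}` the braces are literal characters, so the resulting name no longer matches the file that fallocate created.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# As typed in the quoted command: the braces stay in the file name.
name_as_typed() {
    i=$1
    echo fake_disk_{$i}.img
}

# As intended: ${i} is plain parameter expansion.
name_as_intended() {
    i=$1
    echo fake_disk_${i}.img
}

name_as_typed a       # prints fake_disk_{a}.img
name_as_intended a    # prints fake_disk_a.img
```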
Re: smartd
Hello,

On Sun, Jan 23, 2022 at 07:09:48PM +0100, Linux-Fan wrote:
> To really profit from the enhanced reliability, you need to play
> through the recovery scenario, too. I recommend doing this in a VM
> unless you have some dedicated machine with at least two HDDs to
> play with.

If wanting to play around with mdraid you can do it with loop devices created from image files on your regular filesystem.

$ cd /var/tmp
$ for i in a b; do fallocate -l 100M fake_disk_${i}.img; done
$ for i in a b; do sudo losetup -f fake_disk_{$i}.img; done
$ sudo mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/loop[01]
$ sudo mkfs.ext4 /dev/md0
$ sudo mount /dev/md0 /mnt

You can then practice removing, adding, failing etc. the loop devices. When done playing around just unmount, stop array, losetup -d each loop device then delete the files.

Cheers,
Andy
-- 
https://bitfolk.com/ -- No-nonsense VPS hosting
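The steps above, including the cleanup, can be sketched as a small script. This is an illustration under assumptions, not a tested recipe: the function names are hypothetical, truncate is used in place of fallocate (it creates sparse files of the same size), the file name uses the `${i}` form, and `losetup --find --show` prints whichever loop device it picked so the script does not rely on /dev/loop0 and /dev/loop1 being free. Everything except make_images needs root and mdadm installed.

```shell
#!/bin/bash
# Sketch of a throwaway RAID1 on loop devices. Run in a scratch
# directory such as /var/tmp. Function names are hypothetical.

make_images() {
    # One 100 MiB sparse file per fake disk.
    for i in "$@"; do
        truncate -s 100M "fake_disk_${i}.img"
    done
}

attach_images() {
    # --find picks a free loop device, --show prints its name.
    for i in "$@"; do
        losetup --find --show "fake_disk_${i}.img"
    done
}

build_array() {
    # Arguments: the loop devices printed by attach_images.
    mdadm --create --verbose /dev/md0 --level=1 --raid-devices="$#" "$@"
    mkfs.ext4 /dev/md0
    mount /dev/md0 /mnt
}

teardown() {
    # Arguments: the same loop devices.
    umount /mnt
    mdadm --stop /dev/md0
    for dev in "$@"; do
        losetup -d "$dev"
    done
    rm -f fake_disk_*.img
}

# Typical session, as root:
#   make_images a b
#   devs=$(attach_images a b)
#   build_array $devs
#   cat /proc/mdstat
#   teardown $devs
```

Between build_array and teardown, a member can be failed and re-added with mdadm's manage mode, for example `mdadm /dev/md0 --fail /dev/loopN --remove /dev/loopN` followed by `mdadm /dev/md0 --add /dev/loopN`, where /dev/loopN stands for one of the devices printed earlier.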
Re: smartd
pe...@easthope.ca writes:

> From: Andy Smith
> Date: Sat, 22 Jan 2022 19:07:23 +
> > ... you use RAID.
>
> I knew nothing of RAID. Therefore read here.
> https://en.wikipedia.org/wiki/RAID
>
> Reliability is more valuable to me than speed. RAID 0 won't help.
> For reliability I need a mirrored 2nd drive in the host; RAID 1 or
> higher.
>
> Google of "site:wiki.debian.org raid" returned ten pages, each quite
> specialized and jargonified. A few tips to establish mirroring can
> help.

Here, it returns a few results, too. I think the most straight-forward is this one:

https://wiki.debian.org/SoftwareRAID

For most purposes, I recommend RAID1. If you have four HDDs of identical size, RAID10 might be tempting, too, but I'd still consider and possibly prefer just creating two independent RAID1 arrays.

If you want to configure it from the installer, these step-by-step instructions show all the relevant installer screens:

https://sleeplessbeastie.eu/2013/10/04/how-to-configure-software-raid1-during-installation-process/

Also, keep in mind that establishing the mirroring is not all you need to do. To really profit from the enhanced reliability, you need to play through the recovery scenario, too. I recommend doing this in a VM unless you have some dedicated machine with at least two HDDs to play with.

[...]

HTH and YMMV
Linux-Fan
Re: smartd
From: Andy Smith
Date: Sat, 22 Jan 2022 19:07:23 +
> ... you use RAID.

I knew nothing of RAID. Therefore read here.
https://en.wikipedia.org/wiki/RAID

Reliability is more valuable to me than speed. RAID 0 won't help. For reliability I need a mirrored 2nd drive in the host; RAID 1 or higher.

Google of "site:wiki.debian.org raid" returned ten pages, each quite specialized and jargonified. A few tips to establish mirroring can help.

> If this drive is in use already then you possibly want to know which
> files are affected by these bad sectors. I hope none, because you
> use RAID. But if you need to know, I have done that before and can
> dig out the scripts…

Seems more efficient to establish good reliability. Then, if a drive fails, recycle and replace.

Thanks,... P.
Re: smartd
On Sun, Jan 23, 2022 at 08:14:06AM -0700, Charles Curley wrote:
> On Sun, 23 Jan 2022 11:09:47 +
> Andy Smith wrote:
>
> > Yes. When a drive sector goes bad, the drive cannot read from it, so
> > you get an error in Linux when a read is attempted.
>
> As I understand things, that isn't entirely correct. From what I
> understand:
>
> If the drive can read a sector without error, it passes the data to the
> OS and that's it.
>
> If it gets an error, it uses cyclical redundancy check (CRC) data to
> reconstruct the data. If that fails, it reports an error to the OS. If
> the CRC reconstruction is successful, the drive re-writes the sector
> and passes the reconstructed data back to the OS.

It is actually more complicated than this. As I understand this Wikipedia entry [1], some errors while reading a block are to be expected: it seems to be more profitable to push the density to the limit where error correction picks up some rest. Only when the error rate surpasses some threshold is the block remapped.

I guess SMART counts the latter events, but actually I have no idea :)

And the error correction codes are a bit more sophisticated than plain CRC: Reed-Solomon or, more modern, low-density parity-check codes.

Cheers
-- 
tomás
Re: smartd
> You should do a backup ASAP.

Personal data is on a micro SD card. After doing something worth saving, I back it up to the host drive by running this bash function.

    Backup() {
        if [ "$#" -gt 1 ]; then
            echo "Too many arguments."
        else
            echo "0 or 1 arguments are OK."
            # No argument selects MY0.Bak, one argument selects MY1.Bak.
            if [ "$#" -eq 0 ]; then
                echo "0 arguments is OK."
                destination=~/MY0.Bak
            else
                echo "1 argument is OK."
                destination=~/MY1.Bak
            fi
            echo "destination is $destination."
            echo "Executing rsync."
            rsync -auv /home/peter/MY/* "$destination"
            # Reminder to clean the mailbox.
            /bin/ls -ld ~/MY/MailMessages
            printf "du -s %s gives " "$destination"
            du -s "$destination"
        fi
    }

"ls ... MailMessages" just reminds me to clean the mailbox.

The SD is used in multiple machines at two sites. So my data is fairly well protected. If the SD fails, an inverse script restores data from a host drive to a new SD.

If a meteorite goes through a machine, I get the holes in the case welded up and replace destroyed internals. If the drive is replaced, I reinstall and configure the system. A nuisance but not a catastrophe.

If an asteroid or meteorite shower or volcano destroys the SD card and all machines where it's backed up, and I survive, data probably won't be a high priority but I can look for an old backup DVD.

Thx,... P.
Re: smartd
On Sun, 23 Jan 2022 11:09:47 +
Andy Smith wrote:

> Yes. When a drive sector goes bad, the drive cannot read from it, so
> you get an error in Linux when a read is attempted.

As I understand things, that isn't entirely correct. From what I understand:

If the drive can read a sector without error, it passes the data to the OS and that's it.

If it gets an error, it uses cyclical redundancy check (CRC) data to reconstruct the data. If that fails, it reports an error to the OS. If the CRC reconstruction is successful, the drive re-writes the sector and passes the reconstructed data back to the OS.

If the attempt to re-write the sector fails, the drive allocates a spare sector, writes that, and notes the mapping in its sector reallocation table.

There may be multiple efforts to re-write a sector, either in place or reallocated.

And there's always the possibility that the sector reallocation table will go bad.
Re: smartd
Hello,

On Sat, Jan 22, 2022 at 09:16:53PM -0800, pe...@easthope.ca wrote:
> From: Andy Smith
> Date: Sat, 22 Jan 2022 19:07:23 +
> > You are better off finding the damaged sectors and causing the drive
> > to remap them by writing new content in there. Then you don't have
> > to keep track yourself of which sections of the disk are unusable.
>
> I don't understand how bad sectors are "remapped". The process is
> internal to the drive?

Yes. When a drive sector goes bad, the drive cannot read from it, so you get an error in Linux when a read is attempted. But if you are *writing* to it, if a modern drive can't do the write it just writes the data to a spare sector and remaps that sector location to the location of the formerly spare one. The operating system is unaware that this has happened, though it is recorded in SMART attributes (the reallocated sector count).

So overwriting bad sectors will make the problem go away until there are no more spare sectors.

> Depends on Linux software?

No, anything that can write to the drive will work, which is why I suggested dd over the whole drive if you aren't currently using it. hdparm makes it easy to write a specific sector but it's also possible with dd and its "seek" and "count" arguments. If you are careful.

> What about connecting the drive to another system and applying
> fsck to each part?

What would be the goal? A SMART long self-test should tell you which bits are unreadable.

> > Consumer HDDs usually have a few hundred spare sectors for
> > remapping.
>
> What happens when all spare sectors are allocated?

The next time a sector goes bad it would not be fixable by writing to it and there would be a part of the drive that is permanently unusable. In the old days the "badblocks" tool would be used to find these areas and avoid their use. These days we let drives remap bad areas and replace either pro-actively or when they can't remap any more.

Drives often encounter severe problems before they get as far as using all their spare sectors. They send so many errors up to Linux that Linux disconnects the whole device.

> Any indication to prevent silent loss of data?

When a sector goes bad, whatever data was in there is now lost. Since you cannot prevent drives from failing, appropriate countermeasures include:

- Introducing redundancy with RAID or filesystems that have it built in, like btrfs or zfs
- Having good backups

Both are generally considered a good idea. With redundancy no data would be lost and a tedious recovery process involving your backups is turned into a more mundane process of replacing a failed drive.

You also need to monitor both of those to make sure they are functioning properly.

Cheers,
Andy
-- 
https://bitfolk.com/ -- No-nonsense VPS hosting
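For the dd route mentioned above, a minimal sketch of overwriting one sector follows. It assumes 512-byte sectors, and zero_sector is a hypothetical helper name. The position in the output is given with seek= (skip= does the same for the input side), and conv=notrunc matters when practising on a regular file, because without it dd truncates the file after the written block.

```shell
#!/bin/sh
# Sketch: overwrite a single sector with zeros using dd.
# zero_sector is a hypothetical helper; 512-byte sectors are assumed.
zero_sector() {
    target=$1      # image file or block device
    lba=$2         # sector number to overwrite
    # seek= positions in the OUTPUT; conv=notrunc keeps a regular file
    # from being cut off after the written block.
    dd if=/dev/zero of="$target" bs=512 seek="$lba" count=1 conv=notrunc 2>/dev/null
}

# Try it on a scratch file first, never on a disk holding data you want:
#   head -c 4096 /dev/zero | tr '\0' 'A' > scratch.img
#   zero_sector scratch.img 3
```

On a real drive the target would be the whole-disk device and the sector number would come from the SMART self-test log; a wrong number destroys whatever was stored in that sector.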
Re: smartd
From: Andy Smith
Date: Sat, 22 Jan 2022 19:07:23 +
> > Two parts are available to mount /root; /root can be on /dev/sda1 or
> > /dev/sda2.
>
> I don't understand what you mean by this statement.

I should have referred to / rather than /root.

peter@joule:/home/peter$ lsblk --list | grep '\(N\|sda\)'
NAME MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda    8:0    0 149.1G  0 disk
sda1   8:1    0     7G  0 part
sda2   8:2    0     7G  0 part /
sda3   8:3    0     8G  0 part [SWAP]
sda4   8:4    0   127G  0 part /home

Currently / is in sda2 and sda1 is not used. If the faulty media is strictly in sda2, it can be avoided by shifting / to sda1.

> You are better off finding the damaged sectors and causing the drive
> to remap them by writing new content in there. Then you don't have
> to keep track yourself of which sections of the disk are unusable.

I don't understand how bad sectors are "remapped". The process is internal to the drive? Depends on Linux software?

What about connecting the drive to another system and applying fsck to each part? Then decide whether to scrap the drive.

> Consumer HDDs usually have a few hundred spare sectors for
> remapping.

What happens when all spare sectors are allocated? Any indication to prevent silent loss of data?

Thanks,... P.
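Whether a failing sector lies in sda1 or sda2 can be worked out from the partition start and size values the kernel exposes in sysfs. The following is a sketch with hypothetical helper names; it assumes the sysfs values and the LBA reported by SMART are both counted in 512-byte sectors.

```shell
#!/bin/sh
# Sketch: find which partition a failing LBA belongs to.

# Print one "name start size" line per partition of a disk, read from
# /sys/block/<disk>/<partition>/start and .../size.
list_partitions() {
    for p in /sys/block/"$1"/"$1"[0-9]*; do
        [ -e "$p/start" ] || continue
        echo "${p##*/} $(cat "$p/start") $(cat "$p/size")"
    done
}

# Read such a table on stdin and print the partition containing the LBA.
partition_of() {
    awk -v lba="$1" '
        lba >= $2 && lba < $2 + $3 { print $1; found = 1 }
        END { if (!found) print "none" }
    '
}

# Usage: list_partitions sda | partition_of 9519790
```

A result of "none" means the sector lies outside every partition, for example in the gap before the first one.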
Re: smartd
On Sat, 22 Jan 2022 09:18:27 -0800
pe...@easthope.ca wrote:

> Jan 22 08:49:17 joule smartd[563]: Device: /dev/sda [SAT], 155 Currently
> unreadable (pending) sectors
> Jan 22 08:49:17 joule smartd[563]: Sending warning via
> /usr/share/smartmontools/smartd-runner to root ...
> Jan 22 08:49:18 joule smartd[563]: Warning via
> /usr/share/smartmontools/smartd-runner to root: successful
> Jan 22 08:49:18 joule smartd[563]: Device: /dev/sda [SAT], 132
> Offline uncorrectable sectors

Unless you have a supply of replacement hard drives handy, I'd order a new one now, then worry about the details of this one. My recent experience with hard drives and lead times due to shipping times is not encouraging.
Re: smartd
Hello,

On Sat, Jan 22, 2022 at 09:18:27AM -0800, pe...@easthope.ca wrote:
> smartd reports to syslog.
>
> Jan 22 08:49:17 joule smartd[563]: Device: /dev/sda [SAT], 155 Currently
> unreadable (pending) sectors
> Jan 22 08:49:17 joule smartd[563]: Sending warning via
> /usr/share/smartmontools/smartd-runner to root ...
> Jan 22 08:49:18 joule smartd[563]: Warning via
> /usr/share/smartmontools/smartd-runner to root: successful
> Jan 22 08:49:18 joule smartd[563]: Device: /dev/sda [SAT], 132 Offline
> uncorrectable sectors
>
> Two parts are available to mount /root; /root can be on /dev/sda1 or
> /dev/sda2.

I don't understand what you mean by this statement. Either the disk is already partitioned and / (you did mean "/", right, not "/root"?) is on a known partition, or the disk isn't yet partitioned and / can be on any partition you set it to be on.

> If the errors are clustered, the bad area might be avoided easily
> in partitioning.

You are better off finding the damaged sectors and causing the drive to remap them by writing new content in there. Then you don't have to keep track yourself of which sections of the disk are unusable.

> Feasible? Can the locations of the errors be found?

Sure. Usually.

If the drive is currently not in use then it may be simpler to just write over the entire drive with a simple

# dd if=/dev/zero of=/dev/sda

That should force a remap of any damaged sectors.

If you need to preserve what's currently on the drive then you can use a SMART long self-test to try reading the whole drive. It should report which LBA (sector) it got to when the test failed.

To start the test:

# smartctl -t long /dev/sda

To see the status of the test:

# smartctl -l selftest /dev/sda

You can instead do a "selective" test, to only test between certain sector numbers.

Once you know the sector number you can verify that there are issues by trying to read it with hdparm:

# hdparm --read-sector 9519790 /dev/sda

If that sector is truly damaged then this will show an error and complaints in syslog. You can force that sector to be written over with zeros, obviously losing anything that was in it, again with hdparm:

# hdparm --yes-i-know-what-i-am-doing --write-sector 9519790 /dev/sda

This should force a remap and will complete successfully. If it doesn't then the drive might be out of spare sectors, or is more severely damaged, and it's done for.

If this drive is in use already then you possibly want to know which files are affected by these bad sectors. I hope none, because you use RAID. But if you need to know, I have done that before and can dig out the scripts…

> Better to replace the drive?

Consumer HDDs usually have a few hundred spare sectors for remapping. If I have a less important machine with a couple of bad sectors I'll often be willing to force a remap like this. Seeing 155 bad sectors in a SMART report would worry me for any machine. But it's your call.

Cheers,
Andy
-- 
https://bitfolk.com/ -- No-nonsense VPS hosting
Re: smartd
pe...@easthope.ca wrote:
> smartd reports to syslog.
>
> Jan 22 08:49:17 joule smartd[563]: Device: /dev/sda [SAT], 155 Currently
> unreadable (pending) sectors
> Jan 22 08:49:17 joule smartd[563]: Sending warning via
> /usr/share/smartmontools/smartd-runner to root ...
> Jan 22 08:49:18 joule smartd[563]: Warning via
> /usr/share/smartmontools/smartd-runner to root: successful
> Jan 22 08:49:18 joule smartd[563]: Device: /dev/sda [SAT], 132 Offline
> uncorrectable sectors
>
> Two parts are available to mount /root; /root can be on /dev/sda1 or
> /dev/sda2. /home is used minimally. If the errors are clustered, the
> bad area might be avoided easily in partitioning.
>
> Feasible? Can the locations of the errors be found? Better to
> replace the drive?

Offline uncorrectable is bad.

You should do a backup ASAP.

Then, you have a choice: if the number doesn't increase over, say, the next week, it's just a bad patch. If it does increase, the drive is bad and needs to be replaced.

-dsr-
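Watching whether the number increases can be done from the same syslog messages quoted above. A minimal sketch, assuming the message wording shown in this thread and that each message arrives in the log as a single line; pending_count and verdict are hypothetical helper names, and the wording of the verdicts is only a paraphrase of the advice above.

```shell
#!/bin/sh
# Sketch: pull the pending-sector count out of smartd's syslog messages.
# Reads log text on stdin and prints the count from the last matching line.
pending_count() {
    sed -n 's/.*, \([0-9][0-9]*\) Currently unreadable (pending) sectors.*/\1/p' | tail -n 1
}

# Compare against a value noted earlier, for example a week ago.
verdict() {
    before=$1
    now=$2
    if [ "$now" -gt "$before" ]; then
        echo "increasing: plan to replace the drive"
    else
        echo "stable: keep watching"
    fi
}

# Usage: grep smartd /var/log/syslog | pending_count
```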
Re: smartd monitoring your hard drives...
This statement surprises me. I run the long tests every month and my disks are always active (hdparm -B 255). Have I misunderstood something?

Toni Mas

On Monday, 8 March 2021 at 10:55, Orestes Mas wrote:

> On 5 March 2021 20:30:49 CET, joanarboc...@calbasi.net wrote:
>
> > In this howto:
> > https://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu
> > I saw that smartd can be kept running permanently as a daemon and,
> > besides passing information to syslog, can also send a mail if there
> > is a problem.
> > I wanted to know whether you use this option and whether it has any
> > drawback...
> > On the other hand, I see that a long test can take a long time. For
> > example: sudo smartctl -t long /dev/sdb
> > so I suppose this is not the kind of test the smartd daemon runs,
> > right?
> > See you,
>
> As far as I know, this kind of test (short as well as long) does its
> checks only while the disk is idle. So it shouldn't affect normal
> operation.
>
> Another matter is when the disk considers itself "idle". If it does so too
> soon after the user's last read/write, performance could indeed drop,
> because the "test" would be interleaved with normal operations. If instead
> it makes sure the user really isn't doing anything before continuing with
> the "test", the performance impact will probably be small.
>
> Orestes
Re: smartd monitoring your hard drives...
Hi Joan

> When you say you didn't run tests because you first waited for an
> alert, do you mean that smartd does checks DIFFERENT from the
> tests? And are those the ones that raise the warning?

It checks the SMART values (they vary with the type of disk), the disk's error log and also the results of any tests that have been run. But it is very configurable; for example you can set it up to warn you if the disk exceeds a given temperature.

A typical error we often got on the disks of our ceph cluster was a change in the "Current Pending Sector" counter. When that happened we tried low-level formatting the disk to reset the counter to zero, but later on the alert would fire again, and we concluded it was an indicator that the disk would eventually need to be replaced.

Cheers,
Alex
Re: smartd monitoring your hard drives...
On Fri, 5 Mar 2021 23:31:11 +0100
Alex Muntada wrote:

> Hi Joan
>
> > I wanted to know whether you use this option and whether it has any
> > drawback...
>
> On the physical servers we managed at my former job we always
> installed smartmontools with the daemon running, and we had
> mail alerts and checks with nagios.
>
> > On the other hand, I see that a long test can take a long
> > time.
> >
> > so I suppose this is not the kind of test the smartd daemon
> > runs, right?
>
> It can run short and long ones, but by default it runs none.
> You have a pile of examples in /etc/smartd.conf. We didn't run
> tests until an alert came up, so as not to affect performance
> during the day or the backups at night.

When you say you didn't run tests because you first waited for an alert, do you mean that smartd does checks DIFFERENT from the tests? And are those the ones that raise the warning?

> Cheers,
> Alex

-- 
Joan Cervan i Andreu
http://personal.calbasi.net
Re: smartd monitoring your hard drives...
On 5 March 2021 20:30:49 CET, Joan wrote:
>In this howto:
>
>https://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu
>
>I saw that smartd can be kept running permanently as a daemon and,
>besides passing information to syslog, can also send a mail if there
>is a problem.
>
>I wanted to know whether you use this option and whether it has any
>drawback...
>
>On the other hand, I see that a long test can take a long time. For
>example: sudo smartctl -t long /dev/sdb
>
>so I suppose this is not the kind of test the smartd daemon runs,
>right?
>
>See you,

As far as I know, this kind of test (short as well as long) does its checks only while the disk is idle. So it shouldn't affect normal operation.

Another matter is when the disk considers itself "idle". If it does so too soon after the user's last read/write, performance could indeed drop, because the "test" would be interleaved with normal operations. If instead it makes sure the user really isn't doing anything before continuing with the "test", the performance impact will probably be small.

Orestes
Re: smartd monitoring your hard drives...
Hi Joan

> I wanted to know whether you use this option and whether it has any
> drawback...

On the physical servers we managed at my former job we always installed smartmontools with the daemon running, and we had mail alerts and checks with nagios.

> On the other hand, I see that a long test can take a long
> time.
>
> so I suppose this is not the kind of test the smartd daemon
> runs, right?

It can run short and long ones, but by default it runs none. You have a pile of examples in /etc/smartd.conf. We didn't run tests until an alert came up, so as not to affect performance during the day or the backups at night.

Cheers,
Alex
Re: smartd monitoring your hard drives...
I have a monthly long test scheduled at night on the servers. That way, in principle, it goes over all the sectors. Afterwards you have to read the mails to see how it went, of course.

The short tests, and I suppose the daemon too, check some parameters that are considered likely to cause a failure or to be a symptom of one.

I don't know how much overhead it adds to keep it always active. Maybe I need to rethink the strategy I have followed so far; other opinions are welcome.

Daniel

On 5/3/21 at 20:30, Joan wrote:
> In this howto:
>
> https://www.howtoforge.com/checking-hard-disk-sanity-with-smartmontools-debian-ubuntu
>
> I saw that smartd can be kept running permanently as a daemon and,
> besides passing information to syslog, can also send a mail if there
> is a problem.
>
> I wanted to know whether you use this option and whether it has any
> drawback...
>
> On the other hand, I see that a long test can take a long time. For
> example: sudo smartctl -t long /dev/sdb
>
> so I suppose this is not the kind of test the smartd daemon runs,
> right?
>
> See you,
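A monthly long test like the one described can be scheduled by smartd itself through a directive in /etc/smartd.conf. The sketch below only prints such a directive. The helper name, device, day and hour are assumptions; the -s schedule format T/MM/DD/d/HH and the -a and -m directives are the ones described in the smartd.conf manual page, which is worth checking before relying on this.

```shell
#!/bin/sh
# Sketch: print a smartd.conf directive that schedules a monthly long
# self-test. The helper name and the chosen day/hour are assumptions.
monthly_long_test_line() {
    dev=$1     # e.g. /dev/sda
    day=$2     # day of month, two digits
    hour=$3    # hour of day, two digits
    # -a: default set of checks; -s: self-test schedule as T/MM/DD/d/HH
    # (L = long test, dots match any value); -m: address for warning mail.
    printf '%s -a -s L/../%s/./%s -m root\n' "$dev" "$day" "$hour"
}

monthly_long_test_line /dev/sda 01 03
# prints: /dev/sda -a -s L/../01/./03 -m root
```

The printed line would go into /etc/smartd.conf, next to the commented examples mentioned earlier in the thread, and smartd would need to be restarted to pick it up.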
Re: smartd
Hello,

On Sunday 19 October 2014, Philippe Delavalade wrote...

> The following warning/error was logged by the smartd daemon:
> Device: /dev/sdc [SAT], 1 Currently unreadable (pending) sectors
> Device info: WDC WD5000AAKS-00V1A0, S/N:WD-WCAWF2033813, WWN:5-0014ee-1ad0956f1, FW:05.01D05, 500 GB
>
> Does anyone have any advice?

Yes, think about investing in another disk. I have already had disks that start with a few unreadable (pending) sectors. Then more and more. Then the disk ends up dropping out of the RAID, and I like to have a spare at hand immediately.

-- 
jm
Re: smartd
Sorry for the off-list reply.

Jean-Michel OLTRA, Sunday 19 October at 12:02:

> Hello,
>
> On Sunday 19 October 2014, Philippe Delavalade wrote...
>
> > The following warning/error was logged by the smartd daemon:
> > Device: /dev/sdc [SAT], 1 Currently unreadable (pending) sectors
> > Device info: WDC WD5000AAKS-00V1A0, S/N:WD-WCAWF2033813, WWN:5-0014ee-1ad0956f1, FW:05.01D05, 500 GB
> >
> > Does anyone have any advice?
>
> Yes, think about investing in another disk. I have already had disks that
> start with a few unreadable (pending) sectors. Then more and more. Then
> the disk ends up dropping out of the RAID, and I like to have a spare at
> hand immediately.

That is perhaps a bit drastic :-) Is the risk that I lose the disk?

I am not running RAID, which is perhaps even more serious...

-- 
Ph. Delavalade
Re: smartd
Hi,

On Sunday 19 Oct 2014 at 12:30:48 (+0200), Philippe Delavalade wrote:

> That is perhaps a bit drastic :-) Is the risk that I lose the disk?

Not immediately, but the risk is that the situation gets worse, and that it then goes very fast, with the risk of ending up with part or all of the disk unreadable.

> I am not running RAID, which is perhaps even more serious...

Yes, in that case the first reflex is to back up the data immediately.
Re: smartd
Christophe Moille, Sunday 19 October at 12:48:

> Hi,
>
> On Sunday 19 Oct 2014 at 12:30:48 (+0200), Philippe Delavalade wrote:
>
> > That is perhaps a bit drastic :-) Is the risk that I lose the disk?
>
> Not immediately, but the risk is that the situation gets worse, and that
> it then goes very fast, with the risk of ending up with part or all of
> the disk unreadable.

OK. Noted.

> > I am not running RAID, which is perhaps even more serious...
>
> Yes, in that case the first reflex is to back up the data immediately.

I make one every day, so no worries on that side.

-- 
Ph. Delavalade
Re: smartd message
On Sat, 13 May 2006, Christian Christmann wrote:
> Device: /dev/hda, 1 Offline uncorrectable sectors

You need to write over that sector, so that the HD can remap it. Look at the SMART error log to find the sector number. Use the smartctl program to do it (man smartctl will tell you how).

If your HD keeps doing this, get rid of it.

You *did* lose the information on that sector.

-- 
One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie. -- The Silicon Valley Tarot
Henrique Holschuh

-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
Re: smartd message
Christian Christmann wrote:
> Hi,
>
> smartd keeps generating the following e-mail (sent to root) on my Debian
> Etch system:
>
> [snip]
> The following warning/error was logged by the smartd daemon:
> Device: /dev/hda, 1 Offline uncorrectable sectors

I'd get a new HD. In my experience SMART errors only show up when a drive is on its way out.

Michael
Re: smartd message
On Wednesday 19 April 2006 01:42, Florian wrote:
> Is there actually a list for understanding/translating the results?

There was a fairly long article on this topic in c't 23/04.

Regards,
-mg-
Re: smartd message
* Florian [EMAIL PROTECTED] wrote:
> Is there actually a list for understanding/translating the results? Or is
> there a rule of thumb: the smaller the value, the more critical it gets!?

This article helped me:
http://www.linux-user.de/ausgabe/2004/10/056-smartmontools/

-- 
Frequently asked questions and answers (FAQ): http://www.de.debian.org/debian-user-german-FAQ/
To UNSUBSCRIBE, send a mail to [EMAIL PROTECTED] with the subject unsubscribe. Problems? Mail to [EMAIL PROTECTED] (in English)
Re: smartd message
Sven Hartge wrote:
> Ralph Stens [EMAIL PROTECTED] wrote:
> > Recently smartd on my server has been reporting the following:
> >
> > Prefailure: Seek_Time_Performance (8) changed to 244, 245, 246, 247
> >
> > How should I assess this message? Is it critical, so that I should
> > replace the hard disk immediately, or can I keep it running and ignore
> > the message for now (while continuing to watch it, of course)?
>
> If the value rapidly approaches 0, then it is time to worry.

Is there actually a list for understanding/translating the results? Or is there a rule of thumb: the smaller the value, the more critical it gets!?

For one disk I get a smartctl exit status as a warning, and I am not sure how urgent it is. I can't really find anything suitable on SF.net.

> S°

Regards,
Florian
Re: smartd message
Ralph Stens [EMAIL PROTECTED] wrote:
> Recently smartd on my server has been reporting the following:
>
> Prefailure: Seek_Time_Performance (8) changed to 244, 245, 246, 247
>
> How should I assess this message? Is it critical, so that I should
> replace the hard disk immediately, or can I keep it running and ignore
> the message for now (while continuing to watch it, of course)?

If the value rapidly approaches 0, then it is time to worry.

S°

-- 
Sven Hartge -- professional Unix geek
My thoughts on the net: http://www.svenhartge.de/
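Sven's rule of thumb (worry when the normalised value heads quickly towards 0) can be written down as a small check. What follows is only a sketch in Python, assuming you have collected successive normalised readings of one attribute yourself. The function name `is_worrying`, the default `threshold` of 0 and the `max_drop` cut-off are hypothetical choices made for illustration; smartd defines none of them.

```python
def is_worrying(values, threshold=0, max_drop=10):
    """Flag a series of normalised SMART values that is falling quickly
    or has already reached the threshold.

    values    -- successive normalised readings, oldest first
    threshold -- value at or below which the attribute counts as failed
    max_drop  -- hypothetical cut-off: the largest drop between two
                 consecutive readings that is still tolerated
    """
    if not values:
        return False
    if values[-1] <= threshold:
        return True
    # Positive numbers in `drops` mean the value went down.
    drops = [earlier - later for earlier, later in zip(values, values[1:])]
    return any(drop > max_drop for drop in drops)


# The values quoted in the thread rise slightly, so they are not flagged.
print(is_worrying([244, 245, 246, 247]))
# A made-up series that falls fast is flagged.
print(is_worrying([200, 150, 90, 30]))
```

In practice the threshold should be the per-attribute one the drive itself reports rather than 0, and a sensible `max_drop` depends on how often you sample.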
Re: Smartd: disk dead?
On Sat, Apr 30, 2005 at 11:45:49PM +0200, Georges Roux wrote:
> Hello,
>
> Smartd, the daemon that monitors the disk, has just mailed me this:
>
> ---
> This email was generated by the smartd daemon running on:
> host name: mimosa
> DNS domain: [Unknown]
> NIS domain: (none)
>
> The following warning/error was logged by the smartd daemon:
> Device: /dev/hda, ATA error count increased from 1215 to 1216
> [...]
>
> and in syslog I have this:
>
> Apr 29 10:37:28 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 10:37:28 mimosa smartd[3719]: Sending warning via mail to root ...
> Apr 29 10:37:28 mimosa smartd[3719]: Warning via mail to root: successful
> Apr 29 10:37:28 mimosa smartd[3719]: Device: /dev/hda, SMART Usage Attribute: 197 Current_Pending_Sector changed from 253 to 252
> Apr 29 10:37:28 mimosa smartd[3719]: Device: /dev/hda, ATA error count increased from 1215 to 1216
> Apr 29 10:37:28 mimosa smartd[3719]: Sending warning via mail to root ...
> Apr 29 10:37:28 mimosa smartd[3719]: Warning via mail to root: successful
> Apr 29 11:07:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 11:37:28 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 12:07:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 12:37:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 13:07:28 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> [...]
> Apr 30 01:07:28 mimosa smartd[3719]: Device: /dev/hda, starting scheduled Short Self-Test.
> Apr 30 01:37:27 mimosa smartd[3719]: Device: /dev/hda, SMART Usage Attribute: 196 Reallocated_Event_Count changed from 253 to 252
> Apr 30 01:37:27 mimosa smartd[3719]: Device: /dev/hda, SMART Usage Attribute: 197 Current_Pending_Sector changed from 252 to 253
> Apr 30 03:07:28 mimosa smartd[3719]: Device: /dev/hda, starting scheduled Long Self-Test.
>
> Does this mean my disk is about to fail?

I don't have much experience in this area, but the two times I had a disk error like this, the suspect disk died within a few days at most.

So, if possible, try to back up your disk, and consider replacing it.

My current view is that disks are practically consumables: at their first failure, they should be replaced.

Good luck

-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basileatstarynkevitchdotnet
aliases: basileattunesdotorg = bstarynkatnerimdotnet
8, rue de la Faïencerie, 92340 Bourg La Reine, France

-- 
Remember to read the list FAQ before asking a question: http://wiki.debian.net/?DebianFrench
Remember to add the word ``spam'' to your From and Reply-To fields.
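The attribute-change lines in the syslog excerpt above follow a regular pattern, so they can be picked out mechanically when the log gets long. A minimal sketch in Python, assuming the line format shown in this thread (other smartd versions may word these lines differently). The function name `attribute_changes` and the regular expression are illustrative and are not part of smartmontools.

```python
import re

# Matches smartd attribute-change lines as they appear in this thread.
# Assumption: the wording "SMART <kind> Attribute: <id> <name> changed
# from <old> to <new>" holds; adjust the pattern if your smartd differs.
CHANGE_RE = re.compile(
    r"Device: (?P<device>[^,\s]+), SMART \w+ Attribute: "
    r"(?P<id>\d+) (?P<name>\S+) changed from (?P<old>\d+) to (?P<new>\d+)"
)


def attribute_changes(lines):
    """Return (device, id, name, old, new) for every attribute-change line."""
    changes = []
    for line in lines:
        match = CHANGE_RE.search(line)
        if match:
            changes.append((
                match.group("device"),
                int(match.group("id")),
                match.group("name"),
                int(match.group("old")),
                int(match.group("new")),
            ))
    return changes


# Three lines copied from the syslog excerpt in this thread.
sample = [
    "Apr 29 10:37:28 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors",
    "Apr 29 10:37:28 mimosa smartd[3719]: Device: /dev/hda, SMART Usage Attribute: 197 Current_Pending_Sector changed from 253 to 252",
    "Apr 30 01:37:27 mimosa smartd[3719]: Device: /dev/hda, SMART Usage Attribute: 196 Reallocated_Event_Count changed from 253 to 252",
]

for change in attribute_changes(sample):
    print(change)
```

Lines that only repeat the pending-sector warning are skipped, which leaves the handful of events where a normalised value actually moved.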
Re: Smartd: disk dead?
On 01.05.2005 11:01:36, Basile STARYNKEVITCH wrote:
> On Sat, Apr 30, 2005 at 11:45:49PM +0200, Georges Roux wrote:
> > [ .. ]
> > Does this mean my disk is about to fail?
>
> I don't have much experience in this area, but the two times I had a disk
> error like this, the suspect disk died within a few days at most.
>
> So, if possible, try to back up your disk, and consider replacing it.
>
> My current view is that disks are practically consumables: at their first
> failure, they should be replaced.

I vote for: backup, bin, replacement, restore.

> Good luck
> -- 
> Basile STARYNKEVITCH http://starynkevitch.net/Basile/

J-L
Re: Smartd: disk dead?
Basile STARYNKEVITCH wrote:
> On Sat, Apr 30, 2005 at 11:45:49PM +0200, Georges Roux wrote:
> > Hello,
> >
> > Smartd, the daemon that monitors the disk, has just mailed me this:
> >
> > ---
> > This email was generated by the smartd daemon running on:
> > host name: mimosa
> > DNS domain: [Unknown]
> > NIS domain: (none)
> >
> > The following warning/error was logged by the smartd daemon:
> > Device: /dev/hda, ATA error count increased from 1215 to 1216
>
> [...]
>
> My current view is that disks are practically consumables: at their first
> failure, they should be replaced.

I vote for that too.

> Good luck

-- 
dominix
Re: Smartd: disk dead?
Georges Roux wrote:
> Hello,
>
> Smartd, the daemon that monitors the disk, has just mailed me this:
>
> ---
> This email was generated by the smartd daemon running on:
> host name: mimosa
> DNS domain: [Unknown]
> NIS domain: (none)
>
> The following warning/error was logged by the smartd daemon:
> Device: /dev/hda, ATA error count increased from 1215 to 1216
>
> For details see host's SYSLOG (default: /var/log/messages).
> You can also use the smartctl utility for further investigation.
> No additional email messages about this problem will be sent.
> -
>
> and in syslog I have this:
>
> Apr 29 10:37:28 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 10:37:28 mimosa smartd[3719]: Sending warning via mail to root ...
> Apr 29 10:37:28 mimosa smartd[3719]: Warning via mail to root: successful
> Apr 29 10:37:28 mimosa smartd[3719]: Device: /dev/hda, SMART Usage Attribute: 197 Current_Pending_Sector changed from 253 to 252
> Apr 29 10:37:28 mimosa smartd[3719]: Device: /dev/hda, ATA error count increased from 1215 to 1216
> Apr 29 10:37:28 mimosa smartd[3719]: Sending warning via mail to root ...
> Apr 29 10:37:28 mimosa smartd[3719]: Warning via mail to root: successful
> Apr 29 11:07:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 11:37:28 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 12:07:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 12:37:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 13:07:28 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 13:37:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 14:07:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 14:37:28 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 15:07:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 15:37:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 16:07:28 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 16:37:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 17:07:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 17:37:28 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 18:07:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 18:37:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 19:07:28 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 19:37:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 20:07:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 20:37:28 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 21:07:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 21:37:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 22:07:28 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 22:37:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 23:07:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 29 23:37:28 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 30 00:07:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 30 00:37:27 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 30 01:07:28 mimosa smartd[3719]: Device: /dev/hda, 1 Currently unreadable (pending) sectors
> Apr 30 01:07:28 mimosa smartd[3719]: Device: /dev/hda, starting scheduled Short Self-Test.
> Apr 30 01:37:27 mimosa smartd[3719]: Device: /dev/hda, SMART Usage Attribute: 196 Reallocated_Event_Count changed from 253 to 252
> Apr 30 01:37:27 mimosa smartd[3719]: Device: /dev/hda, SMART Usage Attribute: 197 Current_Pending_Sector changed from 252 to 253
> Apr 30 03:07:28 mimosa smartd[3719]: Device: /dev/hda, starting scheduled Long Self-Test.
>
> Does this mean my disk is about to fail?
>
> Georges

Hi,

Run:

    smartctl -t short /dev/hdx

(replace x with the disk to check), then:

    smartctl -l selftest /dev/hdx

You will get information on the general state of your disk and the very theoretical amount of time it has left to live.

There are many parameters that go into establishing the health of your hard disk, and worrying about a single warning message may seem [...]
RE: Smartd frequent attribute change
> Does anybody else experience similar log events and if so, is there
> anything you did to stop this from happening?
>
> Thanks,
> Andre

Replace the drive. The error means that the ECC (Error Correcting Code) of the Barracuda is kicking in to correct corruption in the drive's on-board cache RAM.