On Thursday, 7 March 2019 10:10:53 GMT Peter Humphrey wrote:
> On Wednesday, 6 March 2019 16:31:27 GMT Laurence Perkins wrote:
> > On Fri, 2019-03-01 at 10:12 +0000, Peter Humphrey wrote:
> > > [OT]
> > > Evidence is mounting that the Atom box is in terminal decline. I get
> > > things like batches of files in the portage tree changing owner, and
> > > then when I correct that, long lists of supposedly locally changed
> > > ebuilds preventing syncing. And when I boot weekly into its little
> > > rescue system to backup the main system, the root filesystem remounts
> > > itself read-only while tar is running. Smartd recognises the SSD and
> > > runs daily tests, but reports no errors. No amount of wiping and
> > > reinstalling has helped so far.
> > 
> > What filesystem are you running and how old is the SSD?  That sounds
> > like some of the symptoms EXT4 had on early generation flash media
> > where its assumptions about what order writes would physically make it
> > to the disk in were wrong, leading to corruption.
> 
> The disk is a 64GB SanDisk SDSSDP device, which I bought five years ago to
> replace a failed spinning disk. All partitions are ext4 except /boot, which
> is ext2.
> 
> > So unless it was working correctly at some point in the past, try a
> > different filesystem.  EXT3 or BTRFS didn't have the same problems.
> 
> It was working just fine until recently.
> 
> > If it's just that the SSD is failing, then get a new one before
> > something important gets damaged and you have to redo the whole thing.
> 
> Everything on it is disposable.
> 
> The box is getting a bit long in the tooth: I bought it in November 2010.
> It's a single-core, 32-bit Atom N270 (not N2700). It doesn't owe me
> anything now, in spite of having cost £450 at the time. I don't know
> whether it's worth throwing any more money at it. On the other hand, I see
> Amazon are only asking for £20 for a small SSD.
> 
> The repeatability of some of the errors it throws makes me question whether
> the disk or something else is at fault. (What would cause a file system to
> be remounted read-only in the middle of its work?)

I can think of four things, but more learned M/L contributors may add to these:

1. The SATA connection has come loose.  With time and movement it can come 
(slightly) adrift.  Pushing it back in fully fixes this problem - also see No.
2 below.

2. The physical connector's contacts are beginning to oxidise.  Reseat the 
SATA cable connectors both on the drive and any ribbons on the MoBo.  This 
usually cleans off any oxidisation.

3. The AHCI driver is deploying energy saving measures (aka. Aggressive Link 
Power Management - ALPM).  Check the output of:

 cat /sys/class/scsi_host/host*/link_power_management_policy

If it doesn't say 'max_performance' you'll need to revisit your BIOS settings 
and also PCIEASPM settings in the kernel.

4. Finally, there is a chance the PSU is playing up.
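
For No. 3, here is a rough sketch that walks over all SATA hosts and flags any 
policy other than 'max_performance'.  The function name check_alpm and its 
optional directory argument are made up for illustration; only the sysfs path 
is the one given above.

```shell
#!/bin/sh
# check_alpm is a hypothetical helper name. The optional argument exists
# only so the function can be pointed at a different directory for testing.
check_alpm() {
    sysfs_root="${1:-/sys/class/scsi_host}"
    for f in "$sysfs_root"/host*/link_power_management_policy; do
        # Skip hosts without the attribute (or an unmatched glob).
        [ -r "$f" ] || continue
        policy=$(cat "$f")
        if [ "$policy" = "max_performance" ]; then
            echo "$f: $policy (ok)"
        else
            echo "$f: $policy (power saving active)"
        fi
    done
}

check_alpm
```

As root, writing 'max_performance' into one of those files should change the 
policy until the next reboot, which is a quick way to test whether ALPM is 
involved before touching the BIOS, e.g. (host0 is an assumption):

 echo max_performance > /sys/class/scsi_host/host0/link_power_management_policy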


1 & 2 above are more noticeable on spinning disks, but SSDs can be affected 
too; it is only a matter of degree.  If BIOS, kernel settings and drivers were 
not altered recently, then 1 & 2 merit attention in the first instance.
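
The kernel log usually records what happened just before a filesystem went 
read-only, so it is worth looking there straight after the next remount.  A 
minimal sketch follows; scan_kernel_log is a made-up helper name and the 
patterns are only the messages I would expect to see, not an exhaustive list.

```shell
#!/bin/sh
# scan_kernel_log is a hypothetical helper: it reads kernel log text on
# stdin and keeps lines that commonly accompany a read-only remount
# (ext4 errors, ATA link exceptions/resets, block-layer I/O errors).
scan_kernel_log() {
    grep -E 'EXT4-fs error|Remounting filesystem read-only|ata[0-9]+(\.[0-9]+)?: (exception|hard resetting link|SError)|I/O error'
}

# Typical use, as root, right after the remount has happened:
#   dmesg | scan_kernel_log
```

ATA exceptions or link resets ahead of the ext4 error would point towards the 
cable, connector, link power management or PSU; ext4 errors with no ATA 
messages around them would point more towards the disk or the filesystem 
itself.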


> I have a spare four-core, 64-bit Celeron box (I bought it for a purpose
> that's gone away). I've been wondering what to do with it, so maybe it can
> replace the Atom box. It's powerful enough to compile its own software,
> whereas the Atom needs help. Whichever I use, its job will be as a server
> of DNS, LAN mail, time and git. Maybe print too. Also it will fetch my
> ISP's POP mail and serve it over IMAP to this box.
> 
> > The self-test capability of storage media is almost universally
> > horrible and you generally don't get a failure report until your data
> > has already been lost.  If your SMART output gives you the raw
> > statistics on the device instead of just pass/fail then analyzing that
> > usually gives a better indication of whether something is about to go
> > wrong.
> 
> It seems to report only pass/fail, so that's not much help.
> 
> Decisions, decisions...

Do the short and long smartctl self-tests also come back without errors, 
assuming the disk comes with S.M.A.R.T. capability?
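
In case it helps, something along these lines should show more than smartd's 
pass/fail.  It is only a sketch: smart_checks is a made-up wrapper name and 
/dev/sda is an assumed device name, so adjust to suit.

```shell
#!/bin/sh
# smart_checks is a hypothetical wrapper; /dev/sda is an assumed device name.
smart_checks() {
    dev="${1:-/dev/sda}"
    smartctl -i "$dev"            # identity; also shows whether SMART is supported
    smartctl -t short "$dev"      # queue a short self-test (runs in the background)
    smartctl -l selftest "$dev"   # self-test log: results of tests completed so far
    smartctl -A "$dev"            # vendor attribute table with raw values
}

# Run as root, e.g.:
#   smart_checks /dev/sda
```

Replace 'short' with 'long' for the extended test, and re-run the '-l selftest' 
line once the test has had time to finish.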

-- 
Regards,
Mick

