Thanks Chris,
You are right, this is undeniably a support request for a very
particular situation.
I will convert it to a question and pursue the hardware-related (PSU,
cabling, SSD, etc.) investigation.
** Changed in: linux (Ubuntu)
Status: Incomplete => Invalid
** Converted to question:
https://answers.launchpad.net/ubuntu/+source/linux/+question/687856
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1858784
Title:
Read-only filesystem
Status in linux package in Ubuntu:
Invalid
Bug description:
Some of our Ubuntu machines (16.04 on the 4.15 kernel) suddenly
remount their filesystem read-only after approx. 5 minutes of operation.
The error is the following:
{{{
Jan 7 13:26:12 lj000601 kernel [ 311.818652] ata1.00: READ LOG DMA EXT failed, trying PIO
Jan 7 13:26:12 lj000601 kernel [ 311.823232] ata1.00: exception Emask 0x0 SAct 0x10000 SErr 0x0 action 0x0
Jan 7 13:26:12 lj000601 kernel [ 311.823237] ata1.00: irq_stat 0x40000008
Jan 7 13:26:12 lj000601 kernel [ 311.823242] ata1.00: failed command: READ FPDMA QUEUED
Jan 7 13:26:12 lj000601 kernel [ 311.823250] ata1.00: cmd 60/08:80:38:1b:c1/00:00:02:00:00/40 tag 16 ncq dma 4096 in
Jan 7 13:26:12 lj000601 kernel [ 311.823250] res 41/40:00:38:1b:c1/00:00:02:00:00/00 Emask 0x409 (media error) <F>
Jan 7 13:26:12 lj000601 kernel [ 311.823254] ata1.00: status: { DRDY ERR }
Jan 7 13:26:12 lj000601 kernel [ 311.823257] ata1.00: error: { UNC }
Jan 7 13:26:12 lj000601 kernel [ 311.828470] ata1.00: configured for UDMA/133
Jan 7 13:26:12 lj000601 kernel [ 311.829567] sd 0:0:0:0: [sda] tag#16 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jan 7 13:26:12 lj000601 kernel [ 311.829571] sd 0:0:0:0: [sda] tag#16 Sense Key : Medium Error [current]
Jan 7 13:26:12 lj000601 kernel [ 311.829575] sd 0:0:0:0: [sda] tag#16 Add. Sense: Unrecovered read error - auto reallocate failed
Jan 7 13:26:12 lj000601 kernel [ 311.829579] sd 0:0:0:0: [sda] tag#16 CDB: Read(10) 28 00 02 c1 1b 38 00 00 08 00
Jan 7 13:26:12 lj000601 kernel [ 311.829582] print_req_error: I/O error, dev sda, sector 46209848
Jan 7 13:26:12 lj000601 kernel [ 311.829615] EXT4-fs error (device sda1): ext4_find_entry:1454: inode #1444593: comm updatedb.mlocat: reading directory lblock 0
Jan 7 13:26:12 lj000601 kernel [ 311.829617] ata1: EH complete
Jan 7 13:26:12 lj000601 kernel [ 311.830654] Aborting journal on device sda1-8.
Jan 7 13:26:12 lj000601 kernel [ 311.831394] EXT4-fs (sda1): Remounting filesystem read-only
Jan 7 13:26:12 lj000601 kernel [ 311.831407] EXT4-fs error (device sda1): ext4_journal_check_start:61: Detected aborted journal
}}}
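As a cross-check on the trace, the Read(10) CDB can be decoded by hand:
bytes 2-5 hold the big-endian LBA and bytes 7-8 the transfer length in
blocks. The sketch below is only an illustration; the function name
decode_read10 is made up for this note, and the 512-byte logical sector
size is an assumption (it is consistent with the "dma 4096 in" shown in
the log for an 8-block read).

```python
def decode_read10(cdb_hex: str, sector_size: int = 512) -> dict:
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    # READ(10) is a 10-byte command with opcode 0x28.
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # logical block address
    blocks = int.from_bytes(cdb[7:9], "big")   # transfer length in blocks
    # sector_size is an assumed logical sector size, not read from the drive.
    return {"lba": lba, "blocks": blocks, "bytes": blocks * sector_size}


info = decode_read10("28 00 02 c1 1b 38 00 00 08 00")
print(info)
```

The decoded LBA is the same sector that print_req_error reports
(46209848), so the SCSI layer and the block layer agree on which read
failed.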
PS: see further details in kernel.log
The machines have moderate disk access rates; they are retail
point-of-sale systems (graphical interface, internal web server, local
postgres and several USB devices), nothing terribly complex.
The recovery process is laborious, requiring local intervention to run
fsck on the faulty block. The machine then comes back as if nothing had
happened, but only for a while: we are starting to see the issue
resurface.
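For reference, mapping a failing disk sector to a filesystem block
number is plain arithmetic once the partition start is known. This is a
rough sketch only: the helper name sector_to_fs_block is invented here,
and the partition start (2048) and ext4 block size (4096) in the example
call are assumptions, not values taken from the affected machines; the
real ones would have to be read from the partition table and the
filesystem superblock.

```python
def sector_to_fs_block(disk_sector: int,
                       partition_start_sector: int,
                       fs_block_size: int,
                       sector_size: int = 512) -> int:
    """Convert an absolute disk sector to a block number inside the filesystem."""
    if disk_sector < partition_start_sector:
        raise ValueError("sector lies before the start of the partition")
    # Byte offset of the sector relative to the start of the partition.
    offset_bytes = (disk_sector - partition_start_sector) * sector_size
    return offset_bytes // fs_block_size


# Sector from the log above; the partition start and block size are
# assumed values for illustration only.
print(sector_to_fs_block(46209848, partition_start_sector=2048, fs_block_size=4096))
```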
The easy conclusion is a hardware defect, but the problem happens across
a wide range of SSD manufacturers and usage levels, as seen in the
attached smartctl.txt.
Looking forward to any hints on debugging this problem further.
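One way to compare machines would be to pull every failing sector out of
each kernel.log and see whether the same sectors recur. The following is
a minimal sketch under assumptions: the regular expression only matches
the "I/O error, dev ..., sector ..." wording seen in the log above
(other kernel versions may phrase it differently), and the function name
failing_sectors is hypothetical.

```python
import re
from collections import Counter

# Matches the block-layer message format shown in the log excerpt above.
IO_ERROR = re.compile(r"I/O error,\s+dev (\w+), sector (\d+)")


def failing_sectors(log_text: str) -> Counter:
    """Count how often each (device, sector) pair appears in I/O error lines."""
    return Counter((dev, int(sector)) for dev, sector in IO_ERROR.findall(log_text))


sample = (
    "Jan 7 13:26:12 lj000601 kernel [ 311.829582] print_req_error: "
    "I/O error, dev sda, sector 46209848\n"
)
for (dev, sector), count in failing_sectors(sample).items():
    print(dev, sector, count)
```

If the same sector fails after every fsck on a given machine, that would
point at the medium; errors scattered across sectors and drives would
fit better with the PSU or cabling.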
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1858784/+subscriptions