Package: linux-image-3.16.0-0.bpo.4-amd64
Version: 3.16.7-ckt11-1~bpo70+1
Justification: causes serious data loss
Severity: critical
Subject: linux-image-3.16.0-0.bpo.4-amd64: pm80xx reports failing drive but
mdadm does not mark the drive as faulty
Hello,
We encountered a serious bug involving an Adaptec 6805H controller to which
8 drives were attached.
The drives were configured as RAID6 with LVM.
On October 4th the pm80xx module reported a failing disk, but mdadm did
not mark the disk as faulty, and the data became corrupted.
We are trying to reproduce the bug on non-critical data, but without
success so far. However, in my opinion the data loss was serious enough
to report the bug anyway.
Notes:
(1) we have reassembled the machine for data retrieval, adding an ASUS
controller card (so it has no relation to the reported bug).
(2) this may be related to bug 774583, as the problems appeared after a
checkarray run.
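To check other machines for the same symptom, a minimal sketch could count
the pm80xx failure messages per SAS address and list the members that
/proc/mdstat flags with (F). This is not taken from any existing tool: the
function names and the log path in the usage note are assumptions, and the
message pattern is only the one seen in the kernel log below.

```python
import re
from collections import Counter

# Pattern for the driver message seen in the kernel log of this report.
# \s+ between words tolerates the line wrapping added by the mail client.
IO_FAILURE = re.compile(
    r"SAS\s+Address\s+of\s+IO\s+Failure\s+Drive:([0-9a-f]+)"
)

# In /proc/mdstat a faulty member is listed as e.g. "sdb1[1](F)".
FAULTY_MEMBER = re.compile(r"([\w-]+)\[\d+\]\(F\)")


def count_io_failures(log_text):
    """Return a Counter mapping SAS address -> number of IO failure messages."""
    return Counter(IO_FAILURE.findall(log_text))


def faulty_md_members(mdstat_text):
    """Return the device names that md has marked as faulty."""
    return FAULTY_MEMBER.findall(mdstat_text)


def report(log_path, mdstat_path="/proc/mdstat"):
    """Summarise driver-level failures next to md's own view of the array."""
    with open(log_path, errors="replace") as f:
        failures = count_io_failures(f.read())
    with open(mdstat_path) as f:
        faulty = faulty_md_members(f.read())
    lines = [f"{addr}: {n} IO failure message(s)" for addr, n in failures.items()]
    lines.append("md faulty members: " + (", ".join(faulty) or "none"))
    return "\n".join(lines)
```

For example, print(report("/var/log/kern.log")) (the path is an assumption)
would show failure counts alongside "md faulty members: none" if the driver
logged errors while md still considered every member healthy.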
Thanks,
best regards,
Xavier Quost
-- Package-specific info:
** Version:
Linux version 3.16.0-0.bpo.4-amd64 (debian-kernel@lists.debian.org) (gcc
version 4.6.3 (Debian 4.6.3-14) ) #1 SMP Debian 3.16.7-ckt11-1~bpo70+1
(2015-06-08)
Package: mdadm
Version: 3.2.5-5
** Command line:
BOOT_IMAGE=/vmlinuz-3.16.0-0.bpo.4-amd64
root=UUID=23b97aa0-53d0-42c4-afd0-48f302de1b08 ro quiet
processor.max_cstate=1 idle=poll nox2apic intermap=off
** Tainted: WO (4608)
* Taint on warning.
* Out-of-tree module has been loaded.
** Kernel log:
Oct 4 00:57:01 nassli kernel: [3944417.125573] md: data-check of RAID
array md0
Oct 4 00:57:01 nassli kernel: [3944417.125576] md: minimum _guaranteed_
speed: 5 KB/sec/disk.
Oct 4 00:57:01 nassli kernel: [3944417.125577] md: using maximum
available idle IO bandwidth (but not more than 500 KB/sec) for
data-check.
Oct 4 00:57:01 nassli kernel: [3944417.125582] md: using 128k window,
over a total of 3907016192k.
Oct 4 08:41:31 nassli kernel: [3972277.588119] pm80xx mpi_sata_event
2689:SATA EVENT 0x23
Oct 4 08:41:31 nassli kernel: [3972277.588125] pm80xx
mpi_sata_completion 2373:SAS Address of IO Failure
Drive:5d1106ee7590
Oct 4 08:41:31 nassli kernel: [3972277.588315] pm80xx
mpi_sata_completion 2373:SAS Address of IO Failure
Drive:5d1106ee7590pm80xx mpi_sata_completion 2373:SAS Address of IO
Failure Drive:5d1106ee7590
Oct 4 08:41:31 nassli kernel: [3972277.588634] pm80xx
mpi_sata_completion 2373:SAS Address of IO Failure
Drive:5d1106ee7590pm80xx mpi_sata_completion 2373:SAS Address of IO
Failure Drive:5d1106ee7590
Oct 4 08:41:31 nassli kernel: [3972277.588945] pm80xx
mpi_sata_completion 2373:SAS Address of IO Failure
Drive:5d1106ee7590pm80xx mpi_sata_completion 2373:SAS Address of IO
Failure Drive:5d1106ee7590
Oct 4 08:41:31 nassli kernel: [3972277.589256] pm80xx
mpi_sata_completion 2373:SAS Address of IO Failure
Drive:5d1106ee7590pm80xx mpi_sata_completion 2373:SAS Address of IO
Failure Drive:5d1106ee7590
Oct 4 08:41:31 nassli kernel: [3972277.589563] pm80xx
mpi_sata_completion 2373:SAS Address of IO Failure
Drive:5d1106ee7590pm80xx mpi_sata_completion 2373:SAS Address of IO
Failure Drive:5d1106ee7590
Oct 4 08:41:31 nassli kernel: [3972277.589875] pm80xx
mpi_sata_completion 2373:SAS Address of IO Failure
Drive:5d1106ee7590pm80xx mpi_sata_completion 2373:SAS Address of IO
Failure Drive:5d1106ee7590
Oct 4 08:41:31 nassli kernel: [3972277.590186] pm80xx
mpi_sata_completion 2373:SAS Address of IO Failure
Drive:5d1106ee7590pm80xx mpi_sata_completion 2373:SAS Address of IO
Failure Drive:5d1106ee7590
Oct 4 08:41:57 nassli kernel: [3972277.590494] pm80xx
mpi_sata_completion 2373:SAS Address of IO Failure
Drive:5d1106ee7590pm80xx mpi_sata_completion 2373:SAS Address of IO
Failure Drive:5d1106ee7590
Oct 4 08:41:57 nassli kernel: [3972304.318093] sas: Enter
sas_scsi_recover_host busy: 30 failed: 30
Oct 4 08:41:57 nassli kernel: [3972304.318099] sas: trying to find task
0x8802b80d6440
Oct 4 08:41:57 nassli kernel: [3972304.318100] sas: sas_scsi_find_task:
aborting task 0x8802b80d6440
Oct 4 08:41:57 nassli kernel: [3972304.318254] pm80xx
mpi_sata_completion 2373:SAS Address of IO Failure
Drive:5d1106ee7590
Oct 4 08:41:57 nassli kernel: [3972304.318260] sas: sas_scsi_find_task:
task 0x8802b80d6440 is done
Oct 4 08:41:57 nassli kernel: [3972304.318261] sas:
sas_eh_handle_sas_errors: task 0x8802b80d6440 is done
Oct 4 08:41:57 nassli kernel: [3972304.318262] sas: trying to find task
0x8803d7ba6c00
Oct 4 08:41:57 nassli kernel: [3972304.318263] sas: sas_scsi_find_task:
aborting task 0x8803d7ba6c00
Oct 4 08:41:57 nassli kernel: [3972304.318402] pm80xx
mpi_sata_completion 2373:SAS Address of IO Failure
Drive:5d1106ee7590
Oct 4 08:41:57 nassli kernel: [3972304.318404] sas: sas_scsi_find_task:
task 0x8803d7ba6c00 is done
Oct 4 08:41:57 nassli kernel: [3972304.318405] sas:
sas_eh_handle_sas_errors: task 0x8803d7ba6c00 is done
Oct 4 08:41:57 nassli kernel: