Bug#867636: Bug#867634: linux-image-4.9.0-3-amd64: Repeated ESATA softreset failed messages - even after replacing most components.

2017-07-26 Thread Matthew Gillespie

The eSATA cables have been replaced, and the server has been replaced as well.


On 07/20/2017 08:01 PM, Ben Hutchings wrote:
> Control: reopen -1
> Control: tag -1 moreinfo
>
> [Retrying this with the other address given]
>
> On Fri, 2017-07-07 at 21:01 -0400, madsara wrote:
> [...]
> >    * What exactly did you do (or not do) that was effective (or
> >  ineffective)?
> >   The following actions have been taken so far, none of which are
> > successful:
> >   - Upgraded Dell R210 II's BIOS
> >   - Purchased an entirely new Sandisk TowerRAID TR8M+B
> >   - Replaced the eSATA card - moving from SIL chipset to Marvell
> >   - Replaced all hard drives with brand new 3TB drives.
>
> Did you try replacing the eSATA cables yet?
>
> >   - Upgraded OS from Debian Jessie to Debian Stretch
> >   - Ensured BIOS was set for maximum performance and not power saving.
> >   - Added noapic, acpi=off, and apm=off kernel parameter options.
>
> These parameters were once useful with old hardware and kernel
> versions, but now will usually only make things worse.
>
> >   - Performed a full on-board health check (part of the Dell on-board
> > management abilities), finding no issues.
> >   - Performed an extensive memory test, finding no issues.
>
> [...]
>
> Ben.





Bug#867636: Bug#867634: linux-image-4.9.0-3-amd64: Repeated ESATA softreset failed messages - even after replacing most components.

2017-07-20 Thread Ben Hutchings
Control: reopen -1
Control: tag -1 moreinfo

[Retrying this with the other address given]

On Fri, 2017-07-07 at 21:01 -0400, madsara wrote:
[...]
>    * What exactly did you do (or not do) that was effective (or
>  ineffective)?
>   The following actions have been taken so far, none of which are 
> successful:
>   - Upgraded Dell R210 II's BIOS
>   - Purchased an entirely new Sandisk TowerRAID TR8M+B
>   - Replaced the eSATA card - moving from SIL chipset to Marvell
>   - Replaced all hard drives with brand new 3TB drives.

Did you try replacing the eSATA cables yet?

>   - Upgraded OS from Debian Jessie to Debian Stretch
>   - Ensured BIOS was set for maximum performance and not power saving.
>   - Added noapic, acpi=off, and apm=off kernel parameter options.

These parameters were once useful with old hardware and kernel
versions, but now will usually only make things worse.

>   - Performed a full on-board health check (part of the Dell on-board 
> management abilities), finding no issues.
>   - Performed an extensive memory test, finding no issues.

[...]

Ben.

-- 
Ben Hutchings
Larkinson's Law: All laws are basically false.





Bug#867634: linux-image-4.9.0-3-amd64: Repeated ESATA softreset failed messages - even after replacing most components.

2017-07-16 Thread Ben Hutchings
Control: tag -1 moreinfo

On Fri, 2017-07-07 at 21:01 -0400, madsara wrote:
[...]
>    * What exactly did you do (or not do) that was effective (or
>  ineffective)?
>   The following actions have been taken so far, none of which are 
> successful:
>   - Upgraded Dell R210 II's BIOS
>   - Purchased an entirely new Sandisk TowerRAID TR8M+B
>   - Replaced the eSATA card - moving from SIL chipset to Marvell
>   - Replaced all hard drives with brand new 3TB drives.

Did you try replacing the eSATA cables yet?

>   - Upgraded OS from Debian Jessie to Debian Stretch
>   - Ensured BIOS was set for maximum performance and not power saving.
>   - Added noapic, acpi=off, and apm=off kernel parameter options.

These parameters were once useful with old hardware and kernel
versions, but now will usually only make things worse.

>   - Performed a full on-board health check (part of the Dell on-board 
> management abilities), finding no issues.
>   - Performed an extensive memory test, finding no issues.
[...]

Ben.

-- 
Ben Hutchings
If the facts do not conform to your theory, they must be disposed of.





Bug#867634: linux-image-4.9.0-3-amd64: Repeated ESATA softreset failed messages - even after replacing most components.

2017-07-07 Thread madsara
Package: src:linux
Version: 4.9.30-2+deb9u2
Severity: normal

Dear Maintainer,

   * What led up to the situation?

On a hot day, two drives in my eSATA-connected Sandisk TowerRAID TR8M+B
failed. After removing the drives, I am receiving repeated softreset
messages, even with entirely new equipment (except for the server itself).

This system worked without issue for almost a year. A previous
setup (another server with the same NAS) worked for even longer.

The messages received in dmesg (and across the screen) are:

[  258.182103] ata3: link is slow to respond, please be patient (ready=0)
[  282.190796] ata3: softreset failed (device not ready)
[  282.190862] ata3: limiting SATA link speed to 1.5 Gbps
[  282.190864] ata3: hard resetting link
[  282.914095] ata3: SATA link down (SStatus 100 SControl 310)
[  282.914104] ata3: EH complete
[  282.915391] ata3: exception Emask 0x10 SAct 0x0 SErr 0x404 action 0xe frozen
[  282.915460] ata3: irq_stat 0x8040, connection status changed
[  282.915522] ata3: SError: { CommWake DevExch }
[  282.915579] ata3: hard resetting link
[  290.005120] device vlan35 entered promiscuous mode
[  290.005132] device eth0 entered promiscuous mode
[  292.932185] ata3: softreset failed (device not ready)
[  292.932251] ata3: hard resetting link
[  299.197731] ata3: failed to reset engine (errno=-5)
[  303.429384] ata3: softreset failed (1st FIS failed)
[  303.429452] ata3: hard resetting link
[  310.194426] ata3: failed to reset engine (errno=-5)
[  338.896868] ata3: softreset failed (1st FIS failed)
[  338.896943] ata3: limiting SATA link speed to 1.5 Gbps
[  338.896945] ata3: hard resetting link
[  340.119629] ata3: SATA link down (SStatus 100 SControl 310)
[  340.119640] ata3: EH complete
[  340.120968] ata3: exception Emask 0x10 SAct 0x0 SErr 0x404 action 0xe frozen
[  340.121048] ata3: irq_stat 0x8040, connection status changed
[  340.121114] ata3: SError: { CommWake DevExch }
[  340.121179] ata3: hard resetting link
[  350.137287] ata3: softreset failed (device not ready)
[  350.137359] ata3: hard resetting link
[  356.402847] ata3: failed to reset engine (errno=-5)
[  360.626520] ata3: softreset failed (1st FIS failed)
[  360.626587] ata3: hard resetting link
[  372.094712] ata3: link is slow to respond, please be patient (ready=0)
[  395.594524] ata3: softreset failed (device not ready)
[  395.594592] ata3: limiting SATA link speed to 1.5 Gbps
[  395.594594] ata3: hard resetting link
[  396.317785] ata3: SATA link down (SStatus 100 SControl 310)
[  396.317797] ata3: EH complete
[  396.319185] ata3: exception Emask 0x10 SAct 0x0 SErr 0x404 action 0xe frozen
[  396.319263] ata3: irq_stat 0x8040, connection status changed
[  396.319326] ata3: SError: { CommWake DevExch }
[  396.319389] ata3: hard resetting link
[  402.579341] ata3: failed to reset engine (errno=-5)
[  406.818948] ata3: softreset failed (1st FIS failed)
[  406.819017] ata3: hard resetting link
[  416.844648] ata3: softreset failed (device not ready)
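
A log like the one above is easier to read once the repeated messages are tallied and the link registers are decoded. The following is a minimal Python sketch, not part of the original report. The helper names (`tally`, `decode_link_register`) are hypothetical. It assumes the kernel prints SStatus and SControl in hexadecimal and that both registers use the standard SATA field layout (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8).

```python
import re
from collections import Counter

# A few lines copied from the dmesg excerpt in this report.
SAMPLE = """\
[  282.190796] ata3: softreset failed (device not ready)
[  282.190864] ata3: hard resetting link
[  282.914095] ata3: SATA link down (SStatus 100 SControl 310)
[  299.197731] ata3: failed to reset engine (errno=-5)
[  303.429384] ata3: softreset failed (1st FIS failed)
[  303.429452] ata3: hard resetting link
"""

# Matches "[ timestamp] ataN: message".
LINE_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<port>ata\d+): (?P<msg>.*)$")


def tally(log_text):
    """Count how often each (port, message) pair occurs in the log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            counts[(match.group("port"), match.group("msg"))] += 1
    return counts


def decode_link_register(value_hex):
    """Split an SStatus/SControl value (hex string) into DET, SPD and IPM."""
    value = int(value_hex, 16)
    return {
        "det": value & 0xF,         # device detection field
        "spd": (value >> 4) & 0xF,  # negotiated speed or speed limit
        "ipm": (value >> 8) & 0xF,  # interface power management field
    }


if __name__ == "__main__":
    for (port, msg), count in tally(SAMPLE).most_common():
        print(f"{count:3d}  {port}: {msg}")
    print("SStatus ", decode_link_register("100"))
    print("SControl", decode_link_register("310"))
```

Under those assumptions, SStatus 100 decodes to DET=0, meaning the port detects no device at all, and SControl 310 decodes to SPD=1, which matches the "limiting SATA link speed to 1.5 Gbps" lines.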



   * What exactly did you do (or not do) that was effective (or
 ineffective)?
The following actions have been taken so far, none of which are 
successful:
- Upgraded Dell R210 II's BIOS
- Purchased an entirely new Sandisk TowerRAID TR8M+B
- Replaced the eSATA card - moving from SIL chipset to Marvell
- Replaced all hard drives with brand new 3TB drives.
- Upgraded OS from Debian Jessie to Debian Stretch
- Ensured BIOS was set for maximum performance and not power saving.
- Added noapic, acpi=off, and apm=off kernel parameter options.
- Performed a full on-board health check (part of the Dell on-board 
management abilities), finding no issues.
- Performed an extensive memory test, finding no issues.

   * What was the outcome of this action?
None of the above have helped fix or alter this issue.

   * What outcome did you expect instead?
I expected the external eSATA drives to work; instead, the timeouts are
preventing them from functioning.



-- Package-specific info:
** Version:
Linux version 4.9.0-3-amd64 (debian-ker...@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18) ) #1 SMP Debian 4.9.30-2+deb9u2 (2017-06-26)

** Command line:
BOOT_IMAGE=/vmlinuz-4.9.0-3-amd64 root=/dev/mapper/qub4rt-rootfs ro noapic acpi=off apm=off
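
The command line above still carries the three options questioned elsewhere in this thread. A minimal Python sketch for spotting them follows. It is an illustration only: the `LEGACY_OPTIONS` set and the `flag_legacy_options` helper are hypothetical names, and the option list is limited to the three mentioned in this report. On a running system the same string can be read from /proc/cmdline.

```python
# Options named in this thread as likely to make things worse on current
# hardware and kernels (an assumption limited to this report).
LEGACY_OPTIONS = {"noapic", "acpi=off", "apm=off"}

# Kernel command line as shown in the package-specific info of this report.
CMDLINE = (
    "BOOT_IMAGE=/vmlinuz-4.9.0-3-amd64 root=/dev/mapper/qub4rt-rootfs "
    "ro noapic acpi=off apm=off"
)


def flag_legacy_options(cmdline):
    """Return the whitespace-separated tokens that are in LEGACY_OPTIONS."""
    return [token for token in cmdline.split() if token in LEGACY_OPTIONS]


if __name__ == "__main__":
    print(flag_legacy_options(CMDLINE))  # ['noapic', 'acpi=off', 'apm=off']
```

Removing the flagged options and rebooting would be one way to rule them out as a contributing factor before testing the hardware again.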

** Not tainted

** Kernel log:
Unable to read kernel log; any relevant messages should be attached

** Model information
sys_vendor: Dell Inc.
product_name: PowerEdge R210 II
product_version: 
chassis_vendor: Dell Inc.
chassis_version: 
bios_vendor: Dell Inc.
bios_version: 2.2.3
board_vendor: Dell Inc.
board_name: 03X6X0
board_version: A00

** Loaded modules:
nf_conntrack_ipv6
nf_defrag_ipv6
ip6table_filter
ip6_tables
ipt_MASQUERADE
nf_nat_masquerade_ipv4
xt_nat
iptable_nat
nf_nat_ipv4
nf_nat
xt_TCPMSS
nf_conntrack_ipv4
nf_defrag_ipv4
xt_mac
ipt_REJECT
nf_reject_ipv4
xfrm_user
xt_policy
xt_comment