Re: [gentoo-user] Weird harddisk problem: AHCI disks sometimes not found

2021-03-21 Thread Alexander Puchmayr
Hi there,

Thanks for all suggestions and answers so far.

I'm pretty sure it is not a hardware problem, because 
* Exchanging SATA cables does not affect the problem
* Using different SATA slots on the mainboard does not affect the problem
* Using different SATA power connectors does not affect the problem

I continued to experiment with different kernel versions and configs:
* Ubuntu-5.4.0-48-generic works
* sys-kernel/gentoo-sources-5.4.60 [self compiled and configured for a similar 
machine some time ago]: WORKS
* sys-kernel/gentoo-kernel-5.4.97 [default config] FAILS
* sys-kernel/gentoo-kernel-bin-5.4.97 FAILS
* sys-kernel/vanilla-sources-5.4.102 [same config as with 5.4.60] WORKS
* sys-kernel/gentoo-kernel-5.10.20 [default config] FAILS
* sys-kernel/gentoo-sources-5.10.20 [same config as with 5.4.60] WORKS

The common thing seems to be that my self-configured kernels work and the
default dist-kernels fail. I checked the differences in the configs
(/usr/src/linux/.config) related to SATA or AHCI, and one candidate was
CONFIG_SATA_MOBILE_LPM_POLICY, which was set to 3 (medium power with
device-initiated PM) in the dist-kernel's config and 0 (keep settings from
firmware) in my self-compiled kernels.
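
For anyone who wants to repeat the check, here is a minimal sketch that reads
this one option out of a set of kernel config files. The function names are
made up for illustration, and which config files exist depends on the system.

```shell
# Print the value of CONFIG_SATA_MOBILE_LPM_POLICY from one kernel config file.
# Prints "unset" if the option is absent or commented out.
lpm_policy_of() {
    value=$(sed -n 's/^CONFIG_SATA_MOBILE_LPM_POLICY=//p' "$1")
    echo "${value:-unset}"
}

# Print "file: value" for every readable config file given as an argument.
show_lpm_policies() {
    for cfg in "$@"; do
        if [ -r "$cfg" ]; then
            printf '%s: %s\n' "$cfg" "$(lpm_policy_of "$cfg")"
        fi
    done
}
```

Something like `show_lpm_policies /boot/config-* /usr/src/linux/.config` should
then list the value for each installed kernel, assuming the configs live in
those places.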

SOLUTION:
Adding CONFIG_SATA_MOBILE_LPM_POLICY=0 to /etc/kernel/config.d and recompiling 
the gentoo-kernel actually solved the problem. 

I assume the reason is an incompatibility between the link power management
mode (policy 3) and the drives, which makes the link appear to be down.
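
A rough sketch of the override, in case it helps someone else. The fragment
file name sata-lpm.config is a hypothetical choice; as far as I know the
dist-kernel only picks up files in /etc/kernel/config.d that end in .config,
so treat that as an assumption and check the dist-kernel documentation.

```shell
# Write a config fragment that overrides the dist-kernel default.
# $1: fragment directory (on the real system: /etc/kernel/config.d)
# The file name "sata-lpm.config" is a hypothetical choice.
write_lpm_fragment() {
    dir=$1
    mkdir -p "$dir" || return 1
    printf '%s\n' 'CONFIG_SATA_MOBILE_LPM_POLICY=0' > "$dir/sata-lpm.config"
}

# As root, followed by a rebuild of the dist-kernel:
#   write_lpm_fragment /etc/kernel/config.d
#   emerge --ask --oneshot sys-kernel/gentoo-kernel
```

After rebooting into the rebuilt kernel, the policy in effect for each port can
be read from /sys/class/scsi_host/host*/link_power_management_policy.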

Alex


Re: [gentoo-user] Weird harddisk problem: AHCI disks sometimes not found

2021-03-12 Thread Mark Knecht
On Thu, Mar 11, 2021 at 12:39 PM Alexander Puchmayr <
alexander.puchm...@linznet.at> wrote:
>
> Hi there,

> Any ideas?
>

One other point that I'd make on this subject is that even if you had the
same kernel config file there could be differences in the tool chain that
are causing your problem. I suspect researching that sort of cause would use
huge amounts of time and likely never lead to a real understanding.

You can, of course, build your own kernel on Ubuntu using Gentoo source
code & config file and see whether your new Ubuntu kernel shows the problem
or acts like the provided kernel. That might actually produce more forward
progress should you not find a more simple solution. At least that would
presumably produce systems with the same things built in vs modules.

I would probably build vanilla-sources on the gentoo side to see if it's
Gentoo patching.

Good luck,
Mark


Re: [gentoo-user] Weird harddisk problem: AHCI disks sometimes not found

2021-03-11 Thread antlists

On 11/03/2021 19:39, Alexander Puchmayr wrote:

> Only one of the two SSDs is attached at the same time to the system, the
> other one is disconnected. One contains a gentoo installation (just updated
> yesterday), the other one an Ubuntu LTS 20.04. This allows dual-boot by
> switching connection cables.

By switching cables. Is that moving the cables from one drive to the
other? Or by disconnecting one drive from the mobo, and plugging in the
other? Or what?

A pretty recent mobo I've got says that certain ports are incompatible,
so for example if I plug in a video card, certain SATA ports disappear,
or if I use NVMe, something else goes ...

Could it be you have a collision like that, if your two SSDs don't end
up plugged into the exact same SATA port (or whatever it is)?


Cheers,
Wol



Re: [gentoo-user] Weird harddisk problem: AHCI disks sometimes not found

2021-03-11 Thread Grant Taylor

On 3/11/21 12:39 PM, Alexander Puchmayr wrote:

> Hi there,

Hi,

> I have a weird harddisk detection problem which raises the question:
> what does the gentoo-kernel do differently than the ubuntu kernel?

Probably multiple things.  They probably have configurations that are at
least slightly different.  I wouldn't be surprised if there are slightly
different levels of patching too.

My understanding is that gentoo-kernel differs slightly from a vanilla
kernel source.

> Without the Ubuntu observation I'd say it's a hardware problem

I'd still be inclined to question hardware.  But I agree that difference
in behavior based on different software is suspicious.  I wonder if the
Gentoo kernel is tickling a bug in the drive's firmware.

> and the old HDDs are simply beyond their age, but why are they working
> in ubuntu and not in gentoo?

I don't think that older drives would fail in the way that you are
describing.

> And what is it doing with BIOS/Harddisk that even Bios does not find
> it anymore?

That sounds to me like the drive itself is misbehaving and not
responding the way the BIOS expects.

> I need a full powercycle to make bios find it again.

That really sounds like the drive is having a problem.  Or that the
Gentoo kernel is putting the drive into a problematic state.

What happens if you unplug power and data cables from the drive and then
reconnect them?  Does the BIOS then see the drive?

I'm wondering if it's the drive and / or controller that's getting wedged.

> This indicates a gentoo kernel problem, and I have no idea where
> to start looking, and AFAIK there's nothing much to configure on a
> SATA/AHCI drive.

As Mark indicated, you should be able to compare kernel configs.

I don't remember hearing about such a bug.  I wonder if the Gentoo
kernel is trying to do something slightly different and tickling a
subtle bug that is causing the drive and / or controller to lock up.

I'd think that it would be easy to remove power and data cables from the
drive while the computer is powered on to see if that also revives the
drive.

> Any ideas?

Not really.  Just threads to chase.



--
Grant. . . .
unix || die



Re: [gentoo-user] Weird harddisk problem: AHCI disks sometimes not found

2021-03-11 Thread Mark Knecht
On Thu, Mar 11, 2021 at 12:39 PM Alexander Puchmayr <
alexander.puchm...@linznet.at> wrote:
>
> Hi there,
>
> Any ideas?
>

I'm going to assume that you built your Gentoo kernel and have the config
file.

Ubuntu ships the config file along with whatever kernel you are running
which you can obtain with

less /boot/config-$(uname -r)

Ubuntu 'tends' to ship everything as a module and ships nearly every
module vs your Gentoo kernel where you may be building things into
the kernel. You should be able to do a diff on the two config files as
a starting point assuming you are using the same kernel version.

lsmod should give you an idea what modules are loaded for each kernel.
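
A small sketch of that diff, narrowed to the ATA/SATA/AHCI options so the
output stays readable. The function names are made up for illustration, and
the pattern is deliberately loose, so it may also catch a few unrelated
options with "ATA" in their name.

```shell
# Print the ATA/SATA/AHCI-related lines of a kernel config, sorted for diffing.
# Also keeps "# CONFIG_... is not set" lines so disabled options show up.
sata_opts() {
    grep -E '^(# )?CONFIG_[A-Z0-9_]*(ATA|AHCI)' "$1" | sort
}

# Show the option lines that differ between two config files.
# Returns diff's exit status: 0 if identical, 1 if they differ.
diff_sata_opts() {
    a=$(mktemp) && b=$(mktemp) || return 1
    sata_opts "$1" > "$a"
    sata_opts "$2" > "$b"
    status=0
    diff "$a" "$b" || status=$?
    rm -f "$a" "$b"
    return "$status"
}
```

With the Ubuntu config copied over from /boot/config-$(uname -r), a call like
`diff_sata_opts ubuntu.config /usr/src/linux/.config` (file names are only
examples) would list the candidates.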

HTH,
Mark


[gentoo-user] Weird harddisk problem: AHCI disks sometimes not found

2021-03-11 Thread Alexander Puchmayr
Hi there,

I have a weird harddisk detection problem which raises the question: what does
the gentoo-kernel do differently than the ubuntu kernel?

The system in question has 2 identical SSDs (Kingston SV300S3 60GB) and two
identical HDDs (older Maxtor 7V300F0 300GB), all connected to SATA/AHCI ports;
the HDDs are combined into an LVM raid1 volume. The SATA controller is an
onboard SB7x on an Asus M3A78 mainboard in AHCI mode.

Only one of the two SSDs is attached to the system at any time; the other
one is disconnected. One contains a gentoo installation (just updated
yesterday), the other one an Ubuntu LTS 20.04. This allows dual-boot by
switching connection cables.

When I connect the gentoo-SSD and boot it, the BIOS finds all HDDs and the SSD
and starts booting; but gentoo does not recognize at least one of the HDDs
(/dev/sdc missing, dmesg shows link down on the SATA interface). Going back to
the BIOS shows that even the BIOS does not recognize the disk anymore. A full
power cycle (pressing the reset button is not sufficient) is needed to make the
BIOS recognize the disks again.
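
A quick way to pull just the link state reports out of the kernel log, as a
sketch. The function name is made up, and the pattern assumes the usual libata
message wording ("ataN: SATA link up ..." / "ataN: SATA link down ...").

```shell
# Read kernel log lines on stdin and keep only the SATA link state reports.
sata_link_states() {
    grep -E 'ata[0-9]+: SATA link (up|down)'
}

# Typical use on the affected machine (may need root):
#   dmesg | sata_link_states
```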

Doing the same with the Ubuntu disk works absolutely fine: all HDDs are
recognized and the raid is working fine. Not a single time was one of the
disks not recognized.

Without the Ubuntu observation I'd say it's a hardware problem and the old HDDs
are simply beyond their age, but why are they working in ubuntu and not in
gentoo? And what is it doing with BIOS/Harddisk that even Bios does not find it
anymore? I need a full powercycle to make bios find it again. This indicates a
gentoo kernel problem, and I have no idea where to start looking, and AFAIK
there's nothing much to configure on a SATA/AHCI drive.

Any ideas?

Thanks
Alex

PS:
Sys-kernel/gentoo-kernel-5.4.97, default configuration
Hardware: 
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] RS780 Host Bridge
00:01.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] RS780/RS880 PCI to PCI 
bridge (int gfx)
00:06.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] RS780 PCI to PCI bridge 
(PCIE port 2)
00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/
SB9x0 SATA Controller [AHCI mode]
00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/
SB9x0 USB OHCI0 Controller
00:12.1 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0 USB OHCI1 
Controller
00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/
SB9x0 USB EHCI Controller
00:13.0 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/
SB9x0 USB OHCI0 Controller
00:13.1 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0 USB OHCI1 
Controller
00:13.2 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/
SB9x0 USB EHCI Controller
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 SMBus Controller 
(rev 3a)
00:14.1 IDE interface: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/
SB9x0 IDE Controller
00:14.2 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 Azalia 
(Intel HDA)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 
LPC host controller
00:14.4 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] SBx00 PCI to PCI 
Bridge
00:14.5 USB controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/
SB9x0 USB OHCI2 Controller
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] K8 [Athlon64/Opteron] 
HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] K8 [Athlon64/Opteron] 
Address Map
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] K8 [Athlon64/Opteron] 
DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] K8 [Athlon64/Opteron] 
Miscellaneous Control
01:05.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] 
RS780 [Radeon HD 3200]
01:05.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] RS780 HDMI Audio 
[Radeon 3000/3100 / HD 3200/3300]
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 
PCI Express Gigabit Ethernet Controller (rev 02)