Processed: Re: Bug#1010365: linux: failure to boot on Raspberry Pi Compute Module 4 (black screen)

2022-04-30 Thread Debian Bug Tracking System
Processing control commands:

> tag -1 upstream
Bug #1010365 [src:linux] linux: failure to boot on Raspberry Pi Compute Module 4 (black screen)
Added tag(s) upstream.
> forwarded -1 https://bugzilla.kernel.org/show_bug.cgi?id=215925
Bug #1010365 [src:linux] linux: failure to boot on Raspberry Pi Compute Module 4 (black screen)
Set Bug forwarded-to-address to 'https://bugzilla.kernel.org/show_bug.cgi?id=215925'.

-- 
1010365: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1010365
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#1010365: linux: failure to boot on Raspberry Pi Compute Module 4 (black screen)

2022-04-30 Thread Cyril Brulebois
Control: tag -1 upstream
Control: forwarded -1 https://bugzilla.kernel.org/show_bug.cgi?id=215925

Hi Bjørn,

Bjørn Mork (2022-04-30):
> But that's a merge commit. Not likely the real culprit, unless there's
> a merge bug.
> 
> I looked briefly at what was merged there, and I believe this commit
> stands out as suspicious:
> 
> bjorn@miraculix:/usr/local/src/git/linux$ git show f59f6aaead97
> commit f59f6aaead975f0ec4d8ff2d59c4ffb8cf0127b2
> Author: Arnd Bergmann 
> Date:   Mon Nov 22 23:21:56 2021 +0100
> 
> mmc: bcm2835: stop setting chan_config->slave_id

Yeah, I skipped a bunch of details in my last mail since I've tried
various things (including reverting that one I spotted, plus the few
commits around it since it was part of removing that field altogether)
but didn't get any consistent results.

My methodology was probably fragile since I worked incrementally, and I
suppose I got some wires crossed at some point. Sorry for the confusion.


I've redone this entirely, and here are better (and reproducible, this
time) findings:

 - 830aa6f29f07a4e2f1a947dfa72b3ccddb46dd21 breaks the boot, leading to
   a kernel panic very early in the boot process; I'm seeing the trace
   on the screen, not on the serial console. It involves the modified
   brcm_pcie_driver_init() function, so that's quite consistent.

 - 87c71931633bd15e9cfd51d4a4d9cd685e8cdb55 is the last commit
   exhibiting the kernel panic (further in that branch, before it gets
   merged into mainline).

 - 88db8458086b1dcf20b56682504bdb34d2bca0e2 is the last commit that lets
   the CM4 boot properly.

 - d0a231f01e5b25bacd23e6edc7c979a18a517b2b, which is the merge of the
   last two aforementioned commits, is the first one that results in
   a completely black screen (no kernel panic displayed), and still
   nothing on the serial console. It seems to me that the kernel panic
   escalates into a more serious issue after this merge. I note there
   are conflict resolutions about drivers/pci/controller/pcie-brcmstb.c
   in that commit.
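
To double-check that the merge really joins those two commits, and to look
at the conflict resolution itself, something along these lines should do.
This is only a sketch: merge_parents is a made-up helper name, while
"git show --no-patch --format=%P" is the stock way to print a commit's
parents.

```shell
#!/bin/sh
# Hypothetical helper: print the parents of commit $1, one hash per line.
# A merge commit yields two (or more) lines, a regular non-root commit one.
merge_parents() {
    git show --no-patch --format='%P' "$1" | tr ' ' '\n'
}
```

In a kernel checkout, "merge_parents d0a231f01e5b25bacd23e6edc7c979a18a517b2b"
should list 88db8458086b1dcf20b56682504bdb34d2bca0e2 and
87c71931633bd15e9cfd51d4a4d9cd685e8cdb55 if the description above is right,
and "git show --cc d0a231f01e5b25bacd23e6edc7c979a18a517b2b --
drivers/pci/controller/pcie-brcmstb.c" shows the combined diff for the
conflicting file.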


No luck with latest master. I've filed this upstream (see link above).


Cheers,
-- 
Cyril Brulebois -- Debian Consultant @ DEBAMAX -- https://debamax.com/


signature.asc
Description: PGP signature


Bug#1010365: linux: failure to boot on Raspberry Pi Compute Module 4 (black screen)

2022-04-30 Thread Bjørn Mork
Cyril Brulebois writes:

> Cyril Brulebois (2022-04-29):
>> > I'll try and pinpoint when it broke using the various intermediary
>> > versions:
>> > 
>> >  - 5.17~rc3-1~exp1
>> 
>> The first attempt was sufficient: it breaks as early as that version.
>
> Using the same base image as before, and only updating the kernel: I've
> tested upstream builds, starting from the .config found in the Debian
> 5.16.18-1 package, using oldconfig and accepting everything by default:
>
>  - v5.16 is confirmed a first good;
>  - v5.17-rc1 is confirmed a first bad;
>  - the culprit seems to be 3ceff4ea07410763d5d4cccd60349bf7691e7e61

But that's a merge commit. Not likely the real culprit, unless there's a
merge bug.

I looked briefly at what was merged there, and I believe this commit
stands out as suspicious:

bjorn@miraculix:/usr/local/src/git/linux$ git show f59f6aaead97
commit f59f6aaead975f0ec4d8ff2d59c4ffb8cf0127b2
Author: Arnd Bergmann 
Date:   Mon Nov 22 23:21:56 2021 +0100

mmc: bcm2835: stop setting chan_config->slave_id

The field is not interpreted by the DMA engine driver, as all the data
is passed from devicetree instead. Remove the assignment so the field
can eventually be deleted.

Reviewed-by: Nicolas Saenz Julienne 
Signed-off-by: Arnd Bergmann 
Acked-by: Ulf Hansson 
Acked-by: Mark Brown 
Link: https://lore.kernel.org/r/2021112203.4103644-5-a...@kernel.org
Signed-off-by: Vinod Koul 

diff --git a/drivers/mmc/host/bcm2835.c b/drivers/mmc/host/bcm2835.c
index 8c2361e66277..463b707d9e99 100644
--- a/drivers/mmc/host/bcm2835.c
+++ b/drivers/mmc/host/bcm2835.c
@@ -1293,14 +1293,12 @@ static int bcm2835_add_host(struct bcm2835_host *host)
 
host->dma_cfg_tx.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
host->dma_cfg_tx.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
-   host->dma_cfg_tx.slave_id = 13; /* DREQ channel */
host->dma_cfg_tx.direction = DMA_MEM_TO_DEV;
host->dma_cfg_tx.src_addr = 0;
host->dma_cfg_tx.dst_addr = host->phys_addr + SDDATA;
 
host->dma_cfg_rx.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
host->dma_cfg_rx.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
-   host->dma_cfg_rx.slave_id = 13; /* DREQ channel */
host->dma_cfg_rx.direction = DMA_DEV_TO_MEM;
host->dma_cfg_rx.src_addr = host->phys_addr + SDDATA;
host->dma_cfg_rx.dst_addr = 0;


But I'm basing that only on it being related to the bcm28/27xx SoCs and
a bit unexpected in the sound merge...  I cannot explain why this mmc
host driver change should affect your display.  Could be completely
wrong.  But might be worth testing?



Bjørn



Bug#1010365: linux: failure to boot on Raspberry Pi Compute Module 4 (black screen)

2022-04-29 Thread Cyril Brulebois
Cyril Brulebois (2022-04-29):
> > I'll try and pinpoint when it broke using the various intermediary
> > versions:
> > 
> >  - 5.17~rc3-1~exp1
> 
> The first attempt was sufficient: it breaks as early as that version.

Using the same base image as before, and only updating the kernel: I've
tested upstream builds, starting from the .config found in the Debian
5.16.18-1 package, using oldconfig and accepting everything by default:

 - v5.16 is confirmed a first good;
 - v5.17-rc1 is confirmed a first bad;
 - the culprit seems to be 3ceff4ea07410763d5d4cccd60349bf7691e7e61


Here's the git bisect log:

git bisect start
# good: [df0cc57e057f18e44dac8e6c18aba47ab53202f9] Linux 5.16
git bisect good df0cc57e057f18e44dac8e6c18aba47ab53202f9
# bad: [e783362eb54cd99b2cac8b3a9aeac942e6f6ac07] Linux 5.17-rc1
git bisect bad e783362eb54cd99b2cac8b3a9aeac942e6f6ac07
# good: [fef8dfaea9d6c444b6c2174b3a2b0fca4d226c5e] Merge tag 'regulator-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
git bisect good fef8dfaea9d6c444b6c2174b3a2b0fca4d226c5e
# bad: [3ceff4ea07410763d5d4cccd60349bf7691e7e61] Merge tag 'sound-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect bad 3ceff4ea07410763d5d4cccd60349bf7691e7e61
# good: [57ea81971b7296b42fc77424af44c5915d3d4ae2] Merge tag 'usb-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
git bisect good 57ea81971b7296b42fc77424af44c5915d3d4ae2
# good: [feb7a43de5ef625ad74097d8fd3481d5dbc06a59] Merge tag 'irq-msi-2022-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good feb7a43de5ef625ad74097d8fd3481d5dbc06a59
# good: [10674ca9ea02491fd3f8ffe303861b7a6837994b] ASoC/SoundWire: improve suspend flows and use set_stream() instead of set_tdm_slots() for HDAudio
git bisect good 10674ca9ea02491fd3f8ffe303861b7a6837994b
# good: [c77b1f8a8faeeba43c694d9d09d0b25a4f52cf37] scsi: mpi3mr: Bump driver version to 8.0.0.61.0
git bisect good c77b1f8a8faeeba43c694d9d09d0b25a4f52cf37
# good: [f66229aa355f7e0dc0dc20cbc1f4d45c3176eed2] Merge tag 'asoc-v5.17-2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
git bisect good f66229aa355f7e0dc0dc20cbc1f4d45c3176eed2
# good: [59aa7fcfe2e44afbe9736e5cfa941699021d6957] IB/mthca: Use memset_startat() for clearing mpt_entry
git bisect good 59aa7fcfe2e44afbe9736e5cfa941699021d6957
# good: [18451db82ef7f943c60a7fce685f16172bda5106] RDMA/core: Calculate UDP source port based on flow label or lqpn/rqpn
git bisect good 18451db82ef7f943c60a7fce685f16172bda5106
# good: [1f43e5230aebb17aea35238dc26e297a61095ac0] mailbox: qcom-ipcc: Support more IPCC instance
git bisect good 1f43e5230aebb17aea35238dc26e297a61095ac0
# good: [747c19eb7539b5e6bb15ed57a0a14ebf9f3adb8e] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
git bisect good 747c19eb7539b5e6bb15ed57a0a14ebf9f3adb8e
# good: [e1a7aa25ff45636a6c1930bf2430c8b802e93d9c] Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
git bisect good e1a7aa25ff45636a6c1930bf2430c8b802e93d9c
# good: [19980aa10d2d944ed8fe345ce2eb87c2cb4bedf8] ALSA: hda: intel-dsp-config: add JasperLake support
git bisect good 19980aa10d2d944ed8fe345ce2eb87c2cb4bedf8
# good: [081c73701ef0c2a4f6a127da824a641ae6505fbe] ALSA: hda: intel-dsp-config: reorder the config table
git bisect good 081c73701ef0c2a4f6a127da824a641ae6505fbe
# first bad commit: [3ceff4ea07410763d5d4cccd60349bf7691e7e61] Merge tag 'sound-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
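
For anyone re-checking this, the verdict can be pulled back out of a saved
copy of the log above. A minimal sketch, assuming the log was saved to a
file; the function name first_bad_commit and the file name bisect.log are
made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print the hash from the "# first bad commit:" line
# of a saved "git bisect log" output (file name passed as $1).
first_bad_commit() {
    sed -n 's/^# first bad commit: \[\([0-9a-f]\{40\}\)].*/\1/p' "$1"
}
```

With the log saved as bisect.log, "first_bad_commit bisect.log" prints
3ceff4ea07410763d5d4cccd60349bf7691e7e61. Feeding that hash to
"git show --no-patch --format=%P" in a kernel checkout lists its parents;
more than one parent means the bisection ended on a merge.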


I'll try and find out more in a couple of hours, and get in touch with
upstream.


Cheers,
-- 
Cyril Brulebois -- Debian Consultant @ DEBAMAX -- https://debamax.com/




Processed: Re: Bug#1010365: linux: failure to boot on Raspberry Pi Compute Module 4 (black screen)

2022-04-29 Thread Debian Bug Tracking System
Processing control commands:

> found -1 5.17~rc3-1~exp1
Bug #1010365 [src:linux] linux: failure to boot on Raspberry Pi Compute Module 4 (black screen)
Marked as found in versions linux/5.17~rc3-1~exp1.

-- 
1010365: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1010365
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#1010365: linux: failure to boot on Raspberry Pi Compute Module 4 (black screen)

2022-04-29 Thread Cyril Brulebois
Control: found -1 5.17~rc3-1~exp1

Cyril Brulebois (2022-04-29):
> The usual start-up rainbow is displayed, the screen turns to black and
> nothing happens. My first stop was trying to downgrade the bootloader
> (shipped by the raspi-firmware package) to bullseye's version, but
> that didn't help.
> 
> Then I started from a bullseye image (which boots) and upgraded the
> raspi-firmware package: that still boots.
> 
> Then I deployed 5.16.18-1 (from snapshot.debian.org), that still boots.
> 
> Then I deployed 5.17.3-1, and it broke booting.
> 
> I'll try and pinpoint when it broke using the various intermediary
> versions:
> 
>  - 5.17~rc3-1~exp1

The first attempt was sufficient: it breaks as early as that version.

> and then try to figure out what broke exactly. Contrary to my earlier
> efforts to introduce support for that hardware a few months ago, I
> haven't been following upstream changes recently, so I'll need to catch
> up.

Checking the upstream diff, I see nothing obvious on the DTB side. Trying to
use 5.16.18-1's DTB with the 5.17~rc3-1~exp1 kernel didn't help anyway.

I've also tried latest mainline: v5.18-rc4-192-g38d741cb70b3

built with:

cp ~/config-5.17.0-1-arm64 .config
time PATH=/usr/lib/ccache:$PATH make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- oldconfig   # accept everything
time PATH=/usr/lib/ccache:$PATH make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- bindeb-pkg -j32

and the symptoms are the same: black screen at start-up.

I've also checked the serial console (which is confirmed to work if I
boot 5.16.18-1), and I'm not getting anything there either, with either
5.17~rc3-1~exp1 or my local v5.18-rc4-192-g38d741cb70b3 build.


Cheers,
-- 
Cyril Brulebois -- Debian Consultant @ DEBAMAX -- https://debamax.com/




Bug#1010365: linux: failure to boot on Raspberry Pi Compute Module 4 (black screen)

2022-04-29 Thread Cyril Brulebois
Source: linux
Version: 5.17.3-1
Severity: important
X-Debbugs-Cc: raspi-firmw...@packages.debian.org

Hi,

In the process of testing patches for the Raspberry Pi Compute Modules
(CM3 and CM4), for bullseye[1][2] and bookworm[2], I discovered that
bookworm images don't boot on the CM4.

 1. https://bugs.debian.org/1010317
 2. https://bugs.debian.org/996937

The usual start-up rainbow is displayed, the screen turns to black and
nothing happens. My first stop was trying to downgrade the bootloader
(shipped by the raspi-firmware package) to bullseye's version, but
that didn't help.

Then I started from a bullseye image (which boots) and upgraded the
raspi-firmware package: that still boots.

Then I deployed 5.16.18-1 (from snapshot.debian.org), that still boots.

Then I deployed 5.17.3-1, and it broke booting.

I'll try and pinpoint when it broke using the various intermediary
versions:

 - 5.17~rc3-1~exp1
 - 5.17~rc4-1~exp1
 - 5.17~rc5-1~exp1
 - 5.17~rc6-1~exp1
 - 5.17~rc7-1~exp1
 - 5.17~rc8-1~exp1
 - 5.17.1-1~exp1

and then try to figure out what broke exactly. Contrary to my earlier
efforts to introduce support for that hardware a few months ago, I
haven't been following upstream changes recently, so I'll need to catch
up.


Cheers,
-- 
Cyril Brulebois -- Debian Consultant @ DEBAMAX -- https://debamax.com/