Bug#870185: FATAL: kernel 4.11.0-0.bpo.1-marvell does not boot on QNAP TS-219P II
On Sun, 2017-08-13 at 11:56 +0200, Robert Schlabbach wrote:
> From: "Ian Campbell"
> > There is one other option, which is to ask people to adjust their
> > u-boot boot scripts as Robert has done, however the QNAP systems are
> > often run headless and without access to the serial console (it's a
> > special cable which only a minority of users will have access to) so
> > that really is a last resort.
>
> Note that it is possible to modify the u-boot environment without the
> serial console, using the "fw_setenv" command in a running debian
> system.

Given a correct /etc/fw_env.config, yes. I think the TS-xxx ones are
pretty well known although I don't think we ship any anywhere and we
certainly don't install one automatically.

> So one possibility would indeed be to modify the flash-kernel scripts
> to use fw_printenv, "parse" the environment to detect affected systems
> and, if needed, use fw_setenv to make the necessary changes.
>
> That's not really a "pretty" solution, though, and any bugs in that
> function could easily brick the device. Then again, there have been
> "bricking" changes in the past, so it wouldn't be an entirely new
> risk ;-)

Do you know if this particular brickage is undone by the recovery
procedure (http://www.cyrius.com/debian/kirkwood/qnap/ts-219/recovery/)?
I think one of the mtd devices which is backed up is the boot loader
config (mtd4) so I think the answer must be yes, but I've not tried it.

Ian.

> At least making the change _without_ flashing a new kernel should not
> be harmful as the moved initramfs location appears to be backwards
> compatible (though I've tried only 4.9 in practice).
>
> Best regards,
> -Robert
Bug#870185: FATAL: kernel 4.11.0-0.bpo.1-marvell does not boot on QNAP TS-219P II
From: "Ian Campbell"
> There is one other option, which is to ask people to adjust their
> u-boot boot scripts as Robert has done, however the QNAP systems are
> often run headless and without access to the serial console (it's a
> special cable which only a minority of users will have access to) so
> that really is a last resort.

Note that it is possible to modify the u-boot environment without the
serial console, using the "fw_setenv" command in a running debian
system.

So one possibility would indeed be to modify the flash-kernel scripts
to use fw_printenv, "parse" the environment to detect affected systems
and, if needed, use fw_setenv to make the necessary changes.

That's not really a "pretty" solution, though, and any bugs in that
function could easily brick the device. Then again, there have been
"bricking" changes in the past, so it wouldn't be an entirely new
risk ;-)

At least making the change _without_ flashing a new kernel should not
be harmful as the moved initramfs location appears to be backwards
compatible (though I've tried only 4.9 in practice).

Best regards,
-Robert
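The detection step Robert describes could look roughly like the following. This is only a sketch under assumptions: it takes the text printed by fw_printenv (one name=value pair per line) as input, it assumes the initrd address sits in an `initrd=<addr>,<size>` argument of a variable called bootargs, and the helper names and the addresses in the usage note are made up for illustration. Which variables actually carry the address on a given QNAP unit is not confirmed in this thread, and nothing here runs fw_setenv.

```python
import re
from typing import Dict, List, Optional

# Matches the address part of an "initrd=<addr>,<size>" boot argument.
INITRD_RE = re.compile(r"\binitrd=(0x[0-9a-fA-F]+),")


def parse_env(text: str) -> Dict[str, str]:
    """Parse name=value lines as printed by fw_printenv."""
    env = {}
    for line in text.splitlines():
        # Split on the first "=" only; values such as bootargs contain more.
        name, sep, value = line.partition("=")
        if sep and name:
            env[name] = value
    return env


def initrd_address(bootargs: str) -> Optional[int]:
    """Return the initrd load address found in bootargs, or None."""
    match = INITRD_RE.search(bootargs)
    return int(match.group(1), 16) if match else None


def with_initrd_address(bootargs: str, new_addr: int) -> str:
    """Return bootargs with the initrd address replaced, size untouched."""
    return INITRD_RE.sub("initrd=0x%x," % new_addr, bootargs, count=1)


def fw_setenv_command(env: Dict[str, str], old_addr: int,
                      new_addr: int) -> Optional[List[str]]:
    """Build (but do not run) a fw_setenv call if the old address is in use."""
    bootargs = env.get("bootargs", "")
    if initrd_address(bootargs) != old_addr:
        return None  # unknown or already changed: leave the device alone
    return ["fw_setenv", "bootargs", with_initrd_address(bootargs, new_addr)]
```

For a hypothetical environment containing `bootargs=root=/dev/ram initrd=0x1000000,0x900000`, calling `fw_setenv_command(env, 0x1000000, 0x2000000)` returns the argument list for a single fw_setenv call, and returns None for any environment that does not match the expected old address. Refusing to touch anything unexpected is the point of the design, given the bricking risk mentioned above.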
Bug#870185: FATAL: kernel 4.11.0-0.bpo.1-marvell does not boot on QNAP TS-219P II
On Sat, 2017-08-12 at 21:49 +0100, Ben Hutchings wrote:
> On Mon, 2017-07-31 at 18:10 +0200, Robert Schlabbach wrote:
> > Ok, I figured it out. I noticed that the 4.11 kernel has a more
> > "generous" memory layout than the 4.9 one:
> >
> > kernel 4.9:
> >
> > [0.00] Memory: 504492K/524288K available (3777K kernel code, 371K
> > rwdata, 1128K rodata, 296K init, 247K bss, 19796K reserved, 0K
> > cma-reserved, 0K highmem)
> >
> > kernel 4.11:
> >
> > [0.00] Memory: 502648K/524288K available (4096K kernel code, 398K
> > rwdata, 1132K rodata, 1024K init, 248K bss, 21640K reserved, 0K
> > cma-reserved, 0K highmem)
> >
> > So I suspected that the 4.11 kernel might be overwriting/corrupting
> > the initrd.img provided in memory before it gets to unpack it, and
> > changed the memory location from 0xa0 to 0xc0:
> [...]
> > Voila! It's finally booting!
> >
> > So, was the 4.11 kernel compiled/linked with a wrong alignment
> > padding setting? Or should the bootloader environment be changed to
> > permanently use the higher address for passing initrd.img to the
> > kernel?
>
> Should this be assigned to flash-kernel?

Sadly probably not. There are three relevant load addresses: the one in
the uboot header added to the kernel (added by flash-kernel) and the
two baked into the uboot boot script, one for the kernel and one for
the initrd. In some systems there is a fourth one in the uboot header
on the initrd binary, but the QNAP systems don't appear to use that
one; the initrd in flash is the raw one.

Robert is modifying the boot script load address for the initrd, which
flash-kernel has no control over. flash-kernel only controls the
address in the kernel u-boot header, and IIRC that has to match a build
time constant in the kernel. So while we could perhaps coordinate a
change here, I don't think there would be an appropriate kernel load
address which would help very much, since AIUI the conflicting
addresses which cause overlaps are the boot script ones.
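Whether two of these load regions collide is an interval-overlap question. A minimal sketch follows, with the caveat that the function name is invented here and that the real addresses and sizes would have to come from the boot script and the image sizes on an actual unit; the numbers in the usage note are placeholders, not values from this bug.

```python
def regions_overlap(start_a: int, size_a: int,
                    start_b: int, size_b: int) -> bool:
    """True if [start_a, start_a+size_a) and [start_b, start_b+size_b) intersect."""
    return start_a < start_b + size_b and start_b < start_a + size_a
```

With placeholder values, `regions_overlap(0x100000, 0x300000, 0x200000, 0x100000)` is true because the second region starts inside the first, while moving the second region to 0x400000 makes the check false. A region that grows, as the kernel did between 4.9 and 4.11, can turn a previously false result into a true one without either start address changing.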
The only thing I can think of would be simply reducing the size of the
armel kernel binary. I believe Roger was already looking into that?

I don't think looking into reducing the size of the initrd will help,
since it is loaded second in RAM. I suppose it is worth double checking
that /etc/initramfs-tools/initramfs.conf is using MODULES=dep (instead
of most). I think d-i arranges that automatically on these platforms,
so it is highly probable that Robert is already using it. Robert, can
you confirm?

Relatedly, it does seem here like perhaps the kernel's limit on kernel
size on these platforms needs to be shrunk to take into account the
boot time RAM layout considerations and not just the flash partition
sizes. Roger, what do you think?

There is one other option, which is to ask people to adjust their
u-boot boot scripts as Robert has done, however the QNAP systems are
often run headless and without access to the serial console (it's a
special cable which only a minority of users will have access to) so
that really is a last resort.

There's also the chainloaded u-boot solution, but realistically no one
appears to be working on that (me included).

Ian.
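The MODULES check asked for above can be done by eye, but as a rough sketch it amounts to reading the last MODULES= assignment in the file. The function name is made up for illustration, and the sketch assumes plain shell-style assignments; it deliberately ignores any drop-in files that might override the main configuration.

```python
from typing import Optional


def modules_setting(conf_text: str) -> Optional[str]:
    """Return the effective MODULES value from initramfs.conf text, if set."""
    value = None
    for raw in conf_text.splitlines():
        # Drop comments and surrounding whitespace before matching.
        line = raw.split("#", 1)[0].strip()
        if line.startswith("MODULES="):
            # The file is sourced like a shell script, so the last one wins.
            value = line.split("=", 1)[1].strip().strip("\"'")
    return value
```

Reading /etc/initramfs-tools/initramfs.conf and passing its contents to `modules_setting` should yield `dep` on a system set up as described; `most` or `None` would be worth reporting back to the bug.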
Bug#870185: FATAL: kernel 4.11.0-0.bpo.1-marvell does not boot on QNAP TS-219P II
On Mon, 2017-07-31 at 18:10 +0200, Robert Schlabbach wrote:
> Ok, I figured it out. I noticed that the 4.11 kernel has a more
> "generous" memory layout than the 4.9 one:
>
> kernel 4.9:
>
> [0.00] Memory: 504492K/524288K available (3777K kernel code, 371K
> rwdata, 1128K rodata, 296K init, 247K bss, 19796K reserved, 0K cma-reserved,
> 0K highmem)
>
> kernel 4.11:
>
> [0.00] Memory: 502648K/524288K available (4096K kernel code, 398K
> rwdata, 1132K rodata, 1024K init, 248K bss, 21640K reserved, 0K cma-reserved,
> 0K highmem)
>
> So I suspected that the 4.11 kernel might be overwriting/corrupting
> the initrd.img provided in memory before it gets to unpack it, and
> changed the memory location from 0xa0 to 0xc0:
[...]
> Voila! It's finally booting!
>
> So, was the 4.11 kernel compiled/linked with a wrong alignment
> padding setting? Or should the bootloader environment be changed to
> permanently use the higher address for passing initrd.img to the
> kernel?

Should this be assigned to flash-kernel?

Ben.

-- 
Ben Hutchings
Never put off till tomorrow what you can avoid all together.
Bug#870185: FATAL: kernel 4.11.0-0.bpo.1-marvell does not boot on QNAP TS-219P II
Ok, I figured it out. I noticed that the 4.11 kernel has a more
"generous" memory layout than the 4.9 one:

kernel 4.9:

[0.00] Memory: 504492K/524288K available (3777K kernel code, 371K rwdata, 1128K rodata, 296K init, 247K bss, 19796K reserved, 0K cma-reserved, 0K highmem)

kernel 4.11:

[0.00] Memory: 502648K/524288K available (4096K kernel code, 398K rwdata, 1132K rodata, 1024K init, 248K bss, 21640K reserved, 0K cma-reserved, 0K highmem)

So I suspected that the 4.11 kernel might be overwriting/corrupting
the initrd.img provided in memory before it gets to unpack it, and
changed the memory location from 0xa0 to 0xc0:

Marvell>> tftpboot 0x80 C0A80802.img-4.11-bpo
[...]
Marvell>> cp.l 0xf840 0xc0 0x24
Marvell>> setenv bootargs earlycon console=ttyS0,115200 root=/dev/ram initrd=0xc0,0x90 ramdisk=34816 coherent_pool=1M
Marvell>> bootm 0x80
## Booting image at 0080 ...
   Image Name:   kernel 4.11.0-0.bpo.1-marvell
   Created:      2017-07-30 23:17:11 UTC
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    2076472 Bytes = 2 MB
   Load Address: 8000
   Entry Point:  8000
   Verifying Checksum ... OK
OK

Starting kernel ...

Uncompressing Linux... done, booting the kernel.
[0.00] Booting Linux on physical CPU 0x0
[0.00] Linux version 4.11.0-0.bpo.1-marvell (debian-kernel@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18) ) #1 Debian 4.11.6-1~bpo9+1 (2017-07-09)
[...]
[0.267272] Unpacking initramfs...
[0.597766] Freeing initrd memory: 9216K
[...]
Welcome to Debian GNU/Linux 9 (stretch)!

Voila! It's finally booting!

So, was the 4.11 kernel compiled/linked with a wrong alignment
padding setting? Or should the bootloader environment be changed to
permanently use the higher address for passing initrd.img to the
kernel?
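To put a number on the difference between the two "Memory:" lines, the section sizes can be summed. The sketch below is an illustration only: the function name is invented, and the sum of the reported sections is a rough indicator of the in-RAM image size, not an exact figure, since it ignores any padding between sections.

```python
import re

# Section names as they appear in the kernel's "Memory:" boot message.
SECTIONS = ("kernel code", "rwdata", "rodata", "init", "bss")


def image_footprint_kib(memory_line: str) -> int:
    """Sum the per-section sizes (in KiB) reported in a "Memory:" line."""
    total = 0
    for name in SECTIONS:
        match = re.search(r"(\d+)K " + re.escape(name) + r"\b", memory_line)
        if match is None:
            raise ValueError("section %r not found" % name)
        total += int(match.group(1))
    return total
```

Applied to the two lines quoted above, this gives 5819K for 4.9 and 6898K for 4.11, so the newer kernel's sections add up to 1079K more, most of it from the init section growing from 296K to 1024K.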
Bug#870185: FATAL: kernel 4.11.0-0.bpo.1-marvell does not boot on QNAP TS-219P II
Package: linux-image-4.11.0-0.bpo.1-marvell
Version: 4.11.6-1~bpo9+1: armel

I had Debian Stretch installed on my QNAP TS-219P II and already
upgraded to the backports kernel linux-image-4.9.0-0.bpo.3-marvell
(4.9.30-2+deb9u2~bpo8+1: armel). Upgrading that to the package
specified above yielded a no-boot situation. The full output of the
serial console:

[Marvell U-Boot ASCII-art banner]
** LOADER **
** MARVELL BOARD: DB-88F6282A-BP LE TS-219P2+ ,PHY=1.8v

U-Boot 1.1.4 (Jan 3 2012 - 14:49:37) Marvell version: 3.5.3

U-Boot code: 0060 -> 0067FFF0 BSS: -> 006CD5C0

Soc: MV88F6282 Rev 1
CPU running @ 2000Mhz L2 running @ 500Mhz
SysClock = 500Mhz , TClock = 200Mhz

DRAM (DDR3) CAS Latency = 7 tRP = 7 tRAS = 20 tRCD=7
DRAM CS[0] base 0x size 256MB
DRAM CS[1] base 0x1000 size 256MB
DRAM Total size 512MB 16bit width
Addresses 8M - 0M are saved for the U-Boot usage.
Mem malloc Initialization (8M - 7M): Done
[16384kB@f800] Flash: 16 MB

CPU : Marvell Feroceon (Rev 1)

USB 0: host mode
PEX 0: PCI Express Root Complex Interface
PEX interface detected Link X1
PEX 1: PCI Express Root Complex Interface
PEX interface detected Link X1
Reset IDE:
Marvell Serial ATA Adapter
Integrated Sata device found

Net: egiga0 [PRIME]
Hit any key to stop autoboot: 0
QNAP: Recovery Button pressed: 0
Marvell>> boot
Send Cmd : 0x68 to UART1
## Booting image at 0080 ...
   Image Name:   kernel 4.11.0-0.bpo.1-marvell
   Created:      2017-07-30 18:55:56 UTC
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    2076472 Bytes = 2 MB
   Load Address: 8000
   Entry Point:  8000
   Verifying Checksum ... OK
OK

Starting kernel ...

Uncompressing Linux... done, booting the kernel.

That's it, no further output, no boot. The kernel seems to die early on.
I've observed such early failures on another platform when a "bad" device tree binary was flashed. Is it possible that kernel 4.11 requires a changed dtb, and that was not correctly upgraded?