Bug#1059957: debian-installer: please make the netboot fw.img.gz files reproducible

2024-01-03 Thread James Addison
Source: debian-installer
Version: 20230607+deb12u4
Severity: wishlist
User: reproducible-bui...@lists.alioth.debian.org
Usertags: randomness
X-Debbugs-Cc: reproducible-b...@lists.alioth.debian.org, rclo...@rclobus.nl, 
alpernebiya...@gmail.com

Dear Maintainer / Hi Cyril,

I'm an occasional contributor to the Reproducible Builds[0] project, and
recently noticed that the debian-installer package failed some automated
reproducible build tests[1].


Analysis:

In particular, the checksums (MD5SUMS and SHA256SUMS) for some of the firmware
files provided for netboot and suffixed .img.gz vary between builds.

Reading the diffoscope output (which performs a diff within the decompressed
contents) shows that the .img files tend to have eight bytes of randomized
content shortly after hex address 01b0 in each file.

I'm reasonably confident that the eight-byte groups are FAT serial numbers (aka
volume IDs), which mkfs.msdos (as used in the gen-hd-image script[2][3]) will
pick for itself unless it is configured not to.
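
To see where two builds diverge without a full diffoscope run, a byte-level
comparison could be sketched as below.  The helper name and the file names in
the usage comment are placeholders, not part of the debian-installer build.

```shell
#!/bin/sh
# Hypothetical helper: print the zero-based hex offsets at which two files
# differ.  'cmp -l' reports 1-based decimal offsets, hence the "- 1".
diff_offsets() {
    cmp -l "$1" "$2" | awk '{ printf "0x%04x\n", $1 - 1 }'
}

# Usage (placeholder file names, after decompressing both builds):
#   diff_offsets build1/fw.img build2/fw.img
```

A short run of consecutive offsets in the output would be consistent with a
single identifier field changing between builds.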


Suggestions:

Good news: there's a canonical fixed FAT32 volume-id already in use[4], with
the value 'deb1' (eight bytes hex) that we can reuse.

So, adding '--invariant -i 0xDEB1' or similar to the commandline for the
mkfs.msdos calls should resolve the problem.
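
As a rough sketch of what such a call might look like (this is not the actual
gen-hd-image code; the function name and variable are made up for
illustration, and the volume ID should be whatever efi-image[4] uses):

```shell
#!/bin/sh
# Sketch only, not the real gen-hd-image logic.
# VOLUME_ID is a placeholder; it should match the value used in efi-image.
VOLUME_ID="${VOLUME_ID:-0xDEB1}"

# Hypothetical helper: create a FAT image file with a fixed volume ID.
#   $1: output image path (mkfs.msdos -C refuses to overwrite an existing file)
#   $2: size in 1 KiB blocks
make_fat_image() {
    mkfs.msdos --invariant -i "$VOLUME_ID" -C "$1" "$2"
}
```

With --invariant, two runs with identical arguments should produce
byte-identical images, which can be checked with 'cmp'.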


Existing work:

Please note that Alper (cc'd) has an existing merge request that addresses this
and a few other reproducibility-related items:

  https://salsa.debian.org/installer-team/debian-installer/-/merge_requests/38


Regards,
James

[0] - https://www.reproducible-builds.org

[1] - 
https://tests.reproducible-builds.org/debian/rb-pkg/bookworm/arm64/diffoscope-results/debian-installer.html

[2] - 
https://salsa.debian.org/installer-team/debian-installer/-/blob/20230607+deb12u4/build/config/arm64/netboot.cfg#L27

[3] - 
https://salsa.debian.org/installer-team/debian-installer/-/blob/20230607+deb12u4/build/util/gen-hd-image#L356

[4] - 
https://salsa.debian.org/installer-team/debian-installer/-/blob/20230607+deb12u4/build/util/efi-image#L200



Bug#1035392: installation-reports: Installation Report: Bookworm RC2: Raspberry Pi 400 (netboot)

2023-11-15 Thread James Addison
Followup-For: Bug #1035392
Control: close -1

(closing; I'll likely re-attempt an install on the same hardware with a more
recent release of Debian in future, for comparison purposes)



Bug#952450: user-setup: set SYSTEMD_SULOGIN_FORCE=1 in env for rescue/emergency.service when root account is locked

2023-10-10 Thread James Addison
Package: user-setup
Followup-For: Bug #952450
Control: forwarded -1 
https://salsa.debian.org/installer-team/user-setup/-/merge_requests/6



Bug#849400: debian-installer: LUKS on rootfs and boot

2023-07-29 Thread James Addison
Thanks Jinesh - one question in particular inline below:

On Sat, 29 Jul 2023 at 01:29, Jinesh Choksi  wrote:
>
>
> > Can you provide a series of steps to replicate the failure case reported in
> > this bug?
>
>
> Reproduction Steps
>
> - Boot using debian-12.1.0-amd64-netinst.iso in a VM
>
> - At the Grub boot menu, select Advanced options > Expert Install
>
> - Go through the following install steps using defaults or as desired:
>
>   - Choose language
>   - Configure the keyboard
>   - Detect and mount installation media
>   - Load installer components from installation media
>   - Detect network hardware
>   - Configure the network
>   - Set up users and password
>   - Configure the clocks
>   - Detect disks
>
> - When you reach the "Partition disks" step, choose "Manual" disk 
> partitioning method
>
> - Setup a GPT partition table
>
> - Setup an EFI partition (min 100MB), mounted as /boot/efi
>
> - Setup a DMCRYPT partition using remaining free space. (i.e. use as 
> "Physical volume for encryption")
>
> - Choose to "Configure encrypted volumes"
>
> - Set a password for the encrypted volume (also to speed up the process, set 
> Erase data to: No)
>
> - At this point, switch to TTY2, activate console and type in (the following 
> is needed as it is not possible to select luks version):
>
>   - cryptsetup luksClose sda2_crypt
>
>   - cryptsetup luksFormat --type luks1 /dev/sda2
>
>   - cryptsetup luksOpen /dev/sda2 sda2_crypt

Does replicating this issue require steps where the user switches to a
virtual terminal / other TTY?

I don't see that mentioned in other previous thread comments.  The
debian-installer team shouldn't support workflows that require use of
custom commandline steps.

> - Switch back to TTY1 and select "Go back", and select "Detect Disks" (needed 
> to refresh partman's state)
>
> - Select "Partition Disks" again
>
> - Set the file system for the encrypted volume to "XFS" (i.e. use as XFS 
> journaling file system) and set the mount point to /.
>
> - To reduce reproduction steps, we won't set up a swap partition.
>
> - Finally, select "Finish partitioning and write changes to disk"
>
> - You will see a dialog saying:
>
>   Encryption configuration failure
>
>   You have selected the root file system to be stored on an encrypted 
> partition. This feature requires a separate /boot partition on which the 
> kernel and initrd can be stored.
>
>   You should go back and setup a /boot partition.
>
>  
>
> - It is not possible to get past this dialog.
>
> - Note: If it was possible to get past this dialog, then you can proceed with 
> installation as per normal until you get to the "Install Grub Boot Loader" 
> stage. You will find that this stage errors at the "grub-install (dummy)" 
> step.
>
> - If you look at msgs on TTY4, you will note it says to add the line 
> "GRUB_ENABLE_CRYPTODISK=y" to the /etc/default/grub file. So, switch console 
> on TTY2 and edit /target/etc/default/grub file and add this line.
>
> - Run the "Install Grub Boot Loader" stage again and it will work and rest of 
> the install will progress normally.
>
> - The missing "GRUB_ENABLE_CRYPTODISK=y" line is a separate bug #925134.
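
For what it's worth, the manual edit described in the last steps of the quoted
report could be scripted along these lines.  The function name is
hypothetical, and the default path is the one given in the report.

```shell
#!/bin/sh
# Hypothetical helper: add GRUB_ENABLE_CRYPTODISK=y to a grub defaults file
# unless an identical line is already present.
enable_cryptodisk() {
    cfg="${1:-/target/etc/default/grub}"
    grep -qxF 'GRUB_ENABLE_CRYPTODISK=y' "$cfg" 2>/dev/null ||
        printf '%s\n' 'GRUB_ENABLE_CRYPTODISK=y' >> "$cfg"
}
```

Running it twice leaves a single copy of the line in the file.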



Bug#849400: debian-installer: LUKS on rootfs and boot

2023-07-28 Thread James Addison
Package: debian-installer
Followup-For: Bug #849400
X-Debbugs-Cc: jin...@onelittlehope.com

Hi Jinesh,

Can you provide a series of steps to replicate the failure case reported in
this bug?

I'll try to find time within the next two weeks to confirm the results that you
and others have seen here, and to check what we can do at the points in the
code that you linked to.

Thank you,
James



Bug#952450: user-setup: set SYSTEMD_SULOGIN_FORCE=1 in env for rescue/emergency.service when root account is locked

2023-06-03 Thread James Addison
Followup-For: Bug #952450
X-Debbugs-Cc: 1035...@bugs.debian.org, ty...@mit.edu

As an experiment, I recently updated a functional Debian bookworm system to
boot into the systemd 'rescue.target' by default, to test the single-user /
recovery experience as part of #1035543 bug assessment.

My understanding from the relevant manual[1] is that 'emergency.target' is a
similar, albeit even more basic systemd state that is automatically selected
if early boot preconditions fail and/or when serious errors occur.

The system used for testing has a locked root user account, but is essentially
a single-user environment, as I think is typical for many individually-operated
laptops, smartphones and other consumer computing devices.

There are various considerations to balance here, and because some of those
are context/usage-specific, I agree with Raphaël that a debconf question to
figure out the intended behaviour would make sense.  My understanding of it is
something like: "when your system breaks for some reason, are you ok with the
next person who reboots it -- yourself or anyone else -- being able to access
the contents and potentially attempt recovery?"

Most of my experience with that scenario has been that either I or some other
process has broken my computer, and I'd generally much prefer to be able to get
to a recovery prompt without having to use other more time-consuming methods
like removing the disk or finding other ways to get back into the system; but I
can understand that those kinds of choices vary person-to-person and over time.

[1] - https://manpages.debian.org/bullseye/systemd/systemd.special.7.en.html


Bug#694154: debian-installer: Preseeding isn't possible for partman-crypto (encrypted LVM)

2023-06-01 Thread James Addison
Source: partman-crypto
Followup-For: Bug #694154
Control: fixed -1 partman-crypto/77

It looks like this was resolved[1] in partman-crypto version 77 (Debian bug
#656710 / Ubuntu launchpad issue #546405 for the same) - can we close this
bugreport?

[1] - 
https://salsa.debian.org/installer-team/partman-crypto/-/commit/be0a3afab31ba7a174047289c3aa5df179c6a794



Bug#651280: don't allocate all available disk space in standard LVM partioning scheme

2023-05-31 Thread James Addison
On Wed, 31 May 2023 at 16:38, Cyril Brulebois  wrote:
>
> Control: severity -1 wishlist
>
> James Addison  (2023-05-31):
> > After the changes made to address bug #924301 (mountpoints for ext[n]
> > filesystems that have insufficient free blocks are not automatically
> > checked for faults), I think that this bug could be considered more
> > serious.
>
> How do you figure?

Previously, after installation without enough free blocks, system
administrators would be notified (perhaps repeatedly) about lack of
space encountered by each e2scrub run.

For installations after #924301 the administrator is less likely to be
aware of the problem (the alarm was silenced, but the cause had not
been addressed).

In either case, recoverable filesystem errors could occur on the
installed system -- the difference is that in the former case, the
administrators are more likely to have been aware (and at an earlier
point in time) about the risk.

> > The disk space required for e2scrub[1] snapshots is 256MiB and the
> > default allocation for LVM (encrypted or unencrypted) in the bookworm
> > RC4 installer is 100% (same as originally reported here in Y2011).
>
> That's the default setting. Users who want to use e2scrub can tweak it.

The volume group allocation size can be adjusted during an interactive
install session, yep - the operator is prompted to input a size, and
the default value is the full extent of the block device (my
terminology may be a bit wonky).

(the 256MiB requirement appears to be static, though - it's a fixed size
for exactly one snapshot, I suppose)
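
The arithmetic for leaving room for one snapshot might look like the sketch
below.  The 256MiB figure is the one cited above; the function name and the
idea of passing the volume group size in MiB are illustrative assumptions,
not partman's actual interface.

```shell
#!/bin/sh
# Space reserved for one e2scrub snapshot, in MiB (figure cited in this thread).
SNAPSHOT_MIB=256

# Hypothetical helper: given a volume group size in MiB, print the largest
# allocation (in MiB) that still leaves room for one snapshot.
max_lv_allocation() {
    vg_mib="$1"
    if [ "$vg_mib" -le "$SNAPSHOT_MIB" ]; then
        echo 0
    else
        echo $((vg_mib - SNAPSHOT_MIB))
    fi
}
```

For a 10240MiB volume group this would suggest 9984MiB rather than the full
extent.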



Bug#651280: don't allocate all available disk space in standard LVM partioning scheme

2023-05-31 Thread James Addison
Package: debian-installer
Followup-For: Bug #651280
X-Debbugs-Cc: debian-boot@lists.debian.org, skirpic...@gmail.com
Control: severity -1 serious

After the changes made to address bug #924301 (mountpoints for ext[n]
filesystems that have insufficient free blocks are not automatically checked
for faults), I think that this bug could be considered more serious.

The disk space required for e2scrub[1] snapshots is 256MiB and the default
allocation for LVM (encrypted or unencrypted) in the bookworm RC4 installer
is 100% (same as originally reported here in Y2011).

One of two potentially-relevant source code areas is
https://sources.debian.org/src/partman-auto-lvm/91/lib/auto-lvm.sh/
and the second is
https://sources.debian.org/src/partman-partitioning/147/lib/resize.sh/?hl=144#L135

[1] - https://manpages.debian.org/bullseye/e2fsprogs/e2scrub.8.en.html



Bug#1030519: hw-detect: firmware file path handling is fragile

2023-05-31 Thread James Addison
Source: hw-detect
Followup-For: Bug #1030519
X-Debbugs-Cc: a.dalm2...@googlemail.com

Hi Alexander,

I've been reviewing your patch and would like to suggest extracting the
following changes from it to consider and apply individually:

  1. Supporting firmware filenames that contain spaces.

  2. Removing (or at least reducing) the 5s wait[1] for USB devices to settle.

  3. Refactoring the fwfile 'for' loop[2] to use less-complicated parameter
     expansion (your changes didn't modify this, but did highlight it as
     potentially more complicated than necessary).

Each of these changes would require some description and a small patch -
writing those may require more time, I admit; the reward is that it makes it
easier for the maintainer to accept the changes.
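
To illustrate the first item: a minimal sketch of iterating over firmware
names without splitting on spaces.  This is not the code in
check-missing-firmware.sh, and the assumption that names arrive one per line
is mine; the function name is made up.

```shell
#!/bin/sh
# Sketch: walk a newline-separated list of firmware file names without
# splitting names that contain spaces.  An unquoted 'for f in $list' would
# break "Raspberry Pi" into two words; 'while read' keeps each line whole.
list_fw_files() {
    # $1: newline-separated list of requested firmware files
    printf '%s\n' "$1" | while IFS= read -r fwfile; do
        [ -n "$fwfile" ] || continue
        printf 'requested: [%s]\n' "$fwfile"
    done
}
```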


During review, I considered these as possible other changes:

  * Loading the 'vfat' kernel module before mountpoint search.

  * Consulting the 'maybe-usb-floppy' mountmedia device as an origin.

However, it seems that mountmedia already handles these?

  https://sources.debian.org/src/mountmedia/0.26/mountmedia/?hl=20#L69
  https://sources.debian.org/src/mountmedia/0.26/mountmedia/?hl=20#L20

Thank you!
James

[1] - https://sources.debian.org/src/mountmedia/0.26/mountmedia/?hl=20#L82

[2] - 
https://sources.debian.org/src/hw-detect/1.159/check-missing-firmware.sh/#L210



Re: Bug#1029843: Missing symlinks for RPi 4 (to brcmfmac43455-sdio.raspberrypi,4-model-b.txt)

2023-05-08 Thread James Addison
On Mon, 8 May 2023 at 14:57, Diederik de Haas  wrote:
>
> On Monday, 8 May 2023 14:08:14 CEST James Addison wrote:
> > On Mon, 1 May 2023 11:18:03 +0100, James Addison  
> > wrote:
> > > > Diederik de Haas  (2023-04-30):
> > > > > And that's exactly what happens or will happen. Even though the RPi4
> > > > > filename doesn't contain spaces, there are several in the `brcm`
> > > > > directory that do. I didn't check other directories, but I'd expect
> > > > > that filenames with a space is NOT an anomaly.
> > >
> > > Since more files with that pattern are appearing upstream in
> > > linux-firmware.. yes, slightly reluctantly it does seem that this will
> > > be needed.
> >
> > FWIW: After learning the root cause of the spaces-in-filenames problem for
> > packages derived from linux-firmware.git -- that is, the contents of the
> > 'WHENCE' file in linux-firmware.git -- in fact the RPi4 is the only
> > affected[1] firmware currently.
>
> Triggered by your statement, I did a VERY crude search for spaces
> in "File: " or "Link: " lines in the WHENCE file:
>
> diederik@bagend:~/dev/kernel.org/linux-firmware$ grep -E "^Link: .* .* -> .*" 
> WHENCE
> Link: brcm/brcmfmac43455-sdio.Raspberry\ Pi\ Foundation-Raspberry\ Pi\ 4\ 
> Model\ B.txt -> brcmfmac43455-sdio.raspberrypi,4-model-b.txt
> Link: brcm/brcmfmac43455-sdio.Raspberry\ Pi\ Foundation-Raspberry\ Pi\ 
> Compute\ Module\ 4.txt -> brcmfmac43455-sdio.raspberrypi,4-model-b.txt
> Link: nvidia/gm206/acr/bl.bin  -> ../../gm200/acr/bl.bin
> diederik@bagend:~/dev/kernel.org/linux-firmware$ grep -E "^File: .* .*" WHENCE
> File: "brcm/brcmfmac43241b4-sdio.Intel Corp.-VALLEYVIEW C0 PLATFORM.txt"
> File: "brcm/brcmfmac43340-sdio.ASUSTeK COMPUTER INC.-TF103CE.txt"
> File: "brcm/brcmfmac43430a0-sdio.ONDA-V80 PLUS.txt"
> File: "brcm/brcmfmac43455-sdio.MINIX-NEO Z83-4.txt"
> File: "brcm/brcmfmac4356-pcie.Intel Corporation-CHERRYVIEW D1 PLATFORM.txt"
> File: "brcm/brcmfmac4356-pcie.Xiaomi Inc-Mipad2.txt"

Ah, okie doke.  Thanks for catching those.

> The last "Link: " line can be ignored due to being too crude ...
> but it does appear that it ONLY exists in the `brcm` directory ...
>
> > (that surprised me, but does seem to be the case.  I'm writing to counteract
> > any sense that the proposed patch[2] could affect and fix many firmwares.
> > it won't, at least not today)
> >
> > [2] -
> > https://salsa.debian.org/kernel-team/firmware-nonfree/-/merge_requests/65
>
> https://lore.kernel.org/linux-firmware/20230301-fixes-and-compression-v2-0-e2b71974e...@gmail.com/
> seems related as f.e. 1 patch deals with the inconsistent " " vs "\ ".
>
> While I was inclined based on my findings above to mark it as an anomaly,
> that patch set seems to indicate that the spaces won't be removed in
> the future, just that its use would probably be more consistent.

Ok, thanks again; perhaps it's worthwhile waiting a little longer for
upstream to decide on the preferred line format(s) they'll accept.
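
In the meantime, a slightly less crude version of the search might look like
the sketch below.  It assumes only the two line shapes visible in the quoted
output ('File: name' and 'Link: name -> target'); the function name is
hypothetical.

```shell
#!/bin/sh
# Sketch: list 'File:' and 'Link:' entries in a WHENCE file whose path
# contains a space.  For 'Link:' lines only the part before '->' is checked,
# and trailing blanks are trimmed so that 'name  -> target' is not reported.
whence_spaced_paths() {
    sed -n -E 's/^(File|Link):[[:space:]]*//p' "$1" |
        sed -E 's/[[:space:]]*->.*$//; s/[[:space:]]+$//' |
        grep ' '
}

# Usage: whence_spaced_paths WHENCE
```

This should skip the nvidia 'bl.bin' false positive, since the double space
there sits next to the arrow rather than inside the path.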



Bug#1029843: Missing symlinks for RPi 4 (to brcmfmac43455-sdio.raspberrypi,4-model-b.txt)

2023-05-08 Thread James Addison
Package: firmware-brcm80211
Followup-For: Bug #1029843
X-Debbugs-Cc: k...@debian.org, didi.deb...@cknow.org, 
debian-boot@lists.debian.org, p...@akeo.ie

On Mon, 1 May 2023 11:18:03 +0100, James Addison  wrote:
> > Diederik de Haas  (2023-04-30):
> > > And that's exactly what happens or will happen. Even though the RPi4 
> > > filename
> > > doesn't contain spaces, there are several in the `brcm` directory that do.
> > > I didn't check other directories, but I'd expect that filenames with a 
> > > space is
> > > NOT an anomaly.
> 
> Since more files with that pattern are appearing upstream in
> linux-firmware.. yes, slightly reluctantly it does seem that this will
> be needed.

FWIW: After learning the root cause of the spaces-in-filenames problem for
packages derived from linux-firmware.git -- that is, the contents of the
'WHENCE' file in linux-firmware.git -- in fact the RPi4 is the only affected[1]
firmware currently.

(that surprised me, but does seem to be the case.  I'm writing to counteract
any sense that the proposed patch[2] could affect and fix many firmwares.  it
won't, at least not today)

[1] - 
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/WHENCE#n2708

[2] - https://salsa.debian.org/kernel-team/firmware-nonfree/-/merge_requests/65



Re: Bug#1029843: live-boot: Devices Requiring Firmware: multiple requested files in single line overlapping / special characters

2023-05-08 Thread James Addison
On Fri, 5 May 2023 at 12:52, Pete Batard  wrote:
> On 2023.05.04 14:16, James Addison wrote:
> > Yep, and for those situations, that's a point in favour of the third
> > "System Table Selection" value that I failed to mention:
> > "ACPI+Devicetree".
>
> Indeed, the firmware provides that option as well.
>
> > I'm cautious about recommending it, given an understanding that
> > enabling ACPI could increase the amount of non-free code (ACPI Machine
> > Language, in particular) that may run on the system as a result.
>
> I think with current CPUs (especially on the x86 side with Management
> Engines as well as proprietary blobs), we're alas way past the point of
> being able to prevent non-free code from running.

Somewhat agree, yep - although I think that in this case, there should
be various paths available (u-boot, EDK2, ACPI/DT), and if possible
I'd like to understand what is the approach that provides the most
compatibility and free software support with the fewest moving parts.
To me, the reliability and human-time cost savings from simpler, more
open and straightforward systems outweigh many other factors,
especially over the long term.

That's partly the reason I've been wondering about the power
consumption and operating-system-compatibility questions: as I
understand it, those were key reasons that server hardware vendors had
a preference for ACPI when determining the server standards in the
mid-2010s, and so a few years since then, it could be helpful to
figure out whether those continue to make a difference, and how
Devicetree/FOSS driver implementations compare.

> The Raspberry Pi SoC has also some very non-free code running that
> executes prior to the running of the UEFI firmware and also in parallel
> to the UEFI firmware and OS. There's basically a Management Engine
> running on the GPU, which, among other things, provides a mailbox that
> the UEFI firmware uses to retrieve or set hardware configuration data.

Yep, I've still got quite a lot to learn about that, I think.

> On the other hand in this specific case, though I understand you're
> speaking in a more general manner about ACPI usage, the whole ACPI blob
> generation comes entirely from open source code.

I think that's great, and I've had a good experience with EDK2 +
Devicetree.  My concerns about ACPI are basically that it seems like a
less-readable, larger-surface-area standard with more opaque processes
surrounding it.  But I'm not an expert.

> As far as I'm concerned, the main reason I wouldn't advocate ACPI +
> Device Tree is that it can create user confusion as to what is really
> being used behind the scenes. A bit like when PC end-users enable Legacy
> boot in their UEFI settings and end up installing their OS in Legacy
> mode on a UEFI capable system, then find themselves in a situation where
> they want to use UEFI features but can't.

Getting slightly further off-topic, but for my own education: what is
an example of a UEFI feature that a user might want to use?

(to my mind, most of my preferences are: I want to use a standard,
easily-obtained installer, for each system install to be
straightforward and for the system to boot, for each system's devices
to function correctly, and for those to occur while minimizing the
amount of extraneous runtime code and I/O (roughly in that order of
precedence, and with FOSS practices helping a lot to achieve and
maintain those in the long-term).  perhaps ranty, but it's intended to
explain why I don't love standards that include unusual and/or
unproven quirks, and binary blobs)

> Also, again, since part of our goal has been to promote the emerging
> SBBR standard (because we think it should ultimately help making
> installation of various OSes on par with what is the case for x86 based
> PCs, where you can pretty much just create a universal boot media as
> well as have the OS install and use unified boot loaders, rather than
> force users to concern themselves about the low-level system specifics
> of their system, and play with yet another custom version of u-boot),
> and SBBR made the choice to go with ACPI only rather than
> ACPI+DeviceTree, we decided to propose ACPI+DeviceTree as a means not to
> restrict user choice for people who want to use both, but knowing that
> it's not something we really want to officially support.

Yep, that makes sense and your decision-making logic is sound.

If I were to imagine a two-line chart of device and hardware support
over time, with one line each representing the ACPI and Devicetree
approaches, then based on this standardization and industry backing, I
would expect ACPI to be the upper line for some duration of time
(decade?) until the point at which the drawbacks and disadvantages of
the extraneous functionality and complexity that I sense (but can't
really confirm, yet) outwe

Bug#1035392: Bug#1029843: live-boot: Devices Requiring Firmware: multiple requested files in single line overlapping / special characters

2023-05-04 Thread James Addison
[ replying with some re-ordering ]

On Wed, 3 May 2023 at 21:23, Pete Batard  wrote:

> Obviously, with the idea of not having ARM based devices that are
> constrained to a single OS (be it Windows, Linux, BSD or something else),
> and considering that Windows and Device Tree don't work together,
> you want to go with a mode of operation that isn't Linux specific,
> which, even if ACPI has its drawbacks, pretty much forces the use of
> ACPI over Device Tree. Else, you have Linux going into the exact
> Microsoft strong-arm tactics that it should strive to avoid...

Yep, and for those situations, that's a point in favour of the third
"System Table Selection" value that I failed to mention:
"ACPI+Devicetree".

I'm cautious about recommending it, given an understanding that
enabling ACPI could increase the amount of non-free code (ACPI Machine
Language, in particular) that may run on the system as a result.
Perhaps there could be counterbalancing functionality benefits and/or
energy-usage savings... even then I'd recommend proceeding cautiously.

I'm not sure I'm qualified to say much about Debian's compatibility
with other operating systems on the same machine, other than to
mention that I do think it's highly compatible and that that's
something that maintainers, developers and users care about.

> On 2023.05.03 17:29, James Addison wrote:
> >* Perhaps Devicetree is a better default in EDK2 for ARM systems?
> > (that wouldn't solve the root cause, though)
>
> Please note that the reason why the Raspberry Pi UEFI firmware defaults
> to ACPI is so that this ARM device follows the relatively new ARM SBBR
> standard [3], which we (hopefully) expect more and more ARM64 based
> device to follow.

Slightly off-topic: do you know of cases where ACPI has helped a
vendor to adapt to shifting operating system interfaces or achieve
significant energy-usage savings?  I think that understanding some of
those could help to begin to address gaps that Debian/Linux/other
components have.

(and to mention why I ask.. I've been reading some of the history[1],
definitions[2], rationale[3], and state-of-support[4] around
DeviceTree and ACPI in Linux, including in relation to ARM servers.
it looks like I have a decade-or-so of history to catch up on there)

[1] - 
https://lore.kernel.org/linux-arm-kernel/CAOesGMjKeRb=ffjm0mabdihbeicgm4eqw9d5i_6-rfxtnpb...@mail.gmail.com/

[2] - https://elinux.org/Device_Tree_What_It_Is

[3] - https://www.secretlab.ca/archives/151

[4] - https://www.kernel.org/doc/html/v6.3/arm64/arm-acpi.html



Bug#1029843: brcmfmac: requested firmware filename inconsistent with linux-firmware.git on non-devicetree systems

2023-05-03 Thread James Addison
Package: src:linux
Followup-For: Bug #1029843
X-Debbugs-Cc: p...@akeo.ie, k...@debian.org, didi.deb...@cknow.org, 
debian-boot@lists.debian.org, 1029...@bugs.debian.org, 1035...@bugs.debian.org, 
989...@bugs.debian.org, debian-...@lists.debian.org
Control: retitle -1 brcmfmac: requested firmware filename inconsistent with 
linux-firmware.git on non-devicetree systems

Thanks, Pete.

I added a note[1] on the rpi4-uefi.dev GitHub repository, and from one of your
fellow contributors' responses, it seems that in fact the filename-with-spaces
format _is_ referenced from linux-firmware.git, in a 'WHENCE' file that is
used to create symlinks (I hadn't been aware of that previously).

As a result: I feel that maybe this bugreport is not valid.


I suppose that some of the confusion stemmed from the fact that a single binary
of a kernel module in combination with a single physical hardware device probed
different firmware filenames at runtime depending on the context (ACPI vs
Devicetree, in this case).

(it's code, so yep, I get that it's technically _possible_ for that to happen,
and perhaps it's useful to work around limitations of existing standards, but
it's not clear to me whether that's necessary here)

> Note that, in case you think there may be something that we can improve 
> in the SMBIOS data reported by the UEFI firmware (which is currently 
> generated from the source code at [1], with the full output from a 
> Raspberry Pi 4, from UEFI Shell's smbiosview command at [2]) we can look 
> into updating the UEFI firmware to alter the data we output.

Thank you - I'll take a look at those to learn more.

[1] - https://github.com/pftf/RPi4/issues/76#issuecomment-1533295773



Bug#1029843: brcmfmac: requested firmware filename inconsistent with linux-firmware.git on non-devicetree systems

2023-05-03 Thread James Addison
Control: unmerge 1029843 1030519
Control: reassign 1029843 src:linux
Control: retitle 1029843 brcmfmac: requested firmware filename
inconsistent with linux-firmware.git on non-devicetree systems
Control: affects 1029843 firmware-brcm80211 raspi-firmware

Dear Maintainer,

This bugreport relates to the brcmfmac kernel module and the firmware
filename that it probes for during load.

It looks like this may have been a cause of some problems reported in
bugs #989593, #1029843 (this bug) and #1035392.

The last of those three bugs is an installation-report of mine, and as
far as I can tell the problem is that when the affected system (an
RPi) was configured without devicetree support in its UEFI bootloader,
the kernel module was unable to determine a precise filename and used
some fallback logic to determine one approximately here:

https://sources.debian.org/src/linux/6.1.25-1/drivers/net/wireless/broadcom/brcm80211/brcmfmac/common.c/?hl=487#L487

Please let me know if I can provide any further details to help track this down.

Thank you,
James



Bug#1029843: live-boot: Devices Requiring Firmware: multiple requested files in single line overlapping / special characters

2023-05-03 Thread James Addison
On Wed, 3 May 2023 at 16:49, Cyril Brulebois  wrote:
> James Addison  (2023-05-03):
> > After editing and rebuilding the Device Tree (DTS) files, and
> > deploying those changes to the system, I can confirm that adjusting
> > the 'model' field value in there has no effect on the requested fw
> > filename.
>
> Did those modifications stay in place once you switched to Device Tree
> in your bootloader configuration? Just wondering whether you tested two
> cumulative changes (DTS tweaks + switch from ACPI), or independent ones
> (DTS tweaks, then switch from ACPI but using pristine DTB files).

That was tested cumulatively, yep - applied the DTS tweaks, found no
change, and then updated the UEFI setting from ACPI to Devicetree,
resulting in the expected fw requests.

Since then I've reverted the DTS tweaks, and the correct behaviour continues.

After that I reverted the UEFI setting back to ACPI briefly.  I forget
what I was checking, but the problem reappeared, and now I'm back to
the expected behaviour (no spaces in the fw filenames) with the
Devicetree setting.

> I don't have any Pi 400 and haven't been following what the stock
> configuration is (and sorry I didn't read the whole backstory)… if EDK2
> UEFI comes by default, or is recommended, or fixes/works around bugs,
> and in the end is expected to be relevant and widely used, and if ACPI
> is indeed some kind of default setup, it would be best if we were to
> support that.

I'd defer to people with more familiarity with the ecosystem on these
kinds of questions, although I'm also keen to learn more.

Some thoughts:

  * Maybe there's a bug to report in the upstream Linux brcmfmac
driver here; are there better alternatives than DMI that it could use
to determine fallback filenames?  (again, I'm not sure, but can follow
up on that)
  * Perhaps Devicetree is a better default in EDK2 for ARM systems?
(that wouldn't solve the root cause, though)
  * Alternatively, if EDK2/RPI4-UEFI became Debian-packaged (excluding
the brcm firmware, because that's already in raspi-firmware), _and_
could be installed on the ESP partition by d-i (it's beyond my
experience to say whether that's a good idea...), then perhaps it
could be preconfigured (or d-i could request) Devicetree at
install-time.

If & when attempting this again - it could be a while, this took some
energy - then I would probably experiment with u-boot as a comparison.

> I suppose that would mean either having the relevant files/symlinks in
> the firmware package *and* d-i support for it (hw-detect limitations…);
> or have some on-the-fly conversion in the Linux module so that it ends
> up requesting files that are actually in the firmware package, and that
> d-i can work with, without requiring any changes?

At the moment, I think that fixing this in the brcmfmac driver would
resolve the problem in a bootloader-agnostic way, so that seems worth
exploring.

Symlinks seem like they'd be a reasonable short-term workaround in our
packaging - but likely maintainer discretion on that one, as usual (if
that were me, it would help to know that it would be a temporary
measure while other problems were being resolved).
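
For illustration, such a symlink could be created as sketched below.  The two
file names are the RPi 4 pair from the linux-firmware WHENCE file; the
function name and the directory argument are placeholders, and this is not
taken from any existing packaging.

```shell
#!/bin/sh
# Sketch: create the DMI-style alias as a relative symlink next to the
# devicetree-style file.  Quoting keeps the spaces in the alias intact.
link_fw_alias() {
    fw_dir="$1"
    target='brcmfmac43455-sdio.raspberrypi,4-model-b.txt'
    alias_name='brcmfmac43455-sdio.Raspberry Pi Foundation-Raspberry Pi 4 Model B.txt'
    ln -sf "$target" "$fw_dir/$alias_name"
}

# Usage (placeholder directory): link_fw_alias /lib/firmware/brcm
```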

(sorry for what I think may be an overly-large reply list here - I'd
prefer that to having anyone miss some hopefully-relevant details)



Bug#1035392: Bug#1029843: live-boot: Devices Requiring Firmware: multiple requested files in single line overlapping / special characters

2023-05-03 Thread James Addison
Mystery may be (partially) solved.  Responses inline below.

On Wed, 3 May 2023 at 15:17, Diederik de Haas  wrote:
>
> On Wednesday, 3 May 2023 03:41:05 CEST James Addison wrote:
> > I think that the vendor name is coming from a DMI fallback:
> > ...
> > https://sources.debian.org/src/linux/6.1.25-1/arch/arm/boot/dts/bcm2711-rpi-400.dts/#L7
>
> AFAIK the most important thing is the "compatible" string.
> Next to "DMI" there was also mention of "OF", which stands for Open Firmware
> which is (essentially?) the same as Device Tree
> ...
> I'd double check whether you actually see that line in your own dmesg output,
> before spending time to find some logic in that name.

Agreed.  The 'compatible' string is what seems intended for inclusion
into the fw filename request - by the driver authors,
linux-firmware.git, ourselves, and the RPi operating system.

After editing and rebuilding the Device Tree (DTS) files, and
deploying those changes to the system, I can confirm that adjusting
the 'model' field value in there has no effect on the requested fw
filename.

The system's dmesg includes this line:

  DMI: Raspberry Pi Foundation Raspberry Pi 400/Raspberry Pi 400, BIOS UEFI Firmware v1.34 1 2/16/2022

As Cyril said though.. this can't (shouldn't) be genuine DMI.  So
what's going on?

It seems that the cause may be this:

The default settings within the EDK2 UEFI, under "Device Manager" ->
"Raspberry Pi Configuration" -> "Advanced Configuration" contained a
key labeled "System Table Selection" that was set to "ACPI".  Changing
that value to "Devicetree" and then booting caused the correct,
expected fw filename to be requested:

  firmware: failed to load brcm/brcmfmac43456-sdio.raspberrypi,400.bin (-2)
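
As a rough illustration of the difference between the two requests, here
is a small sketch (not the kernel's code) of how the board-specific part
of the filename appears to be put together.  The function names and the
joining rule are assumptions based only on the strings seen in this
thread: with ACPI selected, the DMI vendor and product strings seem to be
used, and with Devicetree selected, the 'compatible' string is used.

```python
def board_type_from_dmi(sys_vendor: str, product_name: str) -> str:
    """Assumed DMI fallback: vendor and product joined with a hyphen."""
    return f"{sys_vendor}-{product_name}"


def board_type_from_devicetree(compatible: str) -> str:
    """Assumed Devicetree path: the 'compatible' string, used as-is."""
    return compatible


def firmware_path(chip: str, board_type: str, ext: str = "bin") -> str:
    """Build the board-specific firmware path requested under brcm/."""
    return f"brcm/{chip}.{board_type}.{ext}"


# Hypothetical inputs, taken from the dmesg lines quoted in this thread.
acpi_request = firmware_path(
    "brcmfmac43456-sdio",
    board_type_from_dmi("Raspberry Pi Foundation", "Raspberry Pi 400"),
)
dt_request = firmware_path(
    "brcmfmac43456-sdio",
    board_type_from_devicetree("raspberrypi,400"),
)
print(acpi_request)
print(dt_request)
```

Only the second form matches the naming used by linux-firmware.git and
the RPi operating system.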

> Probably my fault, but I don't think it's relevant "what we agree to".
> We need to use what the kernel uses of which I suspect there's a (strong)
> correlation with what's used in the linux-firmware upstream repo.

Agreed here too - we can (and should) only make adjustments that are
acceptable and compatible for adjacent components.  I think we're
aligned with upstream here, it's simply that some other component --
and at the moment, that appears to me to be the EDK2 UEFI -- is
producing an unusual effect.

Cheers,
James



Bug#1029843: live-boot: Devices Requiring Firmware: multiple requested files in single line overlapping / special characters

2023-05-03 Thread James Addison
On Wed, 3 May 2023 at 03:02, Cyril Brulebois  wrote:
> James Addison  (2023-05-03):
> > I think that the vendor name is coming from a DMI fallback:
> > https://sources.debian.org/src/linux/6.1.25-1/drivers/net/wireless/broadcom/brcm80211/brcmfmac/common.c/?hl=487#L487
>
> I smiled. :)
>
> The (still quick) glance I had earlier stopped when I reached “DMI”
> which I wasn't sure was valid for ARM devices. :)

Yep, it's very possible that I'm on the wrong track with that thought
(can you tell I don't know much about device drivers and hardware?
:)).

The vendor-string only appears once[1] in src:linux, and that's in a
YAML file that I don't see referenced within the build.  But I suppose
there are other places (nvram?) where the value could have originated.
My hope is that there's a relatively-hardcoded value somewhere
that we can read from, so that we don't have to worry about any
configuration changes applied locally per-system.

Either way: ignore me for a while, I'll go and use the instructions
you provided later in your message to try to figure out the source of
these values (thank you!).

[1] - 
https://sources.debian.org/src/linux/6.1.25-1/Documentation/devicetree/bindings/vendor-prefixes.yaml/?hl=1057#L1057



Bug#1035349: regression: 'hostname' preseed alias for netcfg/get_hostname takes precedence over DHCP hostname

2023-05-03 Thread James Addison
On Wed, 3 May 2023 at 04:03, Cyril Brulebois  wrote:
> James Addison  (2023-05-01):
> > On Mon, 1 May 2023 at 17:53, Cyril Brulebois  wrote:
> > I do see that guestfs-tools references[1] them, and I suppose other
> > downstream software could do as well.  But within the installer's
> > components, I don't think that they have any special meaning.
>
> Thanks for mentioning guestfs-tools, we have other occurrences of that
> particular setting in various packages:
>   https://codesearch.debian.net/search?q=unassigned-hostname
>
> As usual (not the first time you'll see me write this, not the last one):
> a quick glance suggests those are mostly used inside preseed files, not on
> the command line, for netcfg/get_hostname or netcfg/hostname, though.

Agreed, I think.

Rephrasing it, to check: 'unassigned-hostname' (and similarly,
'unassigned-domain') commonly appears as an input to preseed
configuration (usually as a placeholder or sample value).  A limited
number of tools may have code that compares against that value, but
those should be rare.

> > > I have some pending yet unrelated things to investigate on the preseed
> > > side; I'm not sure I'll want to be testing each and every possible
> > > combination (esp. tweaking the configuration of the DHCP server behind
> > > the virtualization layer), but I should be able to test the water.
> >
> > Totally reasonable, yep.
>
> OK, thanks for the sanity check.
>
> > Currently I think that a relevant patch should:
> >
> >   * Unset the hostname, or set the hostname to '(none)', so that the
> > installer's netcfg ignores[2] and is unaware of an install-time
> > hostname.
>
> Yes.

Thanks - your patch in src:preseed to do that looks good.

> >   * Within env2debconf, attempt to find a hostname specified on the
> > kernel command-line:
> > ...
>
> FWIW: This kind of heavy headscratching is exactly why I was wondering
> whether to fix #1031643 in the first place. :) Spoiler alert though, I
> don't think it's actually that complicated, thanks to Andreas and the
> clever side-stepping of the entire problem.
>
>
> Call me an optimist (famous last words…), but isn't that sufficient?
>  - hostname=foo case:
> + The kernel consumes the parameter, acts on it.
> + We detect it in env2debconf (the hostname isn't “(none)”) and I
>   guess we set the variable as today, but unset the hostname in
>   Linux/UTS.
> + We get all the rest of the logic as previously?
>  - hostname?=foo case:
> + The kernel doesn't consume the parameter, so there's no change
>   from previous situation.
> + Things are as they have always been?

Perfect, I think that's true.

I'd _noticed_ Andi's mention of the 'hostname?=' testing.. hadn't
properly parsed or understood what it meant (rephrasing again: it's
not a recognized kernel parameter, so it's unaffected).

Thanks for the recap, and that reassures me (mostly!) that this is
safe.  I'll continue to try to think of edge cases, but hopefully will
only spend my own time by doing that.

A probably-irrelevant thought: the fixes applied for #1031643 and
#1035349 also seem like they should be backwards-compatible with
pre-6.x kernels.

> I don't think before/after --- changes anything there (see my initial
> report, filed on Salvatore's behalf, and the red herring section there)
> and I clearly don't see why one would prefer a specific one anyway.
>
> A bit of warning: if one opens up a shell right after passing the
> bootloader (e.g. the language screen is still shown, and the network
> screen hasn't been reached yet – let's keep things simple), one won't
> see hostname=foo there; but one will see vga=788 (assuming graphical
> installer). That's probably because this particular env is inherited
> from the kernel, who ate the variable away already. That doesn't mean
> it's not taken into account, as can be checked in /var/lib/preseed/log:
>
> d-i netcfg/get_hostname string foo
>
> IOW: It has been created within the env2debconf process, acted on, but
>  it's not going to show up in other places, like in a random shell.

Understood.

And in particular: yes, the hostname variable that we're creating (and
then reading-back by invoking 'set') is local to the env2debconf
script, and that's good.

> In summary:
>  - It looks to me the first patch did make sure hostname=foo is still
>    seen and acted on in userland, using the traditional logic.
>  - It looks to me tweaking it to unset the hostname if it's set should
>    restore “hostname is only a fallback, not actually taking priority”
>    problem, while retaining the “abracadebconf” part.
>  - It looks to me the kernel change should have zero impact on the
>    hostname?=foo cas

Bug#1029843: live-boot: Devices Requiring Firmware: multiple requested files in single line overlapping / special characters

2023-05-02 Thread James Addison
Control: retitle -1 hw-detect: firmware file path handling is fragile
Control: clone -1 -2
Control: reassign -2 src:linux
Control: retitle -2 brcmfmac: firmware filename inconsistency with
linux-firmware.git

On Wed, 3 May 2023 at 00:34, Cyril Brulebois  wrote:
>
> James Addison  (2023-05-02):
> It looks to me someone should understand the Linux kernel code, and
> possibly where inputs/variables come from (there might be stuff coming
> from the DTB, some bootloader thing, etc.), and why spaces show up in
> there. Then decide whether the “external” source (if any!) or the kernel
> should be adjusted to use more straightforward names.

Thanks - OK.


I think that the vendor name is coming from a DMI fallback:

https://sources.debian.org/src/linux/6.1.25-1/drivers/net/wireless/broadcom/brcm80211/brcmfmac/common.c/?hl=487#L487

Whether the model name is from DMI or from the DTS file's 'model'
field is less clear to me:

https://sources.debian.org/src/linux/6.1.25-1/arch/arm/boot/dts/bcm2711-rpi-400.dts/#L7

I'll try to rebuild the kernel module and test some changes 'soon'
(within the next few days, most likely).


Also, to clarify an error/thinko in my previous message: the style of
filename we agreed to map to, and that both linux-firmware.git and the
RPi operating system distro[1] use, is
"brcmfmac43456-sdio.raspberrypi,400.txt" (not the short-format
"brcmfmac43455-sdio.txt" that I mentioned).  We should include
specificity for vendor and model in the filename, all lowercased, and
without spaces.
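
A minimal sketch of checking a candidate filename against that style
follows.  The helper name is hypothetical, and it only tests the
"lowercased, no spaces" part; it does not verify that vendor and model
are actually present.

```python
def follows_compact_style(filename: str) -> bool:
    """Return True if the name is all lowercase and contains no whitespace."""
    has_whitespace = any(ch.isspace() for ch in filename)
    return filename == filename.lower() and not has_whitespace


print(follows_compact_style("brcmfmac43456-sdio.raspberrypi,400.txt"))
print(follows_compact_style(
    "brcmfmac43456-sdio.Raspberry Pi Foundation-Raspberry Pi 400.txt"
))
```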

The RPi 400 model firmware files are not yet represented in
linux-firmware.git, although they do appear in the RPi operating
system distro.

Thanks,
James

[1] - 
http://archive.raspberrypi.org/debian/dists/bullseye/main/binary-arm64/Packages



Bug#1035392: installation-reports: Installation Report: Bookworm RC2: Raspberry Pi 400 (netboot)

2023-05-02 Thread James Addison
Package: installation-reports
Followup-For: Bug #1035392

> The only customized dnsmasq setting required was:
>
>   pxe-service=0, "Raspberry Pi Boot"

Oops, I lied.  There was one other relevant dnsmasq setting:

  dhcp-boot=bootnetaa64.efi

(telling the device what EFI filename to retrieve and run from the TFTP server)



Bug#1029843: live-boot: Devices Requiring Firmware: multiple requested files in single line overlapping / special characters

2023-05-02 Thread James Addison
On Mon, 1 May 2023 at 20:03, Cyril Brulebois  wrote:
> James Addison  (2023-05-01):
> > Also, the brcmfmac kernel module code mentions[3] that it can load
> > board-specific firmware file paths.  I'm not yet sure whether that's
> > relevant (either now, or in future).
>
> Yeah, both the function you pointed to and the one handling actual
> firmware requests seem to know about some alt_fw semantics, with a
> fallback. But I'm not diving into that rabbit hole. :)

That's a sensible strategy :)

Could either of you (Cyril, Diederik) recommend where I should ask
(and/or clone this bug) to follow up on the firmware filename issue,
given that the filename(s) seem to be generated from the kernel
module?

(as a recap: the brcmfmac module attempts to load a file of the format
"brcmfmac43455-sdio.Raspberry Pi Foundation-Raspberry Pi 4 Model
B.txt" instead of "brcmfmac43455-sdio.txt" -- I saw the same thing
during my install, with string adjustments for brcmfmac43456 and Pi
400)

I think that that's likely to be the cause of the firmware-not-loaded
problems in installation-reports #989593 and #1035392 (that second
report is from me, earlier today).  Even with the 'Contents-firmware'
file-to-package mapping, we won't find the relevant firmware file if
the name is wrong.

> Regarding “plans for the future”, it's worth mentioning #1033921, now
> cloned as #1035356. While the former is about license acceptance for
> some firmware packages specifically (and about to be fixed for bookworm)
> the latter is for longer term, with a proposed patch changing the logic
> around iterating over firmware filenames. I'm not saying it's going to
> solve spaces-in-filenames as it is, I just thought it'd make sense to
> mention it as that touches one relevant part of the hw-detect code.

Thank you; yep, I've followed _most_ of that (and arrived back here
again).  I will admit that most of what I've cognitively loaded from
it is "this script could use refactoring post-bookworm", and have not
processed the complete details.

Regards,
James



Bug#1035392: installation-reports: Installation Report: Bookworm RC2: Raspberry Pi 400 (netboot)

2023-05-02 Thread James Addison
Package: installation-reports
Severity: normal
Tags: d-i

Boot method: network
Image version: [2023-04-28] Bookworm Release Candidate 2 Installer
Date: 2023-05-02

Machine: Raspberry Pi 400
Partitions:

Filesystem     Type     1K-blocks    Used Available Use% Mounted on
udev           devtmpfs   1871816       0   1871816   0% /dev
tmpfs          tmpfs       386544     704    385840   1% /run
/dev/mmcblk1p2 ext4      14689724 4192988   9728736  31% /
tmpfs          tmpfs      1932704       0   1932704   0% /dev/shm
tmpfs          tmpfs         5120       0      5120   0% /run/lock
/dev/mmcblk1p1 vfat        524008  119424    404584  23% /boot/efi
tmpfs          tmpfs       386540      44    386496   1% /run/user/1000


Base System Installation Checklist:
[O] = OK, [E] = Error (please elaborate below), [ ] = didn't try it

Initial boot:           [O]
Detect network card:    [E]
Configure network:      [O]
Detect media:           [E]
Load installer modules: [E]
Clock/timezone setup:   [O]
User/password setup:    [O]
Detect hard drives:     [E]
Partition hard drives:  [O]
Install base system:    [O]
Install tasks:          [O]
Install boot loader:    [E]
Overall install:        [O]

Comments:

Although the list of problems in this report might seem lengthy and
arcane, I enjoyed the installation process and think that with a few
small fixes, the rough edges can be removed.  I am writing this report
from an LXDE environment on the installed system.


This was largely an experiment to determine how feasible it is to bring
up a Raspberry Pi 400 over the network and to install Debian Bookworm
from there using the standard Debian Installer process.

The TFTP server was dnsmasq with a fairly minimal configuration based on
Debian's PXE-boot wiki page[1].  In addition to Debian Bookworm's
netboot.tar.gz[2] file (for RC2 at the time of download), the EDK2 UEFI
firmware[3] v1.34 was extracted to the /srv/tftp directory.

The only customized dnsmasq setting required was:

  pxe-service=0, "Raspberry Pi Boot"


Additional firmware for use during the install session was provided by
unpacking a firmware.tar.gz[4] file onto a FAT32-formatted USB drive.


Problems:

  * Firmware for the brcmfmac kernel module was not found on the USB
    drive (but is present).  This may be related to #1029843

    * Workaround: extracted the .deb contents on another system, placed
      them onto the USB drive, and then used one of the available virtual
      consoles (ctrl-alt-F1 or ctrl-alt-F2) on the install host to mount
      the USB drive and copy the firmware files to /lib/firmware/brcm
      before rmmod'ing and modprobe'ing the kernel module.

  * [minor] The hw-detect/load_firmware dialog box included an
    extraneous newline within the displayed filename(s) for which
    loading failed.

  * The microSD card intended as the installation disk did not appear
    under /dev/mmc* when the install began.

    * Workaround: rmmod'd and modprobe'd the sdhci* kernel modules;
      after doing that, the disk was detected and available under /dev

  * After completing the installation and rebooting, the first boot from
    the install disk failed.  The Raspberry Pi's diagnostics console
    showed a 'Firmware not found' message.

    * Fix: this seemed to be due to a lack of Pi-compatible firmware on
      the ESP (EFI System Partition) of the install disk.  To resolve
      the problem, the same EDK2 UEFI firmware used on the dnsmasq
      netboot server was unpacked into the ESP partition from another
      system (by removing the SD card from the Pi and placing it into
      the other machine).

  * After successfully reaching the EDK2 UEFI boot manager, the system
    appeared to pause without reaching the expected next-stage GRUB
    bootloader.

    * Fix: this appears to be due to the default unpacked EDK2 UEFI
      boot manager being unaware of the GRUB install on the same ESP
      partition.  That's understandable, because GRUB was installed
      before the EDK2 UEFI.

      The problem was solved by using the built-in boot menu management
      in the EDK2 UEFI to add an entry to boot into Debian.  In
      particular, this involved creating a file-boot entry that runs
      'shimaa64.efi'.

That concludes the installer-related issues; with those problems
worked-around / resolved, the system booted correctly.

There was one more problem that may not be installer-related:

  * The 'raspi-firmware' package failed to configure correctly during
    'apt install', with an exit code 1 and asking whether the
    /boot/firmware path had been mounted.


[1] -
https://wiki.debian.org/PXEBootInstall?action=show=DebianInstaller%2FNetbootPXE#Another_Way_-_use_Dnsmasq

[2] -
https://deb.debian.org/debian/dists/testing/main/installer-arm64/current/images/netboot/netboot.tar.gz

[3] - https://github.com/pftf/RPi4/releases/tag/v1.34

[4] -
https://cdimage.debian.org/cdimage/firmware/bookworm/20230424/firmware.tar.gz


-- Package-specific 

Bug#1035349: regression: 'hostname' preseed alias for netcfg/get_hostname takes precedence over DHCP hostname

2023-05-01 Thread James Addison
On Mon, 1 May 2023 at 17:53, Cyril Brulebois  wrote:
>
> James Addison  (2023-05-01):
> > I understand that line of thinking, but we note that we have already
> > received feedback on Salsa[1] from a user whose Bookworm installation
> > workflow has been affected, and confirmed that the reported problem
> > exists.
>
> And that user mentioned hostname=unassigned-hostname which would be
> addressed if we were to implement what I mentioned?

Yep, fair point!

> Initially it looked like specific values were expected to lead to a
> particular behaviour, but if we've been encouraging people to expect
> *any* fallback values specified there, that's indeed another story.
>
> (I had mentioned before “unassigned-hostname” wasn't to be seen in any
> packages but “unassigned-domain”/“unnassigned-domain” definitely have
> some specific handling.)

I do see that guestfs-tools references[1] them, and I suppose other
downstream software could do as well.  But within the installer's
components, I don't think that they have any special meaning.

> I have some pending yet unrelated things to investigate on the preseed
> side; I'm not sure I'll want to be testing each and every possible
> combination (esp. tweaking the configuration of the DHCP server behind
> the virtualization layer), but I should be able to test the water.

Totally reasonable, yep.  I'll try to get familiar with the process of
rebuilding the installer's initrd.

Currently I think that a relevant patch should:

  * Unset the hostname, or set the hostname to '(none)', so that the
    installer's netcfg ignores[2] and is unaware of an install-time
    hostname.
  * Within env2debconf, attempt to find a hostname specified on the
    kernel command-line:
    * The parameter may appear as a 'hostname=value' or
      'hostname?=value' key=value pair.
    * The parameter must appear strictly before any '---' delimiter
      in the line.
  * If a hostname was found:
    * Create a local 'hostname' variable within the env2debconf
      script containing the hostname's value.
    * Ensure that the 'seen' flag is assigned appropriately:
      * The value should be empty if the hostname was matched using
        'hostname=value'.
      * The value should be non-empty if the hostname was matched using
        'hostname?=value'.
  * If no hostname was found:
    * Do nothing.

As I wrote up those criteria, they expanded and became more
complicated than I initially realized, so perhaps there could be
further hidden complexity here.  I'll do my best to prepare and test a
patch anyway.
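
To make the criteria above a little more concrete, here is a rough
sketch of the command-line scan.  It is only an illustration under
stated assumptions, not a proposed patch: env2debconf is not written in
Python, the names HostnameParam and find_hostname are hypothetical, and
the simple whitespace split ignores any quoting in the command line.

```python
from typing import NamedTuple, Optional


class HostnameParam(NamedTuple):
    value: str
    question_form: bool  # True when matched as 'hostname?=value'


def find_hostname(cmdline: str) -> Optional[HostnameParam]:
    """Look for hostname=... or hostname?=... before any '---' delimiter."""
    for token in cmdline.split():
        if token == "---":
            break  # parameters after the delimiter are not considered
        if token.startswith("hostname?="):
            return HostnameParam(token[len("hostname?="):], True)
        if token.startswith("hostname="):
            return HostnameParam(token[len("hostname="):], False)
    return None  # no hostname found: do nothing


print(find_hostname("vga=788 hostname=foo --- quiet"))
print(find_hostname("vga=788 --- hostname=foo"))
```

The question_form field stands in for the 'seen' flag decision in the
list above: it would be left empty for 'hostname=value' and made
non-empty for 'hostname?=value'.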

[1] - 
https://sources.debian.org/src/guestfs-tools/1.48.3-4/customize/hostname.ml/?hl=125#L129

[2] - https://sources.debian.org/src/netcfg/1.185/dhcp.c/?hl=578#L580



Bug#1031643: preseeding hostname=foo via the kernel command line seems to be ignored

2023-05-01 Thread James Addison
Source: preseed
Followup-For: Bug #1031643

As requested, the hostname-param-ignores-DHCP regression bug has been filed
separately: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1035349



Bug#1035349: regression: 'hostname' preseed alias for netcfg/get_hostname takes precedence over DHCP hostname

2023-05-01 Thread James Addison
On Mon, 1 May 2023 at 16:00, Cyril Brulebois  wrote:
> James Addison  (2023-05-01):
> > Conditions:
> >
> >   * Preseed alias 'hostname' configured on the kernel command-line
> >   * There is a DHCP server on the installation-target's network that will 
> > provide a hostname
> >
> > Expected behaviour:
> >
> >   * Installer does not ask for installation-target hostname
> >   * Installation-target hostname is received and configured using DHCP
> >
> > Actual behaviour:
> >
> >   * [good] Installer does not ask for hostname
> >   * [bad] Hostname is configured from the command-line, ignoring the 
> > DHCP-negotiated hostname.
> >   * This is similar to 'netcfg/hostname' -- a different setting from 
> > 'netcfg/get_hostname'.
>
> Given the proximity of the tentative Bookworm release, my gut feeling
> would be to special-case the hostname=unassigned-hostname setting that's
> been documented since at least 2004, and limit “unsetting hostname” to
> this particular case.
>
> This should:
>  1. be good enough for anyone having followed the example preseed from
> any point in the past;
>  2. and equally importantly: limit possible side-effects.
>
> If that's not the case for (1), we should get bug reports with details
> about what people have actually been doing, and whether it makes sense
> to unset it unconditionally. If that's the case, we can let this thing
> mature in unstable/testing post-Bookworm, and once we're absolutely
> certain this isn't going to regress in some other weird way, we can
> consider backporting this to Bookworm, via a point release.

I understand that line of thinking, but we note that we have already
received feedback on Salsa[1] from a user whose Bookworm installation
workflow has been affected, and confirmed that the reported problem
exists.

One datapoint isn't huge, but it's non-zero - and I'd expect that any
installation using the 'hostname' preseed alias in combination with
DHCP-as-hostname-provider would be similarly affected.

The bug here is essentially that the 'hostname' alias used to provide
a fallback value, and in d-i RC 2 it is used as the source of the primary
value (ignoring DHCP).  If we know that that change has taken place, I
think that we should either document it, or attempt to restore the
previous behaviour.


The possibility of introducing other regressions with any further
changes is a valid point.. I'm not sure how best to address that,
other than testing the results in various configurations.

It feels to me like 'installer begins running without its own
hostname' was likely a reasonable baseline assumption before Linux 6.0
began reading the same-named 'hostname' parameter, and so as a result
it feels like unsetting the hostname early in the installer
initialization would be safe (maybe even a good idea, to reduce a
source of input variation between install sessions).

[1] - 
https://salsa.debian.org/installer-team/installation-guide/-/merge_requests/25



Bug#1035349: regression: 'hostname' preseed alias for netcfg/get_hostname takes precedence over DHCP hostname

2023-05-01 Thread James Addison
Source: preseed
Version: 1.115
Severity: important
Tags: d-i

Dear Maintainer,

This bug report is a subset of, and related to, bug #1031643, also in preseed.

When the 'hostname' preseed alias for 'netcfg/get_hostname' is provided to
Bookworm's RC 2 installer as a kernel command-line argument, the value that
it contains unexpectedly takes precedence over a hostname received from
DHCP, contrary to the Installation Guide documentation[1] in combination with
the corresponding netcfg documentation[2].


Conditions:

  * Preseed alias 'hostname' configured on the kernel command-line
  * There is a DHCP server on the installation-target's network that will
    provide a hostname

Expected behaviour:

  * Installer does not ask for installation-target hostname
  * Installation-target hostname is received and configured using DHCP

Actual behaviour:

  * [good] Installer does not ask for hostname
  * [bad] Hostname is configured from the command-line, ignoring the
    DHCP-negotiated hostname.
  * This is similar to 'netcfg/hostname' -- a different setting from
    'netcfg/get_hostname'.



Context:

Since Linux 6.0, a 'hostname=...' parameter provided in the kernel command-line
is no longer loaded into the init process environment as a variable, but is
instead used to set the hostname of the running system (skipping the
need for userspace tooling to achieve that).

That caused a conflict for the preseed aliases in the Debian Installer, because
one of the aliases is also 'hostname', mapped to 'netcfg/get_hostname'.

The fix applied in #1031643 loads the 'running system hostname' into the
environment if it is non-empty and not equal to '(none)'.  This allows
unattended installs to work again.

The 'netcfg' component, which determines the hostname for the installed
system (prompting the operator for it if required), will prefer a non-empty
running-system hostname (as long as it is not the literal string '(none)')
over one provided by DHCP, in this block of code:
https://sources.debian.org/src/netcfg/1.185/dhcp.c/#L578
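
As a simplified model of the two behaviours described above (not a
translation of dhcp.c or of the preseed fix; the function names and the
'fallback' argument are hypothetical), the precedence can be sketched
like this:

```python
from typing import Optional

UNSET_HOSTNAME = "(none)"


def hostname_for_environment(running_hostname: str) -> Optional[str]:
    """Model of the #1031643 fix: expose the running hostname unless unset."""
    if running_hostname and running_hostname != UNSET_HOSTNAME:
        return running_hostname
    return None


def netcfg_choice(running_hostname: str, dhcp_hostname: str, fallback: str) -> str:
    """Simplified precedence: running hostname, then DHCP, then the fallback."""
    if running_hostname and running_hostname != UNSET_HOSTNAME:
        return running_hostname
    return dhcp_hostname or fallback


# With Linux >= 6.0, 'hostname=foo' sets the running hostname, so DHCP loses.
print(netcfg_choice("foo", "dhcp-name", "foo"))
# With the running hostname left unset, the DHCP-provided name is used.
print(netcfg_choice(UNSET_HOSTNAME, "dhcp-name", "foo"))
```

The first call reproduces the reported regression; the second shows the
expected behaviour, where the preseeded value is only a fallback.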

Thanks,
James

[1] - 
https://www.debian.org/releases/stable/amd64/apbs02.en.html#preseed-aliases

[2] - 
https://sources.debian.org/src/netcfg/1.185/debian/netcfg-common.templates/?hl=145#L160



Bug#1031643: preseeding hostname=foo via the kernel command line seems to be ignored

2023-05-01 Thread James Addison
Source: preseed
Followup-For: Bug #1031643
X-Debbugs-Cc: car...@debian.org, a...@debian.org, freyerm...@physik.uni-bonn.de

Hi folks,

This is nitpicky, but I think there is an important-ish further detail to
report.

The fix applied does repopulate the 'hostname' variable so that env2debconf can
read from it and place it into the 'netcfg/get_hostname'[2] preseed variable; so
far, so good.


However: the hostname in the running Debian installer masks the intended
behaviour of 'netcfg/get_hostname', because netcfg's DHCP logic prefers[1]
to read the running system's hostname, when it is non-empty.

I've confirmed this behaviour by netbooting from the Bookworm RC 2 installer;
DHCP configuration of a hostname is ignored when the command-line hostname is
present.

(note: a similar setting, 'netcfg/hostname' is available that takes precedence
over DHCP[2] hostname values, but it's a separate setting, and is not our
documented[3] behaviour of the 'hostname' preseed alias)


Suggestions:

This was found following some related documentation discussion[4] on Salsa.  In
that discussion, Martin Samuelsson suggests a fix that I think should work:

We should (un)set the d-i system's hostname to the 'empty' hostname early in
the installer session.

That could happen in env2debconf -- or it could be placed even earlier in the
installer scripts (since it's only relatively recently that Linux 6.0 began
reading a hostname, we should be confident that d-i works OK without one
configured).

I'm doing some testing to confirm the fix currently.

Thanks,
James

[1] - https://sources.debian.org/src/netcfg/1.185/dhcp.c/#L578

[2] - 
https://sources.debian.org/src/netcfg/1.185/debian/netcfg-common.templates/?hl=145#L160

[3] - 
https://www.debian.org/releases/stable/amd64/apbs02.en.html#preseed-aliases

[4] - 
https://salsa.debian.org/installer-team/installation-guide/-/merge_requests/25



Bug#1029843: live-boot: Devices Requiring Firmware: multiple requested files in single line overlapping / special characters

2023-05-01 Thread James Addison
On Mon, 1 May 2023 at 03:00, Cyril Brulebois  wrote:
> Diederik de Haas  (2023-04-30):
> > I suggest we stick to `brcmfmac43455-sdio.raspberrypi,4-model-b.txt` as that
> > is its name in the upstream repo:
> > https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
>
> Yes please.

Agreed.

I'm trying to figure out the reason that the kernel module requests
the verbose-style firmware filename (that includes spaces) in the first
place.

In the case of the RPi 4 B, the kernel source contains a device-tree
file for the bcm2711, the relevant hardware.

The 'compatible'[1] field of the bcm2711 device-tree structure file
includes the string "raspberrypi,4-model-b" -- matching one of the
files in linux-firmware.git's brcm directory -- but the 'model'[2]
field, containing "Raspberry Pi 4 Model B", could be the one that's
used when the firmware file request is issued when the module loads
(?).

Also, the brcmfmac kernel module code mentions[3] that it can load
board-specific firmware file paths.  I'm not yet sure whether that's
relevant (either now, or in future).

> Diederik de Haas  (2023-04-30):
> > And that's exactly what happens or will happen. Even though the RPi4
> > filename doesn't contain spaces, there are several in the `brcm`
> > directory that do.  I didn't check other directories, but I'd expect
> > that filenames with a space is NOT an anomaly.

Since more files with that pattern are appearing upstream in
linux-firmware.. yes, slightly reluctantly it does seem that this will
be needed.

On Mon, 1 May 2023 at 03:00, Cyril Brulebois  wrote:
> We won't rewrite hw-detect for bookworm, nor will we make it “shellcheck
> compliant”. Now is definitely not the time to deal with such things, and
> yes I'm going to call system files (e.g. package-shipped) with spaces an
> anomaly.
>
> People are more than welcome to put energy into making hw-detect more
> robust in the future, but even knowing some bits about it, it's some
> kind of a maze and I *really* don't want to lose any feature for the
> generic cases (non-crazy filenames).

I'd generally agree with all that.  For robustness, and when it's safe
to, it'd be nice to resolve both issues (firstly to ensure that the
correct firmware filename is being requested in these cases -- maybe
it already is, I'm still trying to determine whether that's a bug --
and secondly to support spaces in firmware filenames in hw-detect).

[1] - 
https://sources.debian.org/src/linux/6.1.25-1/arch/arm/boot/dts/bcm2711-rpi-4-b.dts/#L9

[2] - 
https://sources.debian.org/src/linux/6.1.25-1/arch/arm/boot/dts/bcm2711-rpi-4-b.dts/#L10

[3] - 
https://elixir.bootlin.com/linux/v6.1/source/drivers/net/wireless/broadcom/brcm80211/brcmfmac/firmware.c#L772



Bug#1029843: live-boot: Devices Requiring Firmware: multiple requested files in single line overlapping / special characters

2023-04-30 Thread James Addison
Followup-For: Bug #1029843
X-Debbugs-Cc: a.dalm2...@googlemail.com, p...@akeo.ie, 
debian-boot@lists.debian.org
Control: reassign -1 hw-detect
Control: merge -1 1030519
Control: affects -1 raspi-firmware
Control: title -1  check-missing-firmware: patch for files with space 
characters, mediamount and more (with code)

Hi Alexander,

Thanks for these reports - I was attempting a netboot of the Bookworm RC 2
installer on a Raspberry Pi and ran into the same problem: hw-detect attempting
to load firmware files that contain spaces in the filename.

I've added Pete on cc - he noticed this issue too a few years ago, while
documenting how to install regular Debian 11 on a RPi.

Quoting some notes from his guide[1]:

>  Also, if you did install the Wifi firmware blobs, you may find that you get 
> the following error during boot:
...
> To fix that, simply rename /lib/firmware/brcm/brcmfmac43455-sdio.Raspberry to 
> /lib/firmware/brcm/brcmfmac43455-sdio.Raspberry Pi Foundation-Raspberry Pi 4 
> Model B.txt.


Let's merge this with your linked bug #1030519 - I agree that the hw-detect
package seems to be the location of the relevant scripting[2].  Do you have an
account on Salsa[3] to offer fixes back to Debian git repositories?


I have some code review / comments about the patches, focusing only on the
filename problem to begin with:

Do we _need_ to retain the vendor name and model name in the firmware filename?

My guess (without being too familiar with the firmware loading process yet)
is that it'd be easier to ship a concisely-named file that omits the vendor and
model strings, so we'd want a way to map:

  brcmfmac43455-sdio.Raspberry Pi Foundation-Raspberry Pi 4 Model B.txt

To the corresponding, already-packaged[4] filename:

  brcmfmac43455-sdio.txt

...while preferably avoiding adding custom scripting for too many other
firmware filenames in future.  Where does the long-format filename originate
from?
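
To make the mapping question concrete, here is a minimal sketch of one way
the long-format name could be reduced to the packaged name.  It assumes the
requested name always has the shape "<chip>.<board description>.<extension>";
the helper name is hypothetical and this is not code from hw-detect or from
the kernel.

```python
def concise_firmware_name(requested: str) -> str:
    """Drop the board description from a '<chip>.<board>.<ext>' filename.

    Hypothetical helper for illustration only; names that do not have a
    board component are returned unchanged.
    """
    # Split off the extension (text after the last dot).
    stem, dot, ext = requested.rpartition(".")
    if not dot:
        return requested
    # Split the chip prefix from the board description (first dot in the stem).
    chip, dot, _board = stem.partition(".")
    if not dot:
        return requested  # already concise, e.g. 'brcmfmac43455-sdio.txt'
    return f"{chip}.{ext}"


long_name = "brcmfmac43455-sdio.Raspberry Pi Foundation-Raspberry Pi 4 Model B.txt"
short_name = concise_firmware_name(long_name)
```

A rule like this would avoid per-file special cases, but it only holds if
the "<chip>.<board>.<ext>" assumption is true for every affected file.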

It seems unlikely that this should be fixed for Bookworm, but I can offer some
assistance with further testing, and hopefully we can improve this for Trixie.

And a nitpick: the way this appears in the hw-detect prompt screen in the
installer is a bit odd, because spaces in the filenames are replaced with
newlines.  That might be nice to fix at the same time if we can.
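
The underlying hazard can be shown in a few lines.  This is only an
illustration of whitespace splitting in general, written in Python rather
than the shell that hw-detect uses; the second filename in the list is made
up for the example.

```python
def split_on_whitespace(joined: str) -> list[str]:
    # Treats every run of whitespace as a separator, so a filename that
    # contains spaces is broken into several pieces.
    return joined.split()


def split_on_newlines(joined: str) -> list[str]:
    # Treats only newlines as separators, so spaces inside a filename survive.
    return joined.split("\n")


requested = [
    "brcm/brcmfmac43455-sdio.Raspberry Pi Foundation-Raspberry Pi 4 Model B.txt",
    "brcm/brcmfmac43455-sdio.bin",  # made-up second entry for the example
]

broken = split_on_whitespace(" ".join(requested))
intact = split_on_newlines("\n".join(requested))
```

With a newline-separated list the two names round-trip unchanged, whereas
the space-separated list comes back as eight fragments.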

Thanks,
James

[1] - https://forums.raspberrypi.com/viewtopic.php?t=282839

[2] - 
https://salsa.debian.org/installer-team/hw-detect/-/blob/f76d36b65aa14a14497f5ef57c9721f313ea98e6/check-missing-firmware.sh#L154-187

[3] - https://salsa.debian.org/

[4] - https://packages.debian.org/bookworm/all/raspi-firmware/filelist



Bug#1031738: installation-guide: documentation about limits to kernel boot parameters is outdated

2023-03-07 Thread James Addison
Source: installation-guide
Followup-For: Bug #1031738

I'm attempting to rephrase the documentation related to this bug in a merge
request on Salsa:

https://salsa.debian.org/installer-team/installation-guide/-/merge_requests/24/

If anyone would like to review the suggested changes there -- or, even better,
test their behaviour -- then I'd be grateful for your help.  (I've potentially
made a fool of myself on lkml after misunderstanding the difference between
kernel parameters and init arguments.)



Bug#1031738: installation-guide: documentation about limits to kernel boot parameters is outdated

2023-03-06 Thread James Addison
Source: installation-guide
Followup-For: Bug #1031738
Control: tags -1 patch

Please note: the patch previously offered here isn't suitable; it turns out
that the param limits described in it (32 for Linux, 128 for User Mode Linux)
apply only to the number of argument items that are passed to the 'init'
process from the kernel command line -- not to the number of known kernel
parameters ('ro', 'quiet', and so on) that a kernel will accept at boot-time.
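
As a rough model of that distinction (a simplified sketch, not the kernel's
actual parsing code): tokens the kernel recognises are consumed, and the
remaining tokens are handed to 'init', either as environment entries (when
they contain '=') or as arguments.  The set of known parameters below is a
stand-in chosen for the example, and quoting, '--' handling, module-style
'name.param' tokens and the exact boundary behaviour of the limit are not
modelled.

```python
MAX_INIT_ARGS = 32  # the Linux figure quoted above (128 for User Mode Linux)


def classify(cmdline: str, known_params: set[str]) -> tuple[list[str], list[str]]:
    """Split a command line into (init arguments, init environment entries).

    Tokens whose name is in known_params are treated as consumed by the
    kernel and do not count towards either list.
    """
    init_args, init_envs = [], []
    for token in cmdline.split():
        name, has_value, _value = token.partition("=")
        if name in known_params:
            continue
        if has_value:
            init_envs.append(token)
        else:
            init_args.append(token)
    return init_args, init_envs


def too_many_init_args(cmdline: str, known_params: set[str],
                       limit: int = MAX_INIT_ARGS) -> bool:
    init_args, _ = classify(cmdline, known_params)
    return len(init_args) > limit


KNOWN = {"ro", "quiet", "root"}  # illustrative subset only
example = classify("ro quiet root=/dev/sda1 single TERM=linux", KNOWN)
```

In this model 'ro', 'quiet' and 'root=...' never count towards the limit;
only 'single' is passed to init as an argument.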

References:

 - 
https://lore.kernel.org/linux-doc/CALDQ5NwGTi3q9B=ezat5h_eltr1cdur9j13utb1-dck-fxo...@mail.gmail.com/T/#t

 - 
https://salsa.debian.org/installer-team/installation-guide/-/merge_requests/24#note_387450



Bug#370487: partitions get formated after unsure confirmation in graphical installer

2023-03-02 Thread James Addison
Source: partman-base
Followup-For: Bug #370487

On Thu, 02 Mar 2023 13:35:29 +, James wrote:

> If the user selects 'undo', or if 'no' is selected and the operator continues
> from the second step, then the user is taken back to the 'partition disks'
> menu without any changes being written to disk (I checked that the modification
> time on the file-backed disk image I was using in testing had not been updated;
> not a cast-iron guarantee, but a fairly strong indicator I think).

A pedantic follow-up note related to this: I also confirmed the converse
behaviour (that is: after selecting 'yes' to write changes to disk, the
modification time on the file-backed disk image was updated as expected,
indicating that changes were indeed written in that case).



Bug#370487: partitions get formated after unsure confirmation in graphical installer

2023-03-02 Thread James Addison
Source: partman-base
Followup-For: Bug #370487
X-Debbugs-Cc: geye...@googlemail.com

Oops: I provided an update earlier today on this bug that you reported in 2006,
but I forgot to cc your email address, Thomas.  Please find my previous comment
in the bug thread.



Bug#370487: partitions get formated after unsure confirmation in graphical installer

2023-03-02 Thread James Addison
Source: partman-base
Followup-For: Bug #370487

Hi Thomas,

I discovered this bug while wondering about some of the default options during
the disk partition configuration steps (also known as the 'partman' component)
in the Debian 12 (bookworm) alpha 2 installer.

Since disk partition configuration continues to be important, the partman steps
are still in place and the overall intent remains similar to your experience of
the Jun 2006 Debian installer: select an installation disk, determine a
partition layout, confirm the changes, and then write them to the selected disk.

I had two goals in mind:

  * To learn more about whether the default selection of 'no' before changes
    are written to disk makes sense (that's how I found your bugreport)

  * To check whether the bug you reported exists in the Debian 12 installer

Although the menu screens that are involved may vary based on the system
operator's selections during installation, there is currently a two-step
confirmation after a partitioning scheme is determined:

  1. The operator is given the choice to 'undo' the partitioning changes, or
     to 'finish' the partitioning and write the results to disk

  2. If the user selected 'finish', there is a further confirmation dialog --
     with no option to go 'back', and a default 'no', asking whether to write
     the changes to disk

If the user selects 'undo', or if 'no' is selected and the operator continues
from the second step, then the user is taken back to the 'partition disks'
menu without any changes being written to disk (I checked that the modification
time on the file-backed disk image I was using in testing had not been updated;
not a cast-iron guarantee, but a fairly strong indicator I think).
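
For anyone wanting to repeat that check, a minimal sketch follows.  The
function names are hypothetical, and the same caveat applies: an unchanged
modification time is a strong hint, not proof, that nothing was written.

```python
import os


def image_mtime_ns(path: str) -> int:
    """Return the modification time of the disk image in nanoseconds."""
    return os.stat(path).st_mtime_ns


def image_was_modified(path: str, mtime_before_ns: int) -> bool:
    """Compare the current modification time against an earlier reading."""
    return image_mtime_ns(path) != mtime_before_ns
```

The idea is to record image_mtime_ns() before entering the partitioning
dialogs, make a selection in the installer, and then call
image_was_modified() with the recorded value.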

There is another edge case, though: a user can select an entire disk to install
to -- without any partitions.  In that case the confirmation flow is slightly
different:

  1. The operator is notified that they have selected an entire device to
     install to, and they may 'go back' or they are given a no/yes choice to
     create an empty partition table.

I've confirmed that with either the 'go back' or 'no' choices from that dialog,
the file-backed disk image remains unmodified.

And to answer my own question that brought me here: I think that the default
of 'no' makes sense.  In some cases it could cause a small delay if the user
does not read the dialog text and attempts to continue -- but writing disk
partition changes can be risky, and so I do think it makes sense to use 'no'
as the default on those menus.

Perhaps my explanation could have been more concise, but I hope that provides
a relevant update for this bug ahead of Debian 12's release.



Bug#1031923: d-i.debian.org: testing (bookworm): Unable to boot due to unsupported FEATURE_C12 in e2fsck

2023-02-28 Thread James Addison
Package: d-i.debian.org
Followup-For: Bug #1031923
Control: forcemerge 1031622 -1



Bug#1031738: installation-guide: documentation about limits to kernel boot parameters is outdated

2023-02-21 Thread James Addison
Source: installation-guide
Version: 20220129~deb11u1
Severity: normal
Tags: d-i patch

Dear Maintainer,

Some of the documentation related to limits in Linux kernel boot parameters in
the installation guide is outdated.

For example, the section describing[1] use of boot parameters for preseeding
references the 2.6.9 kernel in a callout note.

Please find attached a patch to update two such documentation sections.

Thanks,
James

[1] - 
https://www.debian.org/releases/bullseye/amd64/apbs02.en.html#preseed-bootparms
From 55dbd6dae622cfb40e3397bc98e4e053b14ad45c Mon Sep 17 00:00:00 2001
From: James Addison 
Date: Tue, 21 Feb 2023 18:07:01 +
Subject: [PATCH] linux-kernel: update notes related to command-line limits

---
 en/appendix/preseed.xml          | 9 +++++----
 en/boot-installer/parameters.xml | 9 +++++----
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/en/appendix/preseed.xml b/en/appendix/preseed.xml
index 0a4582149..1ed69b83b 100644
--- a/en/appendix/preseed.xml
+++ b/en/appendix/preseed.xml
@@ -367,10 +367,11 @@ out any options (like preconfiguration options) that it recognizes.
 
 
 
-Current linux kernels (2.6.9 and later) accept a maximum of 32 command line
-options and 32 environment options, including any options added by default
-for the installer. If these numbers are exceeded, the kernel will panic
-(crash). (For earlier kernels, these numbers were lower.)
+Debian-built Linux kernels accept up to the default maximum of 32 command line
+options and 32 environment options. If these numbers are exceeded, the kernel
+will fail to boot. Additionally, there is an architecture-dependent limit to
+the number of characters for the entire kernel command line; command-lines
+longer than this limit are silently truncated.
 
 
 
diff --git a/en/boot-installer/parameters.xml b/en/boot-installer/parameters.xml
index 91d981406..18f87dc25 100644
--- a/en/boot-installer/parameters.xml
+++ b/en/boot-installer/parameters.xml
@@ -75,10 +75,11 @@ The installation system recognizes a few additional boot parameters
 
 
 
-With current kernels (2.6.9 or newer) you can use 32 command line options and
-32 environment options. If these numbers are exceeded, the kernel will panic.
-Also there is a limit of 255 characters for the whole kernel command line,
-everything above this limit may be silently truncated.
+Debian-built Linux kernels accept up to the default maximum of 32 command line
+options and 32 environment options. If these numbers are exceeded, the kernel
+will fail to boot. Additionally, there is an architecture-dependent limit to
+the number of characters for the entire kernel command line; command-lines
+longer than this limit are silently truncated.
 
 
 
-- 
2.39.1



Bug#987368: Installer fails at first menu "Choose language"

2021-06-06 Thread James Addison
Thanks Cyril, Frédéric - it feels like we're reaching a consensus that
udpkg may not be multi-process safe (although, strictly speaking, I
would say we haven't proven that yet).

The authors of multi-console support could be the best people to
recommend a path forward, as they may have close knowledge of the
level of testing and completion of the change.  I've attempted to add
them as subscribers to the bug although I expect that is opt-in and
I'm not sure whether I added them correctly.

Until any feedback from them, I'll mention a few possible routes that
had occurred to me:

- Backtracking: if we feel this is a problem that would likely affect
and/or annoy a significant number of users, we could disable
multi-console support for bullseye
- Known-issue: if we feel the issue is serious but rare, we could
indicate that it is a known problem (and perhaps prepare and document
workarounds)
- Scripting fix: we could perhaps adjust the installation scripts so
that d-i runs in a single-process condition until after udpkg has
completed, and only begin multiple debian-installer processes after
that
- Process-safety fix: in some sense an 'ideal' fix, we could add
multi-process safety to udpkg, either by using careful rewriting or
perhaps by using a lockfile or other safety mechanism(s)
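
To illustrate the lockfile idea in the last option, here is a minimal
sketch using an advisory lock.  udpkg itself is written in C, so this is
only a model of the approach in Python; the paths and function names are
hypothetical, and it assumes a Unix-like system where flock() is available.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def status_lock(lock_path: str):
    """Hold an exclusive advisory lock for the duration of the block."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Blocks until no other process holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield
    finally:
        # Closing the descriptor releases the lock.
        os.close(fd)


def append_status_entry(status_path: str, lock_path: str, entry: str) -> None:
    """Append one entry to the status file while holding the lock."""
    with status_lock(lock_path):
        with open(status_path, "a", encoding="utf-8") as status:
            status.write(entry)
```

With every writer going through status_lock(), a second debian-installer
process would wait for the first to finish rather than touching the status
file at the same time.  Advisory locks only help if all writers cooperate.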

Some related factors to consider:

- Do we advertise and support running multiple debian-installer processes
in our documentation?
- Do we have availability of developer expertise for udpkg, including
review and QA time?
- Could/should the distance to a release date of Debian bullseye be a factor?

Cheers,
James

On Mon, 31 May 2021 at 10:31, Frédéric Bonnard  wrote:
>
> Hi Cyril/all,
> sorry that the process takes long, but that was the only way to
> reproduce that bug (which I think may not be specific to ppc64el)
> without having Power hardware (and a LPAR/HMC setup).
>
> > Looking at that log, one sees two PIDs for main-menu (272 and 278),
> > which could explain a very nice race condition: udpkg racing, one of
> > them making the status file disappear from under the feet of the other
> > one?
>
> My feeling is that this is exactly what's happening.
> I tried touching the missing file and the installer was happy because
> the called process (udpkg from what I remember) does not crash anymore
> (one can try udpkg without status file and it will crash).
>
> > See also two /sbin/debian-installer processes getting started beforehand
> > (one on /dev/hvc0, one on /dev/tty).
>
> Exactly.
>
> > It looks to me this is a clear problem on the debian-installer side (how
> > it deals with multiple consoles, similar to #940028 as you mentioned
> > initially), rather than some possible issues with emulation?
>
> To me, it's clearly not a qemu issue, since I have that issue on
> physical machines too. I just went the emulation way to enable people
> without hardware to reproduce the bug. It's more a race condition of
> running two debian-installers (not sure now who is removing the status
> file, probably udpkg?).
>
> The point is that some work has already been done by several people on
> those multiple consoles setup according to the git commits, and I guess
> they will clearly get a grasp of what's going on.
>
>
> F.



Bug#987368: Installer fails at first menu "Choose language"

2021-05-28 Thread James Addison
Does d-i tend to use udpkg for bootstrapping?

If so, I think 
https://salsa.debian.org/installer-team/udpkg/-/blob/master/status.c#L390
could be a potential section of code to investigate further.

It doesn't look like full-fat dpkg performs this kind of rename on
the status file.



Bug#987368: Installer fails at first menu "Choose language"

2021-05-28 Thread James Addison
This might also be the same issue as reported in
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=944125

(some kind of race condition where multiple consoles are available and
entered into the inittab, and a /var/lib/dpkg/status.bak is found
instead of the expected status file)
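
To show why a status.bak could be found without a status file, here is a
sketch of the kind of update sequence that would leave such a window.  It
is an assumption that udpkg follows this pattern; the functions below are
hypothetical Python models, not a translation of status.c.

```python
import os


def update_with_backup_rename(status_path: str, new_contents: str) -> None:
    """Two-step rename: the status file briefly does not exist."""
    new_path = status_path + ".new"
    with open(new_path, "w", encoding="utf-8") as new_file:
        new_file.write(new_contents)
    os.rename(status_path, status_path + ".bak")  # window opens: no status file
    os.rename(new_path, status_path)              # window closes


def update_with_atomic_replace(status_path: str, new_contents: str) -> None:
    """Single rename over the old file: readers see the old or the new file."""
    new_path = status_path + ".new"
    with open(new_path, "w", encoding="utf-8") as new_file:
        new_file.write(new_contents)
        new_file.flush()
        os.fsync(new_file.fileno())
    os.replace(new_path, status_path)
```

A second process that looks for the status file between the two renames in
the first function would find only status.bak.  The second function avoids
that window but keeps no backup copy, and it would still need locking if
two writers can run at once.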