Bug#1004255: linux-image-5.14.0-1-sparc64-smp: Debian kernels > 5.14.3-1~exp1 fail to boot on SPARC T4-1 with Fast Data Access MMU Miss

2022-01-23 Thread Tom Turelinckx
Package: src:linux
Version: 5.14.6-2
Severity: important
X-Debbugs-Cc: debian-sparc@lists.debian.org

Dear Maintainer,

Debian kernels > 5.14.3-1~exp1 consistently fail to boot on SPARC T4-1:

SPARC T4-1, No Keyboard
Copyright (c) 1998, 2014, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.36.2, 31.5000 GB memory available, Serial #108045182.
Ethernet address 0:10:e0:70:a3:7e, Host ID: 8670a37e.



Boot device: disk0  File and args: 
SILO Version 1.4.14
boot: 
Allocated 64 Megs of memory at 0x4000 for kernel
Uncompressing image...
Loaded kernel version 5.14.6
Loading initial ramdisk (25723814 bytes at 0x2480 phys, 0x40C0 virt)...
ERROR: Last Trap: Fast Data Access MMU Miss

Debian kernels 5.14.3-1~exp1 and earlier boot and run successfully on this 
system.

I have tried the sparc64-smp packages built by buildd landau for these versions:
5.14.6-2, 5.14.6-3, 5.14.9-2, 5.15.5-2, 5.15.15-1, 5.16~rc8-1~exp1
They all consistently fail to boot with the same error.

I have built the Debian src pkg version 5.14.6-1 using pbuilder with a sid 
basetgz.
It consistently fails to boot with the same error.

I've then tried to bisect using the DebianKernel/GitBisect instructions on the 
Debian wiki,
but it turns out that kernels built from git (tag v5.14.3, tag v5.14.6, and ~9 
bisects in between)
using make bindeb-pkg all do boot successfully on this system.

I've tried checking out tag v5.14.6 from git, then applying all the patches 
from debian/patches
in the 5.14.6-1 src pkg and building using make bindeb-pkg. The resulting 
kernel boots successfully.

I've tried extracting the 5.14.6-1 src pkg using dpkg-source -x, then building 
using make bindeb-pkg
and the resulting kernel boots successfully. But if I build using 
dpkg-buildpackage like pbuilder does,
then the resulting -sparc64-smp package fails to boot with the above error.

When building, I have used a clean sid chroot each time. When using make 
bindeb-pkg I have copied
the config installed in /boot by the (non-booting) 5.14.6-1 Debian package then 
done make olddefconfig.
When using make bindeb-pkg I had to manually disable stringop-overread warnings 
in Makefile to avoid
build failure on arch/sparc/kernel/mdesc.c with v5.14.6 (fixed in later 
versions by [1]). 
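
One way to double-check the config step described above is to compare the config copied from /boot with the .config left behind by make olddefconfig. The following is a minimal Python sketch, not something used in this report; the function names and the option names mentioned in the comments are illustrative assumptions.

```python
import re
import sys

# Lines in a kernel config look like "CONFIG_FOO=y" or "# CONFIG_FOO is not set".
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map each option name to its value; unset options are recorded as "n"."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def diff_configs(old_text, new_text):
    """Return {option: (old_value, new_value)} for every option that differs.

    An option missing from one side is reported with None on that side.
    """
    old, new = parse_config(old_text), parse_config(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(old.keys() | new.keys())
        if old.get(name) != new.get(name)
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (paths are whatever the two configs are called on your system):
    #   python3 diffconfig.py <config-from-boot> <.config-after-olddefconfig>
    with open(sys.argv[1]) as old_file, open(sys.argv[2]) as new_file:
        for name, (before, after) in diff_configs(old_file.read(), new_file.read()).items():
            print(f"{name}: {before} -> {after}")
```

An empty result would suggest that the two builds really do share one configuration, which points the search at the packaging scripts rather than the config.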

When building using bindeb-pkg the resulting kernel is compressed; when using 
dpkg-buildpackage the
resulting kernel is uncompressed. I have tried both uncompressing the 
compressed kernel and compressing
the uncompressed kernel, as silo supports both. It doesn't affect the results. 
Uncompressed, the Debian
kernel is ~17MB while the standard kernel is ~13MB. I'm not sure why this 
difference is there.

On Debian salsa's kernel-team/linux I have combed through all the commits 
between tags debian/5.14.3-1_exp1
and debian/5.14.6-1, but none of them seem relevant to this issue. I have 
checked the upstream changelog
between v5.14.3 and v5.14.6, but nothing sparc-specific has changed. 

According to the buildd logs, landau is running kernel 5.15.5-2. But I think 
this is a SPARC-T5 so not
a T4, and I think it's running inside an LDOM which is not the case on my T4, 
so it may not be comparable.

I've also tried to get more information about the failure, but I don't know how 
to do that. I've tried
to get into the initramfs environment by using break=premount/modules/top, but 
the failure happens before
those stages. Measuring the elapsed time after Loading initial ramdisk it would 
seem the error message
ERROR: Last Trap: Fast Data Access MMU Miss appears when normally the first 
kernel output would appear.

I've tried to look into what the Debian src pkg's debian/* scripts do, exactly, 
but this is rather
complicated and I have limited experience with it.

Any suggestions what else I could try?

[1]: 
https://github.com/gregkh/linux/commit/fc7c028dcdbfe981bca75d2a7b95f363eb691ef3

-- Package-specific info:
** Kernel log: boot messages should be attached

** Model information
cpu : UltraSparc T4 (Niagara4)
fpu : UltraSparc T4 integrated FPU
pmu : niagara4
prom: OBP 4.36.2 2014/10/24 08:13
type: sun4v

** Network interface configuration:
*** /etc/network/interfaces:

source /etc/network/interfaces.d/*

auto lo
iface lo inet loopback

auto br0
iface br0 inet static
bridge_ports enp15s0f0
bridge_fd 0
address x.x.x.x
netmask x.x.x.x
gateway x.x.x.x
iface enp15s0f0 inet manual


** PCI devices:
00:01.0 PCI bridge [0604]: Oracle/SUN Device [108e:8186] (rev 01) (prog-if 00 
[Normal decode])
Device tree node: /sys/firmware/devicetree/base/pci@400/pci@1
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: 

00:02.0 PCI bridge [0604]: Oracle/SUN Device 

Bug#956624: qemu: FTBFS on sparc64 due to libseccomp-dev dependency

2020-04-13 Thread Tom Turelinckx
Source: qemu
Severity: normal
User: debian-sparc@lists.debian.org
Usertags: sparc64

Hello,

Commit 63f51933 resolved bug #900055 by enabling seccomp on linux-any.

Since version 1:2.12+dfsg-2 qemu FTBFS due to this dependency, while it was 
building successfully before:

https://buildd.debian.org/status/logs.php?pkg=qemu&arch=sparc64

Instead of enabling seccomp on linux-any, maybe it could be enabled on 
something like this:

[!alpha !ia64 !m68k !sh4 !sparc64 !hurd-any !kfreebsd-any]
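
To illustrate how a negated restriction list like the one above would be evaluated, here is a minimal Python sketch. It is a simplified model, not dpkg's implementation: it assumes that an architecture name without an OS prefix (such as sparc64) is a Linux port and that prefixed names have the form os-cpu. All function names are hypothetical.

```python
PROPOSED = "[!alpha !ia64 !m68k !sh4 !sparc64 !hurd-any !kfreebsd-any]"


def split_arch(name):
    """Split an architecture name or wildcard into (os, cpu).

    Simplifying assumption: no OS prefix means Linux, e.g. "sparc64"
    becomes ("linux", "sparc64") and "hurd-any" becomes ("hurd", "any").
    """
    if "-" in name:
        os_name, cpu = name.split("-", 1)
        return os_name, cpu
    return "linux", name


def matches(term, arch):
    """Return True if one restriction term (name or wildcard) covers arch."""
    if term == "any":
        return True
    term_os, term_cpu = split_arch(term)
    arch_os, arch_cpu = split_arch(arch)
    return term_os in ("any", arch_os) and term_cpu in ("any", arch_cpu)


def applies(restriction, arch):
    """Return True if a dependency carrying this restriction is used on arch."""
    terms = restriction.strip("[] ").split()
    negated = [t[1:] for t in terms if t.startswith("!")]
    positive = [t for t in terms if not t.startswith("!")]
    if negated and positive:
        raise ValueError("negated and plain architectures cannot be mixed")
    if negated:
        # Negated list: the dependency applies everywhere except the listed ones.
        return not any(matches(t, arch) for t in negated)
    return any(matches(t, arch) for t in positive)
```

Under these assumptions, applies(PROPOSED, "sparc64") is False while applies("[linux-any]", "sparc64") is True, which is the difference between the proposed list and the current one.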

Thanks,
Tom



RE: Stretch bootstrap for sparc64 and sparc, optimized for ultrasparc3

2017-09-08 Thread Tom Turelinckx
Hello Adrian,

On Thu, Sep 7, 2017, at 07:22 PM, John Paul Adrian Glaubitz wrote:
> We are actually planning on implementing Britney for Debian Ports which
> means that Debian Ports would also get a testing release.

That would be great!

> You can just easily cross-compile a native gcc compiler with sbuild.
> That's actually
> very easy using the cross-build toolchain available in Debian.

Thanks for the pointer, this will help me a lot!

> There isn't really a point in bootstrapping for sparc these days unless
> you set the default CPU to sparcv8 to be able to run Debian on a sun4m
> machine. sun4u or newer machines should use sparc64.

> Your efforts sound like you're trying to reinvent the wheel in my opinion.

We have different perspectives on those ;-)

> I think it makes more sense to help to make Britney and hence testing
> available in Debian Ports.

One does not exclude the other. Do you know who is working on that?

Thanks,
Tom




Stretch bootstrap for sparc64 and sparc, optimized for ultrasparc3

2017-09-07 Thread Tom Turelinckx
Hi,

On Thu, Sep 7, 2017, at 11:19 AM, John Paul Adrian Glaubitz wrote:
> On 09/07/2017 10:30 AM, Tom Turelinckx wrote:
>> Not all of those may be necessary anymore, but I've been doing it like
>> that since squeeze and up to the current sid on dozens of machines, and
>> it works reliably: when the first disk fails, I am able to boot from the
>> second disk.
> 
> On a sidenote: Would you mind enabling popcon for your sparc64
> installations,
> so we get more counts of people running Debian on sparc64 hardware?

Enabling it for my sparc64 installations wouldn't be very useful, as those are 
just two or three machines (V210, V240, T5140) with one or two LXC containers 
each, and they're only used for experiments, so they're not representative.

Enabling it for my sparc installations would cause quite a spike in the Wheezy 
deployments, as those are 15-20 machines (V120, V210, V240, V440) with ~4 LXC 
containers each. Those currently can't be upgraded because Sid is too volatile.

Because I want to move forward with upgrading some of those machines to a newer 
release, and replacing some of the Sun Fire-series hardware with SPARC 
Enterprise-series hardware, I'm working on bootstrapping Stretch for both 
sparc64 and sparc, preferably --with-cpu(-32/-64)=ultrasparc3 instead of 
ultrasparc.

Just bootstrapping Stretch 9.0 for sparc64 is relatively straightforward thanks 
to excellent work by Adrian Glaubitz and others: for most of the binary 
packages the correct version is readily available from snapshot.debian.org. 
I've built a repository of binary packages to create Stretch 9.0 for sparc64 
that has either the correct version of each package or slightly older, up to 
build-essential and some additional packages such as the kernel.

From this repository I can create a pbuilder basetgz that's very close to 
Stretch 9.0, and allows building additional packages to bring the 
"very-close-to-9.0" repo to "really-9.0", as well as up to 9.1. I've also 
installed a physical machine using a fairly old sid CD where all package 
versions were older than Stretch 9.0, then upgraded using this repository, so 
I have a physical machine that's "very-close-to-9.0" but hasn't seen any 
packages "from the future" to run pbuilder on.

The only problem here is how to automatically select the correct binNMU for 
sparc64 for a given source package version, for Stretch 9.0. I think it can't 
be automated (correctly), so I verified it manually for the packages that I've 
done, but it's unrealistic to do so for the entire archive. Once a certain 
number of packages has been made available through this method, it seems easier 
to start (automatically) recompiling additional packages from source, rather 
than (manually) pulling in the correct binNMU from snapshot...
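
For illustration, the mechanical part of the binNMU problem can be sketched in Python. A binNMU appends +bN to the source version, so the binary versions that belong to one source version can be grouped and ranked. This sketch simply picks the highest binNMU, which is not necessarily the one that corresponds to Stretch 9.0; that is the part that still needs manual checking. The function names and the version strings in the comments are hypothetical.

```python
import re

# A binNMU version is the source version followed by "+b<number>".
_BINNMU = re.compile(r"^(?P<source>.+?)\+b(?P<count>\d+)$")


def split_binnmu(binary_version):
    """Return (source_version, binnmu_number); the number is 0 without a suffix."""
    match = _BINNMU.match(binary_version)
    if match:
        return match.group("source"), int(match.group("count"))
    return binary_version, 0


def latest_binnmu(source_version, binary_versions):
    """Pick the highest binNMU built from source_version, or None if there is none.

    Caveat: "highest" may be newer than what a given point release contained.
    """
    candidates = []
    for version in binary_versions:
        source, count = split_binnmu(version)
        if source == source_version:
            candidates.append((count, version))
    return max(candidates)[1] if candidates else None


# Example with made-up versions:
#   latest_binnmu("2.24-11", ["2.24-11", "2.24-11+b1", "2.24-11+b3"])
# selects the "+b3" build.
```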

And if I do start recompiling a large amount of packages, I intend to optimize 
them for ultrasparc3 rather than ultrasparc, based on this [1] remark by David 
Miller.

Bootstrapping Stretch for sparc is less straightforward, as the latest/last 
packages available from snapshot are from two years ago. Still, a large 
number of packages that are not terribly old are available. Thanks to 
excellent work by Helmut Grohne and others [2], it's also possible to 
cross-compile a recent version of the most important packages from source.

It took only minimal patches to get rebootstrap to work against Stretch 9.0. It 
finishes successfully, and I have repositories available for both sparc and 
sparc64. Unfortunately, build-essential is not (yet) fully complete: 
rebootstrap does not (yet) produce a native gcc. At jenkins.debian.net, builds 
considered successful finish in the same state, so I guess producing all the 
build dependencies to produce a native gcc is (currently) out of the scope of 
rebootstrap. I'm working on creating a useful native pbuilder chroot for sparc 
(similar to what I have for sparc64), either by pulling packages from 
rebootstrap into a chroot created from snapshot, or by pulling packages from 
snapshot into the cross-compiling chroot from rebootstrap.

I'm also investigating botch and dose to determine the optimal order for 
mass-building all the packages. Because so many packages are available, build 
dependencies will probably not be a problem, but by doing them in the optimal 
order, it might be possible to get correctly ultrasparc3-optimized versions of 
all packages in one or maybe two runs, and I think a random order might require 
more such runs.
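
The idea of building in dependency order can be sketched with Python's standard graphlib module (Python 3.9 or later). This is only an illustration of a topological ordering, not what botch or dose do internally. The input format (a dict from source package to the set of packages it build-depends on) and the package names in the comment are assumptions.

```python
from graphlib import CycleError, TopologicalSorter


def build_waves(build_deps):
    """Group packages into waves; each wave only needs packages from earlier waves.

    build_deps maps a package name to the set of packages it build-depends on.
    Raises graphlib.CycleError if the build dependencies are circular.
    """
    sorter = TopologicalSorter(build_deps)
    sorter.prepare()  # detects cycles before any ordering is produced
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything buildable right now
        waves.append(ready)
        sorter.done(*ready)  # mark the wave as built to unlock dependants
    return waves


# Example with hypothetical packages:
#   build_waves({"libfoo": set(), "libbar": {"libfoo"}, "app": {"libfoo", "libbar"}})
# yields three waves: libfoo first, then libbar, then app.
```

Packages in the same wave do not depend on each other, so each wave could be built in parallel and already links against the optimized packages from earlier waves. Real build-dependency graphs contain cycles, which this sketch only reports and does not resolve.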

Having Stretch repositories with a significant number of packages available for 
both sparc64 and sparc would open up a lot of opportunities for testing and 
actually using the port, and having them optimized for ultrasparc3 would allow 
useful performance testing on a broad range of currently relevant hardware. If 
I succeed in building the repositories, I hope to make them public.

Tom

[1] https://patchwork.ozlabs.org/comment/927979/
[2] https://wiki.debian.org/HelmutGrohne/rebootstrap




Re: Re: mdadm /boot mirror and sun disklabel corruption

2017-09-07 Thread Tom Turelinckx
Hi Fedor,

> For example I make partitions on two disks like the following:
> 
> 1. 500MB for /boot - boot partition
> 2. 2GB for swap - swap
> 3. Whole disk - sun's whole disk
> 4. 31,6GB for / - rest for the root fs
> 
> Then I create metadevices (mirrors) for partitions 1,2 and 4.

I'm using a similar layout for boot disks.

I leave the first cylinder on the disk unused, except for partition 3 (whole 
disk); I limit partition 1 (boot partition) so the end of the partition falls 
within 512MB from the start of the disk; I use v0.90 metadata for md1 (/boot), 
and format it as ext2.

Not all of those may be necessary anymore, but I've been doing it like that 
since squeeze and up to the current sid on dozens of machines, and it works 
reliably: when the first disk fails, I am able to boot from the second disk.
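
The two placement rules above (first cylinder left unused, boot partition ending within 512MB of the start of the disk) can be expressed as a small check. This is a minimal Python sketch under stated assumptions: 512-byte sectors, "512MB" read as 512 MiB, and a hypothetical function name. The geometry in the comment is only an example.

```python
SECTOR_BYTES = 512                 # assumption: 512-byte sectors
LIMIT_BYTES = 512 * 1024 * 1024    # assumption: "512MB" means 512 MiB


def boot_partition_ok(start_sector, sector_count, sectors_per_cylinder):
    """Check the two placement rules for the /boot partition.

    1. The partition does not use the first cylinder of the disk.
    2. The partition ends within the first 512 MiB of the disk.
    """
    skips_first_cylinder = start_sector >= sectors_per_cylinder
    end_byte = (start_sector + sector_count) * SECTOR_BYTES
    return skips_first_cylinder and end_byte <= LIMIT_BYTES


# Example: with 16065 sectors per cylinder, a 500 MiB partition (1024000
# sectors) starting at sector 16065 satisfies both rules.
```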

Tom




Re: sparc64 as a release architecture

2017-08-16 Thread Tom Turelinckx
Hello Adrian,

Meanwhile, we have been testing the current sparc64 port on several
development machines, both sun4v and sun4u, and the results are really
quite promising: the major packages which we're using are all there,
they are functional and up-to-date, and on identical hardware,
performance is roughly the same or better compared to wheezy.

> In order for sparc64 to become a release architecture, the buildds and
> porter boxes must be maintained by DSA. Currently, all sparc64
> hardware used within Debian is maintained by ports people.

> Unfortunately, the hardware acquisition process at Oracle takes quite
> long which is why we are still stuck in Debian Ports. Sorry.

What's holding us back in migrating some production machines away from
wheezy (and replacing some older sun4u hardware with newer sun4u or
sun4v machines) is not so much that sparc64 is a ports architecture
rather than a release architecture, but that sparc64 as a ports
architecture is limited to the Unstable suite, which is too much of a
moving target.

The DSA team is demanding the availability of redundant hardware under a
support contract in order to be able to support sparc64 as a release
architecture with a certain quality of service. That is entirely
understandable. Unfortunately, that demand for hardware under a support
contract is also currently preventing us from testing the sparc64 port
against the Testing suite in a more production-like setting, hence
preventing us from providing early feedback on the port, even though the
Testing suite could probably be built quite easily in its current shape,
there was no shortage of hardware offers (without a support contract) to
provide the necessary build capacity, and the level of service proposed
by the DSA team is not currently requested for the sparc64 port.

In fact, there would be nothing wrong with sparc64 being (and staying) a
ports architecture, if it were possible for ports architectures to
support any suite, instead of just Unstable.

Is the build infrastructure and configuration for ports architectures
entirely separate from that for release architectures? If hardware was
promised by Oracle in such a way that the DSA team is confident their
requirements will be met in the near future, would it be possible to
make sparc64 a release architecture now, and start building the Testing
suite on the existing infrastructure? If not, what are the issues
preventing the Testing and Stable suites from being built for ports
architectures, and what is needed to resolve those issues?

Thanks,
Tom



sparc64 as a release architecture

2017-07-05 Thread Tom Turelinckx
Hi,

Now that Stretch has been released as Stable, and Buster has become Testing, 
what is the plan for turning sparc64 into a release architecture rather 
than a ports architecture?

Thanks,
Tom




Re: Debian Sparc 7.10.0 Install Problems

2016-06-08 Thread Tom Turelinckx
> on a V210, with gui which is stable. Problem is that none of the
> repositories for Squeeze work any more and it's not clear if they
> are available somewhere else, or have just been deleted ?.

Squeeze has been moved to the debian archive. This sources.list snippet
should work:

deb http://archive.debian.org/debian-archive/debian squeeze main contrib non-free
deb-src http://archive.debian.org/debian-archive/debian squeeze main contrib non-free

Best regards,
Tom