Hi,

On Tuesday, 21 June 2022 22:31:45 CEST Paul Gevers wrote:
> On 21-06-2022 22:07, Diederik de Haas wrote:
> 
> > Do these errors still occur? Still with 5.10.103-1 or a later one?
> 
> The last occurrence of a machine hang I had is from 5 May 2022, but I'm 
> not sure if I checked if it was this same issue. Normally our kernels 
> are up-to-date, but I don't recall what we had at the time. We have 
> recommissioned our arm64 hosts, so the install logs are lost by now.

It's good for ci.debian.net that there are such large gaps between failures, 
but it makes debugging a bit harder.
I think the install logs aren't that important (anymore), as the 
issue/symptoms appear to be the same:
- some swap action resulting in some failure
- CPU gets stuck
- watchdog triggers a reboot

How is swap configured on these devices?
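
In case it helps to collect that, here is a minimal Python sketch that reads 
the standard Linux files /proc/swaps and /proc/sys/vm/swappiness. The function 
names are made up for illustration, and it assumes the usual five-column 
/proc/swaps layout (Filename, Type, Size, Used, Priority); I have not run it on 
the ci.debian.net hosts.

```python
from pathlib import Path


def parse_proc_swaps(text):
    """Parse text in /proc/swaps format into a list of dicts (sizes in KiB)."""
    entries = []
    # The first line is the column header: Filename Type Size Used Priority
    for line in text.strip().splitlines()[1:]:
        fields = line.split()
        if len(fields) != 5:
            continue  # skip lines that don't match the assumed layout
        name, kind, size, used, priority = fields
        entries.append({
            "name": name,
            "type": kind,
            "size_kib": int(size),
            "used_kib": int(used),
            "priority": int(priority),
        })
    return entries


def read_if_present(path):
    """Return the file's text, or None if the file does not exist."""
    p = Path(path)
    return p.read_text() if p.exists() else None


if __name__ == "__main__":
    swaps = read_if_present("/proc/swaps")
    print("swap areas:", parse_proc_swaps(swaps) if swaps else "n/a")
    swappiness = read_if_present("/proc/sys/vm/swappiness")
    print("vm.swappiness:", swappiness.strip() if swappiness else "n/a")
```

The output would show whether swap is a partition, a file or something like 
zram, plus its size and priority.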

> > Is it only on arm64 machines? Or is this just an example which also
> > occurs on other arches?
> 
> I'm pretty sure I haven't seen this on other arches, otherwise I'm sure 
> I would have reported it to this bug.

Yeah, I _assumed_ as much, but assumptions can be dangerous ;-)

Normally I scroll (quickly) past the hardware listings, as they rarely tell me 
anything. I did that before too, but just now I made an important discovery.

I *assumed* it was running on native arm64 hardware and was about to ask for 
specifics about it, and then I noticed this:
Host bridge [0600]: Red Hat, Inc. QEMU PCIe Host bridge [1b36:0008]

QEMU. Quite likely unrelated, but a while back I had an issue with QEMU when 
building arm64 images: https://bugs.debian.org/988174

I think it would be useful to know which QEMU version(s) were used.
(It's unlikely I'll be able to help find the cause/solution; I'm mostly 
gathering hopefully useful bits of information for people who could.)
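
A rough Python sketch for grabbing the version on the virtualization host (not 
inside the guest) could look like this. It assumes the VMs are started with a 
binary called qemu-system-aarch64 that is on the PATH, and that `--version` 
prints a first line of the form "QEMU emulator version X.Y.Z ..."; both are 
assumptions on my part, as I don't know how the hosts are set up. The function 
names are made up.

```python
import re
import shutil
import subprocess


def parse_qemu_version(output):
    """Extract the version token from `qemu-system-* --version` output."""
    match = re.search(r"version\s+(\S+)", output)
    return match.group(1) if match else None


def qemu_version(binary="qemu-system-aarch64"):
    """Return the QEMU version string, or None if the binary is not found."""
    if shutil.which(binary) is None:
        return None  # binary name is an assumption; adjust for the real host
    result = subprocess.run(
        [binary, "--version"], capture_output=True, text=True, check=False
    )
    return parse_qemu_version(result.stdout)


if __name__ == "__main__":
    print("QEMU version:", qemu_version() or "not found")
```

On a Debian host, the package version (which includes the Debian revision) is 
probably the more useful thing to put in the bug report.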

> > If it still occurs, then the likely only way to get a possible resolve is
> > reporting it to upstream.
> 
> 1.5 months is quite long for it to be gone, although, before that it was 
> 2.5 months.

If the issue does occur again, I think it would be useful to bring 'upstream' 
into the conversation. They can likely provide much more useful input than 
(e.g.) I could. Also, if upstream is made aware that there is an issue (even an 
infrequent one), they can make a more informed choice about what to do with it.

Cheers,
  Diederik

