Processed: Properly reassigning ...

2022-06-21 Thread Debian Bug Tracking System
Processing commands for cont...@bugs.debian.org:

> reassign 1013330 src:linux 5.18.2-1
Bug #1013330 [linux-image-5.18.0-0.bpo.1-arm64] 
linux-image-5.18.0-0.bpo.1-arm64: kernel panic in dpaa2_eth_free_tx_fd
Bug reassigned from package 'linux-image-5.18.0-0.bpo.1-arm64' to 'src:linux'.
No longer marked as found in versions linux-signed-arm64/5.18.2+1~bpo11+1.
Ignoring request to alter fixed versions of bug #1013330 to the same values 
previously set
Bug #1013330 [src:linux] linux-image-5.18.0-0.bpo.1-arm64: kernel panic in 
dpaa2_eth_free_tx_fd
Marked as found in versions linux/5.18.2-1.
>
End of message, stopping processing here.

Please contact me if you need assistance.
-- 
1013330: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1013330
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#1013330: linux-image-5.18.0-0.bpo.1-arm64: kernel panic in dpaa2_eth_free_tx_fd

2022-06-21 Thread Diederik de Haas
Control: reassign src:linux 5.18.2-1

On Tuesday, 21 June 2022 23:31:29 CEST Harald Welte wrote:
> Package: linux-image-5.18.0-0.bpo.1-arm64
> Version: 5.18.2-1~bpo11+1
> 
> today I briefly tried the backport 5.18 kernel on bullseye. It boots fine,
> but as soon as some network traffic happens, it panics with a backtrace
> indicating some kind of problem in the dpaa2_eth network driver.
> 
> [   46.451190] Unable to handle kernel paging request at virtual address fcf7fe08
> [   46.459126] Mem abort info:
> [   46.461937]   ESR = 0x9605
> [   46.464983]   EC = 0x25: DABT (current EL), IL = 32 bits
> [   46.470301]   SET = 0, FnV = 0
> [   46.473347]   EA = 0, S1PTW = 0
> [   46.476491]   FSC = 0x05: level 1 translation fault
> [   46.481373] Data abort info:
> [   46.484257]   ISV = 0, ISS = 0x0005
> [   46.488095]   CM = 0, WnR = 0
> [   46.491067] swapper pgtable: 4k pages, 48-bit VAs, pgdp=8258f000
> [   46.497786] [fcf7fe08] pgd=102f78387003, p4d=102f78387003, pud=
> [   46.506496] Internal error: Oops: 9605 [#1] SMP

Kernel 5.18.3 contains (at least) 2 patches related to dpaa2-eth.
Kernel 5.18.5-1 (currently in Sid) does contain quite a few fixes vs 5.18.2, so 
it would be useful to verify if that fixes your issue. I don't know when or 
what version becomes available next in Stable Backports though.

signature.asc
Description: This is a digitally signed message part.


Bug#1013330: linux-image-5.18.0-0.bpo.1-arm64: kernel panic in dpaa2_eth_free_tx_fd

2022-06-21 Thread Harald Welte
Package: linux-image-5.18.0-0.bpo.1-arm64
Version: 5.18.2-1~bpo11+1
Severity: normal

Dear Maintainer,

today I briefly tried the backport 5.18 kernel on bullseye. It boots fine,
but as soon as some network traffic happens, it panics with a backtrace
indicating some kind of problem in the dpaa2_eth network driver.

The problem can be reproduced 100% within very few seconds after system boot.
One can usually still ssh into the machine, but then the first shell command
producing more than a single-line output (like ls -l /etc) makes the kernel
panic like below.

As soon as I downgraded back to linux-image-5.10.0-15-arm64 = 5.10.120-1 the
problem disappeared.  On 5.10.120-1 the network runs very stable.

[   46.451190] Unable to handle kernel paging request at virtual address 
fcf7fe08
[   46.459126] Mem abort info:
[   46.461937]   ESR = 0x9605
[   46.464983]   EC = 0x25: DABT (current EL), IL = 32 bits
[   46.470301]   SET = 0, FnV = 0
[   46.473347]   EA = 0, S1PTW = 0
[   46.476491]   FSC = 0x05: level 1 translation fault
[   46.481373] Data abort info:
[   46.484257]   ISV = 0, ISS = 0x0005
[   46.488095]   CM = 0, WnR = 0
[   46.491067] swapper pgtable: 4k pages, 48-bit VAs, pgdp=8258f000
[   46.497786] [fcf7fe08] pgd=102f78387003, p4d=102f78387003, 
pud=
[   46.506496] Internal error: Oops: 9605 [#1] SMP
[   46.511364] Modules linked in: caam_jr crypto_engine rng_core aes_ce_blk 
aes_ce_cipher ghash_ce dpaa2_caam gf128mul caamhash_desc sha2_ce caamalg_desc 
sha256_arm64 authenc libdes sha1_ce dpaa2_console caam ofpart error lm90 
spi_nor at24 mtd sbsa_gwdt qoriq_thermal evdev layerscape_edac_mod 
qoriq_cpufreq drm fuse configfs ip_tables x_tables autofs4 ext4 crc16 mbcache 
jbd2 crc32c_generic dm_mod dax fsl_dpaa2_ptp fsl_dpaa2_eth xhci_plat_hcd 
xhci_hcd usbcore nvme nvme_core ahci_qoriq t10_pi libahci_platform libahci 
at803x libata fsl_mc_dpio crc64_rocksoft ptp_qoriq crc64 xgmac_mdio pcs_lynx 
acpi_mdio phylink crc_t10dif mdio_devres rtc_pcf2127 ptp of_mdio 
i2c_mux_pca954x crct10dif_generic regmap_spi i2c_mux dwc3 fixed_phy pps_core 
fwnode_mdio scsi_mod udc_core sfp crct10dif_ce sdhci_of_esdhc crct10dif_common 
mdio_i2c roles sdhci_pltfm ulpi scsi_common usb_common libphy sdhci 
spi_nxp_fspi i2c_imx fixed gpio_keys
[   46.591702] CPU: 7 PID: 822 Comm: sshd Not tainted 5.18.0-0.bpo.1-arm64 #1  
Debian 5.18.2-1~bpo11+1
[   46.600736] Hardware name: SolidRun LX2160A Clearfog CX (DT)
[   46.606383] pstate: a005 (NzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   46.613332] pc : kfree+0x78/0x290
[   46.616644] lr : dpaa2_eth_free_tx_fd.isra.0+0x308/0x3b4 [fsl_dpaa2_eth]
[   46.623341] sp : 8aa3b2d0
[   46.626643] x29: 8aa3b2d0 x28: 3e200d37a800 x27: 3e2005045d00
[   46.633769] x26: 0001 x25: 0001 x24: 0002
[   46.640895] x23: b76243dab000 x22: b76239b320e8 x21: faee1740
[   46.648020] x20: 3dff8000 x19: fcf7fe00 x18: 
[   46.655145] x17: 86cc769fc000 x16: b762425450d0 x15: 4000
[   46.662270] x14:  x13: c2008000 x12: 0001
[   46.669395] x11: 0004 x10: 0008 x9 : b76239b320e8
[   46.676520] x8 :  x7 : 000faee2 x6 : 3e2000ce4a00
[   46.683645] x5 : b76243196000 x4 : 0003 x3 : 0009
[   46.690769] x2 :  x1 : 0030 x0 : fc00
[   46.697894] Call trace:
[   46.700328]  kfree+0x78/0x290
[   46.703286]  dpaa2_eth_free_tx_fd.isra.0+0x308/0x3b4 [fsl_dpaa2_eth]
[   46.709631]  dpaa2_eth_tx_conf+0xb0/0x19c [fsl_dpaa2_eth]
[   46.715020]  dpaa2_eth_poll+0xf4/0x3b0 [fsl_dpaa2_eth]
[   46.720149]  __napi_poll+0x40/0x1dc
[   46.723628]  net_rx_action+0x2fc/0x390
[   46.727366]  __do_softirq+0x120/0x348
[   46.731017]  __irq_exit_rcu+0x10c/0x140
[   46.734842]  irq_exit_rcu+0x1c/0x30
[   46.738320]  el1_interrupt+0x38/0x54
[   46.741885]  el1h_64_irq_handler+0x18/0x24
[   46.745970]  el1h_64_irq+0x64/0x68
[   46.749360]  n_tty_poll+0x98/0x1e0
[   46.752752]  tty_poll+0x7c/0x114
[   46.755968]  do_select+0x28c/0x64c
[   46.759361]  core_sys_select+0x238/0x3a0
[   46.763273]  __arm64_sys_pselect6+0x17c/0x280
[   46.767619]  invoke_syscall+0x50/0x120
[   46.771357]  el0_svc_common.constprop.0+0x4c/0xf4
[   46.776051]  do_el0_svc+0x30/0x90
[   46.779354]  el0_svc+0x34/0xd0
[   46.782397]  el0t_64_sync_handler+0x1a4/0x1b0
[   46.786743]  el0t_64_sync+0x18c/0x190
[   46.790396] Code: 8b130293 b25657e0 d34cfe73 8b131813 (f9400660) 
[   46.796478] ---[ end trace  ]---
[   46.801083] Kernel panic - not syncing: Oops: Fatal exception in interrupt
[   46.807945] SMP: stopping secondary CPUs
[   46.811867] Kernel Offset: 0x37623a20 from 0x8800
[   46.817947] PHYS_OFFSET: 0xc2008000
[   46.822116] CPU features: 0x100,4b09,1086
[   

Bug#1001001: linux-image-5.10.0-9-arm64: kernel BUG at include/linux/swapops.h:204!

2022-06-21 Thread Diederik de Haas
Hi,

On Tuesday, 21 June 2022 22:31:45 CEST Paul Gevers wrote:
> On 21-06-2022 22:07, Diederik de Haas wrote:
> 
> > Do these errors still occur? Still with 5.10.103-1 or a later one?
> 
> The last occurrence of a machine hang I had is from 5 May 2022, but I'm 
> not sure if I checked if it was this same issue. Normally our kernels 
> are up-to-date, but I don't recall what we had at the time. We have 
> recommissioned our arm64 hosts, so the install logs are lost by now.

It's good for ci.debian.net that there are such large gaps between failures, 
but it makes debugging a bit harder.
I think that the install logs aren't that important (anymore) as the issue/
symptoms appear to be the same:
- some swap action resulting in some failure
- CPU gets stuck
- watchdog triggers a reboot

How is swap configured on these devices?

> > Is it only on arm64 machines? Or is this just an example which also
> > occurs on other arches?
> 
> I'm pretty sure I haven't seen this on other arches, otherwise I'm sure 
> I would have reported it to this bug.

Yeah, I _assumed_ as much, but assumptions can be dangerous ;-)

Normally I scroll (hard) by the hardware listings as that rarely says anything 
to me. And I did that before too, but just now I made an important discovery.

I *assumed* it was running on arm64 (native) hardware and was about to ask 
specifics about it and then I noticed this:
Host bridge [0600]: Red Hat, Inc. QEMU PCIe Host bridge [1b36:0008]

Qemu. Quite likely unrelated, but a while back I had an issue with qemu in 
building arm64 images: https://bugs.debian.org/988174

I think it would be useful to know which qemu version(s) were used.
(It's unlikely I'll be able to help find the cause/solution, mostly gathering 
hopefully useful bits of information for people who could)

> > If it still occurs, then the likely only way to get a possible resolve is
> > reporting it to upstream.
> 
> 1.5 months is quite long for it to be gone, although, before that it was 
> 2.5 months.

If the issue does occur again, I think it would be useful to bring 'upstream'
into the conversation. They can likely bring much more useful input into this
than (e.g.) I could. Also, if upstream is made aware there is an issue (even an
infrequent one), then they can make the most informed choice what to do with it.

Cheers,
  Diederik



Bug#1001001: linux-image-5.10.0-9-arm64: kernel BUG at include/linux/swapops.h:204!

2022-06-21 Thread Paul Gevers

Hi Diederik,

On 21-06-2022 22:07, Diederik de Haas wrote:
> Do these errors still occur? Still with 5.10.103-1 or a later one?

The last occurrence of a machine hang I had is from 5 May 2022, but I'm 
not sure if I checked if it was this same issue. Normally our kernels 
are up-to-date, but I don't recall what we had at the time. We have 
recommissioned our arm64 hosts, so the install logs are lost by now.

> Is it only on arm64 machines? Or is this just an example which also occurs
> on other arches?

I'm pretty sure I haven't seen this on other arches, otherwise I'm sure 
I would have reported it to this bug.

> If it still occurs, then the likely only way to get a possible resolve is
> reporting it to upstream.

1.5 months is quite long for it to be gone, although, before that it was 
2.5 months.


Paul


OpenPGP_signature
Description: OpenPGP digital signature


Processed: Re: Bug#1001001: linux-image-5.10.0-9-arm64: kernel BUG at include/linux/swapops.h:204!

2022-06-21 Thread Debian Bug Tracking System
Processing control commands:

> found -1 linux/5.10.103-1
Bug #1001001 [src:linux] linux-image-5.10.0-9-arm64: kernel BUG at 
include/linux/swapops.h:204!
Marked as found in versions linux/5.10.103-1.

-- 
1001001: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1001001
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#1001001: linux-image-5.10.0-9-arm64: kernel BUG at include/linux/swapops.h:204!

2022-06-21 Thread Diederik de Haas
Control: found -1 linux/5.10.103-1

Hi Paul,

On Tuesday, 29 March 2022 20:58:59 CEST Paul Gevers wrote:
> On 20-02-2022 13:44, Paul Gevers wrote:
> 
> > Sad to say, but this week we had two hangs again.
> 
> And this week another two.
> 
>  ci-worker-arm64-07 ==
> 
> Mar 26 10:15:55 ci-worker-arm64-07 kernel: kernel BUG at 
> include/linux/swapops.h:204!
> Mar 26 10:15:55 ci-worker-arm64-07 kernel: Internal error: Oops - BUG: 0 
> [#1] SMP
> 
> Linux kernel from before the last point release:
> Linux version 5.10.0-12-arm64 (debian-kernel@lists.debian.org) (gcc-10 
> (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2>
> 
>  ci-worker-arm64-08 ==
> Mar 25 22:13:44 ci-worker-arm64-08 kernel: kernel BUG at 
> include/linux/swapops.h:204!
> Mar 25 22:13:44 ci-worker-arm64-08 kernel: Internal error: Oops - BUG: 0 
> [#1] SMP

Do these errors still occur? Still with 5.10.103-1 or a later one?
Is it only on arm64 machines? Or is this just an example which also occurs
on other arches?
Is it possible to try newer kernel versions from Stable-backports to see
whether the issue occurs there too?

If it still occurs, then the likely only way to get a possible resolve is 
reporting it to upstream. For 'swapops.h' that should be this:

~/dev/kernel.org/linux$ scripts/get_maintainer.pl include/linux/swapops.h
Andrew Morton 
Peter Xu 
David Hildenbrand 
Alistair Popple 
Miaohe Lin 
Naoya Horiguchi 
linux-ker...@vger.kernel.org (open list)

But I'm not sure that's the right list as it is from the include directory,
so the actual problem may be somewhere else.
But I guess it would be a good start?

Cheers,
  Diederik



Bug#1002553: firmware-amd-graphics: Memory clock always at 100% (thinkpads w/ryzen 3XXXu)

2022-06-21 Thread ng
I tried on a separate partition to run Fedora with the firmware 
20220310,  and it works correctly there.  You were right.



Have a nice day.

On 12/06/22 at 8:38, Diederik de Haas wrote:
> Control: tag -1 - moreinfo
> Control: forwarded -1 https://gitlab.freedesktop.org/drm/amd/-/issues/1455
> Control: tag -1 fixed-upstream
>
> On Sunday, 12 June 2022 06:59:12 CEST ng wrote:
> > Hi, I went on and installed the respective image from
> > https://snapshot.debian.org/package/linux/5.10.46-5/#linux-image-5.10.0-8-amd64-unsigned_5.10.46-5
> > and had no luck, no change whatsoever.
>
> Hi!
>
> Thanks for testing, then this is a separate issue.
>
> https://lore.kernel.org/linux-firmware/CADnq5_PYhDcR3tNYgzQ-uz80Nf++oMPsF3=huk+qcgntiy_...@mail.gmail.com/
> is where a fix has been proposed+merged upstream (date: 2022-02-28).
>
> An update of the firmware to version >= 20220310 _should_ fix this issue.
> If such a version becomes available, could you test whether it indeed fixes
> your issue and report your findings back to this bug report?
>
> TIA,
> Diederik




Bug#1013299: linux-image-4.19.0-20-amd64: NULL pointer deref in qdisc_put() due to missing backport

2022-06-21 Thread Thorsten Glaser
Diederik de Haas dixit:

>I'm talking here about 4.9, not 4.19 ...

Ah sorry, I can’t keep them distinguished in my head apparently, or
it’s too hot…

>> $ git tag --contains 92833e8b5db6
>> v4.19.221
>> […]
>
>Thanks for that command :-) I usually went through several manual steps to 
>figure out in which release(s) a certain commit was. This is much quicker!

There is also git branch --contains. HTH ☺
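
To see both commands side by side, here is a minimal sketch that builds a
throwaway repository and asks which tags and branches contain a given commit.
The repository, its tag names (v1.0 to v1.2) and the demo identity are made up
for illustration; in a kernel tree one would pass the real commit ID instead.

```shell
#!/bin/sh
set -eu

# Throwaway repository; nothing here touches a real kernel tree.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical wrapper that supplies a demo identity for the commits below.
g() { git -c user.name=demo -c user.email=demo@example.invalid "$@"; }

g commit -q --allow-empty -m "base"
git tag v1.0
g commit -q --allow-empty -m "the change we are looking for"
commit=$(git rev-parse HEAD)
git tag v1.1
g commit -q --allow-empty -m "later work"
git tag v1.2

# All tags whose history includes $commit, joined onto one line.
tags=$(git tag --contains "$commit" | paste -sd ' ' -)
echo "tags containing $commit: $tags"

# The same question for local branches.
git branch --contains "$commit"
```

The tag created before the commit (v1.0 here) is not listed, which is what
makes the command useful for finding the first release that carries a change.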

>But as said before, I'm going to leave it up to the maintainers on the best 
>way to go about fixing this issue.

Right.

bye,
//mirabilos
-- 
Infrastrukturexperte • tarent solutions GmbH
Am Dickobskreuz 10, D-53121 Bonn • http://www.tarent.de/
Telephon +49 228 54881-393 • Fax: +49 228 54881-235
HRB AG Bonn 5168 • USt-ID (VAT): DE122264941
Geschäftsführer: Dr. Stefan Barth, Kai Ebenrett, Boris Esser, Alexander Steeg



Re: virtio_balloon regression in 5.19-rc3

2022-06-21 Thread Ben Hutchings
On Tue, 2022-06-21 at 17:34 +0800, Jason Wang wrote:
> On Tue, Jun 21, 2022 at 5:24 PM David Hildenbrand  wrote:
> > 
> > On 20.06.22 20:49, Ben Hutchings wrote:
> > > I've tested a 5.19-rc3 kernel on top of QEMU/KVM with machine type
> > > pc-q35-5.2.  It has a virtio balloon device defined in libvirt as:
> > > 
> > > 
> > >> > function="0x0"/>
> > > 
> > > 
> > > but the virtio_balloon driver fails to bind to it:
> > > 
> > > virtio_balloon virtio4: init_vqs: add stat_vq failed
> > > virtio_balloon: probe of virtio4 failed with error -5
> > > 
> > 
> > Hmm, I don't see any recent changes to drivers/virtio/virtio_balloon.c
> > 
> > virtqueue_add_outbuf() fails with -EIO if I'm not wrong. That's the
> > first call of virtqueue_add_outbuf() when virtio_balloon initializes.
> > 
> > 
> > Maybe something in generic virtio code changed?
> 
> Yes, we introduced the IRQ hardening. That could be the root cause and
> we've received lots of reports, so we decided to disable it by default.
> 
> Ben, could you please try this patch: (and make sure
> CONFIG_VIRTIO_HARDEN_NOTIFICATION is not set)
> 
> https://lore.kernel.org/lkml/20220620024158.2505-1-jasow...@redhat.com/T/

Yes, that patch fixes the regression for me.

Ben.

-- 
Ben Hutchings
Any smoothly functioning technology is indistinguishable
from a rigged demo.




Bug#1013299: linux-image-4.19.0-20-amd64: NULL pointer deref in qdisc_put() due to missing backport

2022-06-21 Thread Diederik de Haas
On Tuesday, 21 June 2022 15:34:12 CEST Thorsten Glaser wrote:
> >In branch 'linux-4.9.y' there is no qdisc_put function, so the above check
> >seems rightly in qdisc_destroy there.

I'm talking here about 4.9, not 4.19 ...

> Not any more. Since…
> 
> $ git tag --contains 92833e8b5db6
> v4.19.221
> […]

Thanks for that command :-) I usually went through several manual steps to 
figure out in which release(s) a certain commit was. This is much quicker!

> … qdisc_destroy was renamed to qdisc_put in 4.19, breaking modules (grr).

And yes, it is broken in the 4.19 series since 4.19.221
(And not in 5.10 or upstream 'master')

> So yes, this needs to also be fixed upstream (hence me including that tag
> when reporbugging), but perhaps Debian can quickfix.

What I have observed so far is that a commit needs to be accepted upstream 
(but doesn't have to have gone through the whole 'chain of command') before a 
temporary patch is accepted to quickly fix it in Debian.
But as said before, I'm going to leave it up to the maintainers on the best 
way to go about fixing this issue.

Cheers,
  Diederik



Bug#1013299: linux-image-4.19.0-20-amd64: NULL pointer deref in qdisc_put() due to missing backport

2022-06-21 Thread Thorsten Glaser
Diederik de Haas dixit:

>In branch 'linux-4.9.y' there is no qdisc_put function, so the above check 
>seems rightly in qdisc_destroy there.

Not any more. Since…

$ git tag --contains 92833e8b5db6
v4.19.221
[…]

… qdisc_destroy was renamed to qdisc_put in 4.19, breaking modules (grr).

So yes, this needs to also be fixed upstream (hence me including that tag
when reporbugging), but perhaps Debian can quickfix.

bye,
//mirabilos
-- 
16:47⎜«mika:#grml» .oO(mira ist einfach gut)  23:22⎜«mikap:#grml»
mirabilos: und dein bootloader ist geil :)23:29⎜«mikap:#grml» und ich
finds saugeil dass ich ein bsd zum booten mit grml hab, das muss ich dann
gleich mal auf usb-stick installieren   -- Michael Prokop über MirOS bsd4grml



Processed: Re: Bug#1013299: linux-image-4.19.0-20-amd64: NULL pointer deref in qdisc_put() due to missing backport

2022-06-21 Thread Debian Bug Tracking System
Processing control commands:

> tag -1 help
Bug #1013299 [src:linux] linux-image-4.19.0-20-amd64: NULL pointer deref in 
qdisc_put() due to missing backport
Added tag(s) help.

-- 
1013299: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1013299
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#1013299: linux-image-4.19.0-20-amd64: NULL pointer deref in qdisc_put() due to missing backport

2022-06-21 Thread Diederik de Haas
Control: tag -1 help

On Tuesday, 21 June 2022 13:35:09 CEST Thorsten Glaser wrote:
> >So I'm inclined to think that 92833e8b5db6c209e9311ac8c6a44d3bf1856659 is
> >the commit which brought the bug back.
> 
> Yes, definitely. The lines…
> 
> if (!qdisc)
> return;
> 
> … from near the beginning of the now-static qdisc_destroy must
> be moved to the beginning of the new qdisc_put function.

Agreed. That would make it in line with 'master' and 'linux-5.10.y'.
In branch 'linux-4.9.y' there is no qdisc_put function, so the above check 
seems rightly in qdisc_destroy there.

This should be reported upstream, but I don't know what the best or most
appropriate way to do that is, so I'm leaving that to the actual Debian
maintainers, hence the 'help' tag.



Processed: Re: Bug#1013299: linux-image-4.19.0-20-amd64: NULL pointer deref in qdisc_put() due to missing backport

2022-06-21 Thread Debian Bug Tracking System
Processing control commands:

> tag -1 - moreinfo
Bug #1013299 [src:linux] linux-image-4.19.0-20-amd64: NULL pointer deref in 
qdisc_put() due to missing backport
Removed tag(s) moreinfo.

-- 
1013299: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1013299
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#1013299: linux-image-4.19.0-20-amd64: NULL pointer deref in qdisc_put() due to missing backport

2022-06-21 Thread Thorsten Glaser
Control: tag -1 - moreinfo

Diederik de Haas dixit:

>In Debian, the release before 4.19.235-1 was 4.19.232-1 which should also have
>this bug. The release before that was 4.19.208-1, which shouldn't.
>Can you verify that?

Not easily any more, but I know it worked some weeks ago, and I *think*
I particularly remember 208 as working. But I do have a clone of linux
on another box and so I can look at ↓

>So I'm inclined to think that 92833e8b5db6c209e9311ac8c6a44d3bf1856659 is
>the commit which brought the bug back.

Yes, definitely. The lines…

if (!qdisc)
return;

… from near the beginning of the now-static qdisc_destroy must
be moved to the beginning of the new qdisc_put function.

bye,
//mirabilos
-- 
Infrastrukturexperte • tarent solutions GmbH
Am Dickobskreuz 10, D-53121 Bonn • http://www.tarent.de/
Telephon +49 228 54881-393 • Fax: +49 228 54881-235
HRB AG Bonn 5168 • USt-ID (VAT): DE122264941
Geschäftsführer: Dr. Stefan Barth, Kai Ebenrett, Boris Esser, Alexander Steeg



Bug#1013299: linux-image-4.19.0-20-amd64: NULL pointer deref in qdisc_put() due to missing backport

2022-06-21 Thread Thorsten Glaser
Diederik de Haas dixit:

>It's a bit 'above my paygrade', but if qdisc_put() can accept a NULL pointer
>then I'm curious whether that would be allowed for other functions in that file
>as well ... there are several having "struct Qdisc *qdisc" as (only)
>parameter, but only qdisc_put() checks for NULL.
>That is also true for the current 'master' branch ...

AIUI the check was added because qdisc_destroy() could accept one,
and several qdiscs rely on that; it’s like free(3) in that regard,
whereas the other functions aren’t.

bye,
//mirabilos
-- 
Infrastrukturexperte • tarent solutions GmbH
Am Dickobskreuz 10, D-53121 Bonn • http://www.tarent.de/
Telephon +49 228 54881-393 • Fax: +49 228 54881-235
HRB AG Bonn 5168 • USt-ID (VAT): DE122264941
Geschäftsführer: Dr. Stefan Barth, Kai Ebenrett, Boris Esser, Alexander Steeg



Re: virtio_balloon regression in 5.19-rc3

2022-06-21 Thread Jason Wang
On Tue, Jun 21, 2022 at 5:24 PM David Hildenbrand  wrote:
>
> On 20.06.22 20:49, Ben Hutchings wrote:
> > I've tested a 5.19-rc3 kernel on top of QEMU/KVM with machine type
> > pc-q35-5.2.  It has a virtio balloon device defined in libvirt as:
> >
> > 
> >> function="0x0"/>
> > 
> >
> > but the virtio_balloon driver fails to bind to it:
> >
> > virtio_balloon virtio4: init_vqs: add stat_vq failed
> > virtio_balloon: probe of virtio4 failed with error -5
> >
>
> Hmm, I don't see any recent changes to drivers/virtio/virtio_balloon.c
>
> virtqueue_add_outbuf() fails with -EIO if I'm not wrong. That's the
> first call of virtqueue_add_outbuf() when virtio_balloon initializes.
>
>
> Maybe something in generic virtio code changed?

Yes, we introduced the IRQ hardening. That could be the root cause and
we've received lots of reports, so we decided to disable it by default.

Ben, could you please try this patch: (and make sure
CONFIG_VIRTIO_HARDEN_NOTIFICATION is not set)

https://lore.kernel.org/lkml/20220620024158.2505-1-jasow...@redhat.com/T/
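
One way to confirm that the option is off is to look at the kernel config
file, where a disabled option shows up as a "# CONFIG_... is not set" comment.
The sketch below is only an illustration: config_is_set is a hypothetical
helper, and the sample file stands in for a real config such as
/boot/config-$(uname -r).

```shell
#!/bin/sh
set -eu

# Hypothetical helper: succeeds if option $1 is built in (=y) or modular (=m)
# in the kernel config file $2.
config_is_set() {
    grep -Eq "^$1=[ym]\$" "$2"
}

# Sample config standing in for a real one.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
CONFIG_VIRTIO_BALLOON=m
# CONFIG_VIRTIO_HARDEN_NOTIFICATION is not set
EOF

for opt in CONFIG_VIRTIO_BALLOON CONFIG_VIRTIO_HARDEN_NOTIFICATION; do
    if config_is_set "$opt" "$cfg"; then
        echo "$opt is set"
    else
        echo "$opt is not set"
    fi
done
```

Matching on "=y" or "=m" rather than on the option name alone avoids counting
the "is not set" comment line as a hit.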

Thanks

>
> --
> Thanks,
>
> David / dhildenb
>



Re: virtio_balloon regression in 5.19-rc3

2022-06-21 Thread David Hildenbrand
On 20.06.22 20:49, Ben Hutchings wrote:
> I've tested a 5.19-rc3 kernel on top of QEMU/KVM with machine type
> pc-q35-5.2.  It has a virtio balloon device defined in libvirt as:
> 
> 
>function="0x0"/>
> 
> 
> but the virtio_balloon driver fails to bind to it:
> 
> virtio_balloon virtio4: init_vqs: add stat_vq failed
> virtio_balloon: probe of virtio4 failed with error -5
> 

Hmm, I don't see any recent changes to drivers/virtio/virtio_balloon.c

virtqueue_add_outbuf() fails with -EIO if I'm not wrong. That's the
first call of virtqueue_add_outbuf() when virtio_balloon initializes.


Maybe something in generic virtio code changed?

-- 
Thanks,

David / dhildenb



Re: virtio_balloon regression in 5.19-rc3

2022-06-21 Thread Thorsten Leemhuis
[TLDR: I'm adding this regression report to the list of tracked
regressions; all text from me you find below is based on a few template
paragraphs you might have encountered already in similar form.]

On 20.06.22 20:49, Ben Hutchings wrote:
> I've tested a 5.19-rc3 kernel on top of QEMU/KVM with machine type
> pc-q35-5.2.  It has a virtio balloon device defined in libvirt as:
> 
> 
>function="0x0"/>
> 
> 
> but the virtio_balloon driver fails to bind to it:
> 
> virtio_balloon virtio4: init_vqs: add stat_vq failed
> virtio_balloon: probe of virtio4 failed with error -5
> 
> On a 5.18 kernel with similar configuration, it binds successfully.
> 
> I've attached the kernel config for 5.19-rc3.

CCing the regression mailing list, as it should be in the loop for all
regressions, as explained here:
https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html

Thanks for the report. To be sure below issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
tracking bot:

#regzbot ^introduced v5.18..v5.19-rc3
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply -- ideally with also
telling regzbot about it, as explained here:
https://linux-regtracking.leemhuis.info/tracked-regression/

Reminder for developers: When fixing the issue, add 'Link:' tags
pointing to the report (the mail this one replies to), as explained in
the Linux kernel's documentation; the above webpage explains why this is
important for tracked regressions.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.



Bug#1013299: linux-image-4.19.0-20-amd64: NULL pointer deref in qdisc_put() due to missing backport

2022-06-21 Thread Diederik de Haas
On Tuesday, 21 June 2022 11:49:26 CEST you wrote:
> https://lore.kernel.org/all/20190921063607.ga1083...@kroah.com/ is about the
> 4.19.75 release and that contains that change in commit
> 7a1bad565cebfbf6956f9bb36dba734a48fa31d4 titled "net_sched: let qdisc_put()
> accept NULL pointer" which actually modifies the qdisc_destroy function
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/net/sched/sch_generic.c?h=linux-4.19.y#n1004
> OTOH, does indeed not have that NULL check.

It's a bit 'above my paygrade', but if qdisc_put() can accept a NULL pointer 
then I'm curious whether that would be allowed for other functions in that file 
as well ... there are several having "struct Qdisc *qdisc" as (only) 
parameter, but only qdisc_put() checks for NULL.
That is also true for the current 'master' branch ...



Processed: Re: Bug#1013299: linux-image-4.19.0-20-amd64: NULL pointer deref in qdisc_put() due to missing backport

2022-06-21 Thread Debian Bug Tracking System
Processing control commands:

> tag -1 moreinfo
Bug #1013299 [src:linux] linux-image-4.19.0-20-amd64: NULL pointer deref in 
qdisc_put() due to missing backport
Added tag(s) moreinfo.

-- 
1013299: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1013299
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#1013299: linux-image-4.19.0-20-amd64: NULL pointer deref in qdisc_put() due to missing backport

2022-06-21 Thread Diederik de Haas
Control: tag -1 moreinfo

On Tuesday, 21 June 2022 08:10:54 CEST Thorsten Glaser wrote:
> Package: src:linux
> Version: 4.19.235-1
> Severity: critical
> Tags: upstream
> Justification: breaks the whole system
> 
> A recent upstream “stable” upgrade backported the removal of the
> qdisc_destroy() function (which, in itself, is questionable enough
> already and caused no small amount of fun) using qdisc_put() instead.
> 
> However, qdisc_put() does not accept NULL pointers, causing oopses
> in several qdiscs that can be configured on a system. This breaks
> sudo (su works), networking and even deconfiguration is not possible,
> only /proc/sysrq-trigger makes it possible to recover.
> 
> https://www.mail-archive.com/netdev@vger.kernel.org/msg314288.html
> fixes this but was not backported along.

https://lore.kernel.org/all/20190921063607.ga1083...@kroah.com/ is about the 
4.19.75 release and that contains that change in commit 
7a1bad565cebfbf6956f9bb36dba734a48fa31d4 titled "net_sched: let qdisc_put() 
accept NULL pointer" which actually modifies the qdisc_destroy function

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/net/sched/sch_generic.c?h=linux-4.19.y#n1004
OTOH, does indeed not have that NULL check.

Commit 92833e8b5db6c209e9311ac8c6a44d3bf1856659, part of v4.19.221, titled
"net: sched: rename qdisc_destroy() to qdisc_put()", wasn't actually a straight
rename; some code was moved into a new qdisc_put function,
which doesn't have the NULL check.

In Debian, the release before 4.19.235-1 was 4.19.232-1 which should also have
this bug. The release before that was 4.19.208-1, which shouldn't.
Can you verify that?

So I'm inclined to think that 92833e8b5db6c209e9311ac8c6a44d3bf1856659 is
the commit which brought the bug back.



Processed: Fixing fixed version

2022-06-21 Thread Debian Bug Tracking System
Processing commands for cont...@bugs.debian.org:

> notfixed 918097 0.141
Bug #918097 {Done: Diederik de Haas } 
[initramfs-tools-core] mkinitramfs prints error messages when no modules are 
needed in the initramfs
No longer marked as fixed in versions 0.141.
> fixed 918097 initramfs-tools/0.141
Bug #918097 {Done: Diederik de Haas } 
[initramfs-tools-core] mkinitramfs prints error messages when no modules are 
needed in the initramfs
Marked as fixed in versions initramfs-tools/0.141.
>
End of message, stopping processing here.

Please contact me if you need assistance.
-- 
918097: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=918097
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#918097: marked as done (mkinitramfs prints error messages when no modules are needed in the initramfs)

2022-06-21 Thread Debian Bug Tracking System
Your message dated Tue, 21 Jun 2022 10:54:26 +0200
with message-id <12000826.O9o76ZdvQC@bagend>
and subject line Re: Bug#918097: initramfs-tools-core: Error while building 
DKMS modules when kernel has all its modules built in
has caused the Debian Bug report #918097,
regarding mkinitramfs prints error messages when no modules are needed in the 
initramfs
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact ow...@bugs.debian.org
immediately.)


-- 
918097: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=918097
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems
--- Begin Message ---
Package: initramfs-tools-core
Version: 0.132
Severity: normal

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Dear Maintainer,

I wanted to build a custom kernel using the linux-source package from the
Debian repository. I haven't changed much, the only thing I wanted to do is to
build in all the modules my machine needs and remove all the others. Such
kernels don't have modules in *.ko files and it looks like mkinitramfs has some
issues with that.

Basically, the problem is in the /usr/share/initramfs-tools/hook-functions
script file with the following line:

find "${DESTDIR}/lib/modules/${version}/kernel" -name '*.ko*'

Since none of the modules listed above in the file were copied to the temp
destination folder, the kernel/ subdir doesn't exist and hence the above
command returns an error. The initramfs image builds fine, but DKMS packages
return something similar to the following:

# dpkg --configure -a
Setting up sysdig-dkms (0.24.1-1) ...
Removing old sysdig-0.24.1 DKMS files...

-  Uninstall Beginning 
Module:  sysdig
Version: 0.24.1
Kernel:  4.19.13-amd64-morficzny (x86_64)
- -

Status: Before uninstall, this module version was ACTIVE on this kernel.

sysdig-probe.ko:
 - Uninstallation
   - Deleting from: /lib/modules/4.19.13-amd64-morficzny/updates/dkms/
 - Original module
   - No original module was found for this module on this kernel.
   - Use the dkms install command to reinstall any previous module version.

depmod...

DKMS: uninstall completed.

- --
Deleting module version: 0.24.1
completely from the DKMS tree.
- --
Done.
Loading new sysdig-0.24.1 DKMS files...
Building for 4.19.13-amd64-morficzny 4.20.0-amd64-morficzny
Building initial module for 4.19.13-amd64-morficzny
Error! Bad return status for module build on kernel: 4.19.13-amd64-morficzny (x86_64)
Consult /var/lib/dkms/sysdig/0.24.1/build/make.log for more information.
dpkg: error processing package sysdig-dkms (--configure):
 installed sysdig-dkms package post-installation script subprocess returned error exit status 10
Errors were encountered while processing:
 sysdig-dkms
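
The failure mode described above (running find on a kernel/ directory that was never created) can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the change that went into initramfs-tools: DESTDIR and version are set to hypothetical values here, and list_modules is a made-up helper name.

```shell
#!/bin/sh
# Sketch only: DESTDIR and version are hypothetical stand-ins for the
# variables used in hook-functions; list_modules is not an upstream helper.
DESTDIR=$(mktemp -d)
version="0.0.0-example"
moddir="${DESTDIR}/lib/modules/${version}/kernel"

# List module files under a directory, treating a missing directory
# as "no modules" instead of letting find fail.
list_modules() {
    if [ -d "$1" ]; then
        find "$1" -name '*.ko*'
    fi
}

# Case 1: kernel/ does not exist (all needed drivers are built in).
unguarded_status=0
find "$moddir" -name '*.ko*' >/dev/null 2>&1 || unguarded_status=$?
guarded_status=0
list_modules "$moddir" >/dev/null || guarded_status=$?

# Case 2: kernel/ exists and holds one (empty, fake) module file.
mkdir -p "$moddir/crypto"
: > "$moddir/crypto/xor.ko"
found=$(list_modules "$moddir")

echo "unguarded=$unguarded_status guarded=$guarded_status found=$found"
rm -rf "$DESTDIR"
```

The unguarded find exits non-zero when the directory is missing, while the guarded helper simply reports nothing, which is the behaviour one would want when no modules are needed in the initramfs.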

I changed my kernel config a little bit:

# egrep \=m /boot/config-4.19.13-amd64-morficzny
CONFIG_BTRFS_FS=m
CONFIG_XOR_BLOCKS=m
CONFIG_RAID6_PQ=m
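
As a side note, an anchored pattern is a slightly stricter way to run the same check, since a bare "=m" can also match inside option values. A minimal sketch follows; the sample config file is made up for illustration, reusing the three option names above plus two hypothetical non-modular entries.

```shell
#!/bin/sh
# Sketch: count options built as modules in a kernel config file.
# The file contents below are an illustrative sample, not a real config.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
CONFIG_EXT4_FS=y
CONFIG_BTRFS_FS=m
CONFIG_XOR_BLOCKS=m
CONFIG_RAID6_PQ=m
# CONFIG_EXAMPLE_OPTION is not set
CONFIG_EXAMPLE_STRING="foo=m"
EOF

# Match only lines of the exact form CONFIG_NAME=m.
modular=$(grep -c -E '^CONFIG_[A-Za-z0-9_]+=m$' "$cfg")
echo "modular options: $modular"
rm -f "$cfg"
```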

After rebuilding the kernel, when I create the initramfs image I
can see the kernel/ subdir:

# tree /var/tmp/mkinitramfs_p7rDVj/usr/lib/modules/4.19.13-amd64-morficzny/kernel/
/var/tmp/mkinitramfs_p7rDVj/usr/lib/modules/4.19.13-amd64-morficzny/kernel/
├── crypto
│   └── xor.ko
├── fs
│   └── btrfs
│       └── btrfs.ko
└── lib
    └── raid6
        └── raid6_pq.ko

5 directories, 3 files

Will this qualify as bug or should I fix this in some way myself since it's not
the Debian distribution's kernel?



- -- System Information:
Debian Release: buster/sid
  APT prefers unstable
  APT policy: (990, 'unstable'), (130, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 4.19.13-amd64-morficzny (SMP w/2 CPU cores; PREEMPT)
Locale: LANG=en_US.utf8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8),
LANGUAGE=en_US.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages initramfs-tools-core depends on:
ii  coreutils    8.30-1
ii  cpio         2.12+dfsg-6
ii  e2fsprogs    1.44.5-1
ii  klibc-utils  2.0.4-14
ii  kmod         25-2
ii  udev         240-2

Versions of packages initramfs-tools-core recommends:
ii  busybox  1:1.27.2-3

Versions of packages initramfs-tools-core suggests:
ii  bash-completion  1:2.8-5




-----BEGIN PGP SIGNATURE-----

iQIzBAEBCgAdFiEE5JPPWm5C7TFDUMqpzQRoEHcbZSAFAlwt2AAACgkQzQRoEHcb
ZSDrjg//X6cMtoFjEw3BOAg+1EpKTQ1/yfJSAW611NgsgfvvPurncPx5X3XtQ0sH
/I47mXSB4ds3/KuGIJb96SXgOChBqvfdS/7ajirGL11Ou4Ujfmb9CC0sOpUe3uba
OKNdB+QiwHMDlfKdvDMk0LXN4i7fx9hCUukAkhE1s9xbqCk+veTmHpnv5BomIoX/
AqfdmeN2Y08MDdd5K8S/g1/4fVg9QUSE8+9dP9r1Bw7nqdp4EpSj7ZSKtifSayRb

Bug#1013299: linux-image-4.19.0-20-amd64: NULL pointer deref in qdisc_put() due to missing backport

2022-06-21 Thread Thorsten Glaser
Package: src:linux
Version: 4.19.235-1
Severity: critical
Tags: upstream
Justification: breaks the whole system

A recent upstream “stable” upgrade backported the removal of the
qdisc_destroy() function (which, in itself, is questionable enough
already and caused no small amount of fun) using qdisc_put() instead.

However, qdisc_put() does not accept NULL pointers, causing oopses
in several qdiscs that can be configured on a system. This breaks
sudo (su works) and networking, and even deconfiguration is not possible;
only /proc/sysrq-trigger makes it possible to recover.

https://www.mail-archive.com/netdev@vger.kernel.org/msg314288.html
fixes this but was not backported along.


-- Package-specific info:
** Version:
Linux version 4.19.0-20-amd64 (debian-kernel@lists.debian.org) (gcc version 
8.3.0 (Debian 8.3.0-6)) #1 SMP Debian 4.19.235-1 (2022-03-17)

** Command line:
BOOT_IMAGE=/boot/vmlinuz-4.19.0-20-amd64 root=/dev/vda2 ro net.ifnames=0 
nomodeset

** Not tainted

** Kernel log:
Unable to read kernel log; any relevant messages should be attached

** Model information
sys_vendor: QEMU
product_name: Standard PC (i440FX + PIIX, 1996)
product_version: pc-i440fx-2.8
chassis_vendor: QEMU
chassis_version: pc-i440fx-2.8
bios_vendor: SeaBIOS
bios_version: 1.14.0-2

** Loaded modules:
ipt_MASQUERADE
nf_conntrack_netlink
xfrm_user
xfrm_algo
nft_counter
nft_chain_nat_ipv4
nf_nat_ipv4
xt_addrtype
nft_compat
xt_conntrack
x_tables
nf_nat
nf_conntrack
nf_defrag_ipv6
nf_defrag_ipv4
libcrc32c
br_netfilter
bridge
stp
llc
nf_tables
devlink
nfnetlink
overlay
nfsd
auth_rpcgss
nfs_acl
nfs
lockd
grace
fscache
sunrpc
loop
kvm_intel
ttm
kvm
drm_kms_helper
irqbypass
virtio_rng
joydev
drm
evdev
rng_core
virtio_balloon
serio_raw
pcspkr
button
qemu_fw_cfg
ext4
crc16
mbcache
jbd2
crc32c_generic
fscrypto
ecb
crypto_simd
cryptd
glue_helper
aes_x86_64
hid_generic
usbhid
hid
virtio_net
net_failover
failover
virtio_blk
ata_generic
ata_piix
uhci_hcd
libata
ehci_hcd
usbcore
psmouse
usb_common
virtio_pci
virtio_ring
i2c_piix4
crc32c_intel
scsi_mod
virtio
floppy

** PCI devices:
00:00.0 Host bridge [0600]: Intel Corporation 440FX - 82441FX PMC [Natoma] 
[8086:1237] (rev 02)
Subsystem: Red Hat, Inc Qemu virtual machine [1af4:1100]
Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR+ FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- 
SERR- TAbort- 
SERR- TAbort- SERR- TAbort- 
SERR- TAbort- SERR- TAbort- SERR- 
Kernel driver in use: virtio-pci
Kernel modules: virtio_pci

00:04.0 SCSI storage controller [0100]: Red Hat, Inc Virtio block device 
[1af4:1001]
Subsystem: Red Hat, Inc Virtio block device [1af4:0002]
Physical Slot: 4
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- 
Kernel driver in use: virtio-pci
Kernel modules: virtio_pci

00:05.0 Unclassified device [00ff]: Red Hat, Inc Virtio memory balloon 
[1af4:1002]
Subsystem: Red Hat, Inc Virtio memory balloon [1af4:0005]
Physical Slot: 5
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- 
Kernel driver in use: virtio-pci
Kernel modules: virtio_pci

00:06.0 Unclassified device [00ff]: Red Hat, Inc Virtio RNG [1af4:1005]
Subsystem: Red Hat, Inc Virtio RNG [1af4:0004]
Physical Slot: 6
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- 
Kernel driver in use: virtio-pci
Kernel modules: virtio_pci

00:07.0 Ethernet controller [0200]: Red Hat, Inc Virtio network device 
[1af4:1000]
Subsystem: Red Hat, Inc Virtio network device [1af4:0001]
Physical Slot: 7
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- 
Kernel driver in use: virtio-pci
Kernel modules: virtio_pci


** USB devices:
not available


-- System Information:
Debian Release: 10.12
  APT prefers oldstable-updates
  APT policy: (500, 'oldstable-updates'), (500, 'oldstable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.19.0-20-amd64 (SMP w/3 CPU cores)
Locale: LANG=C.UTF-8, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE=C.UTF-8 
(charmap=UTF-8)
Shell: /bin/sh linked to /bin/lksh
Init: sysvinit (via /sbin/init)

Versions of packages linux-image-4.19.0-20-amd64 depends on:
ii  initramfs-tools [linux-initramfs-tool]  0.133+deb10u1
ii  kmod                                    26-1
ii  linux-base                              4.6

Versions of packages