Bug#1029968: Info received (Bug#1029968: Info received (Bug#1029968: Acknowledgement (bttv/v4l: WARNING: CPU: 6 PID: 6164 at mm/vmalloc.c:487 __vmap_pages_range_noflush+0x3e0/0x4d0)))

2023-02-01 Thread Dr. David Alan Gilbert
Confirmed: this still happens on upstream 6.2.0-rc6.
-- 
 -Open up your eyes, open up your mind, open up your code ---   
/ Dr. David Alan Gilbert|   Running GNU/Linux   | Happy  \ 
\dave @ treblig.org |   | In Hex /
 \ _|_ http://www.treblig.org   |___/



Bug#1029968: Acknowledgement (bttv/v4l: WARNING: CPU: 6 PID: 6164 at mm/vmalloc.c:487 __vmap_pages_range_noflush+0x3e0/0x4d0)

2023-02-01 Thread Dr. David Alan Gilbert
* Diederik de Haas (didi.deb...@cknow.org) wrote:

> Thanks for that thorough analysis!

Thanks for the reply,

> If you're 'penguin42' on IRC,

Yep, that's me.

> then I'd suggest presenting your findings to
> io...@lists.linux.dev, as both the author and the reviewer are very likely
> subscribed to that list.
> 
> scripts/get_maintainer.pl drivers/iommu/dma-iommu.c
> scripts/get_maintainer.pl kernel/dma/Makefile
> 
> Both commands list them, and both results also include that mailing list.

Yep, will do; I'm just going to try a 6.2-rc as well in case it got
fixed very recently, and have a poke about to see whether I can spot
any obvious cause now that I know the change that triggered it.
I'll include the linux-media list as well, since it's just as likely
that the fault is in the v4l/bttv driver.

Dave

> HTH





Bug#1029968: Acknowledgement (bttv/v4l: WARNING: CPU: 6 PID: 6164 at mm/vmalloc.c:487 __vmap_pages_range_noflush+0x3e0/0x4d0)

2023-01-31 Thread Diederik de Haas
On Wednesday, 1 February 2023 02:52:13 CET Dr. David Alan Gilbert wrote:
> bisected:
> GOOD [37fcacb50be7071d146144a6c5c5bf0194b9a1cf] phy: PHY_FSL_LYNX_28G should depend on ARCH_LAYERSCAPE
> BAD [f5ff79fddf0efecca538046b5cc20fb3ded2ec4f] dma-mapping: remove CONFIG_DMA_REMAP
> GOOD [e62c17f0455a74b182ce6373e2777817256afaa1] MAINTAINERS: update maintainer list of DMA MAPPING BENCHMARK
> GOOD [0fb3436b4b36cf69f4544385aa2bb8c5a4913509] sparc: Remove usage of the deprecated "pci-dma-compat.h" API
> GOOD [fba09099c6e506608e05e08ac717bf34501f821b] media: v4l2-pci-skeleton: Remove usage of the deprecated "pci-dma-compat.h" API
> 
> dg@major:~/kernel/kernel-clone$ git bisect good
> f5ff79fddf0efecca538046b5cc20fb3ded2ec4f is the first bad commit
> commit f5ff79fddf0efecca538046b5cc20fb3ded2ec4f
> Author: Christoph Hellwig 
> Date:   Sat Feb 26 16:40:21 2022 +0100
> 
> dma-mapping: remove CONFIG_DMA_REMAP
> 
> That sounds like a believable cause given that it's IOMMU related
> and device related.

Thanks for that thorough analysis!
If you're 'penguin42' on IRC, then I'd suggest presenting your findings to
io...@lists.linux.dev, as both the author and the reviewer are very likely
subscribed to that list.

scripts/get_maintainer.pl drivers/iommu/dma-iommu.c
scripts/get_maintainer.pl kernel/dma/Makefile

Both commands list them, and both results also include that mailing list.

HTH



Bug#1029968: Info received (Bug#1029968: Acknowledgement (bttv/v4l: WARNING: CPU: 6 PID: 6164 at mm/vmalloc.c:487 __vmap_pages_range_noflush+0x3e0/0x4d0))

2023-01-31 Thread Dr. David Alan Gilbert
Note that the oops at this bisect point is messier than on the newer
kernels: the newer kernels hit a WARN in __vmap_pages_range_noflush, whereas
at this point it slams into a BUG, but the rest of the backtrace is similar:

[   78.988024] BUG: unable to handle page fault for address: bd7fc110
[   78.988033] #PF: supervisor write access in kernel mode
[   78.988036] #PF: error_code(0x000b) - reserved bit violation
[   78.988038] PGD 10067 P4D 10067 PUD 1001a6067 PMD 22b791067 PTE 8000270800cb9063
[   78.988046] Oops: 000b [#1] PREEMPT SMP PTI
[   78.988050] CPU: 7 PID: 879 Comm: cat Tainted: G  I   5.17.0-rc1dg+ #20
[   78.988054] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./P55M Pro, BIOS P1.50 09/10/2009
[   78.988056] RIP: 0010:__memset+0x24/0x30
[   78.988063] Code: cc cc cc cc cc cc 0f 1f 44 00 00 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6  48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3
[   78.988067] RSP: 0018:bd7fc0cc7d50 EFLAGS: 00010206
[   78.988071] RAX:  RBX: 9679eb788a88 RCX: 2000
[   78.988073] RDX:  RSI:  RDI: bd7fc110
[   78.988076] RBP: 8000 R08:  R09: bd7fc110
[   78.988079] R10: 0001 R11: bd7fc110 R12: 0010
[   78.988081] R13: 0010 R14: 9679ec8e4130 R15: 0010
[   78.988084] FS:  7f91fb65a740() GS:9679efdc() knlGS:
[   78.988087] CS:  0010 DS:  ES:  CR0: 80050033
[   78.988090] CR2: bd7fc110 CR3: 00022ce02000 CR4: 06e0
[   78.988093] Call Trace:
[   78.988096]  
[   78.988100]  __videobuf_iolock+0x5cd/0x659 [videobuf_dma_sg]
[   78.988110]  vbi_buffer_prepare+0x1aa/0x2b0 [bttv]
[   78.988125]  __videobuf_read_start+0xb9/0x1d0 [videobuf_core]
[   78.988133]  videobuf_read_stream+0x2cb/0x330 [videobuf_core]
[   78.988140]  bttv_read+0xc5/0x1d0 [bttv]
[   78.988151]  v4l2_read+0x6b/0x80 [videodev]
[   78.988169]  vfs_read+0x97/0x190
[   78.988175]  ksys_read+0x63/0xe0
[   78.988179]  do_syscall_64+0x3a/0x80
[   78.988185]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   78.988191] RIP: 0033:0x7f91fb7550ed
[   78.988194] Code: 31 c0 e9 c6 fe ff ff 50 48 8d 3d f6 54 0a 00 e8 39 fe 01 00 66 0f 1f 84 00 00 00 00 00 80 3d f1 24 0e 00 00 74 17 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 5b c3 66 2e 0f 1f 84 00 00 00 00 00 48 83 ec
[   78.988198] RSP: 002b:7ffdad058348 EFLAGS: 0246 ORIG_RAX: 
[   78.988202] RAX: ffda RBX: 0002 RCX: 7f91fb7550ed
[   78.988204] RDX: 0002 RSI: 7f91fb35 RDI: 0003
[   78.988207] RBP: 0002 R08:  R09: 
[   78.988209] R10: 7f91fb66cb40 R11: 0246 R12: 7f91fb35
[   78.988212] R13: 0003 R14: 0002 R15: 
[   78.988216]  
[   78.988218] Modules linked in: rfkill qrtr tuner_simple tuner_types 
intel_powerclamp coretemp tuner snd_hda_codec_via tda7432 kvm_intel 
snd_hda_codec_generic snd_hda_codec_hdmi tvaudio ledtrig_audio msp3400 
snd_hda_intel kvm snd_intel_dspcfg bttv snd_intel_sdw_acpi snd_hda_codec 
irqbypass tea575x snd_hda_core tveeprom videobuf_dma_sg intel_cstate 
videobuf_core videodev snd_hwdep snd_bt87x intel_uncore snd_pcm serio_raw 
pcspkr iTCO_wdt i7core_edac mc joydev intel_pmc_bxt snd_timer 
iTCO_vendor_support snd watchdog soundcore sg evdev acpi_cpufreq firewire_sbp2 
fuse msr configfs efi_pstore ip_tables x_tables autofs4 ext4 crc32c_generic 
crc16 mbcache jbd2 nouveau sd_mod t10_pi crc_t10dif crct10dif_generic 
ata_generic crct10dif_common hid_generic usbhid hid mxm_wmi wmi video 
i2c_algo_bit drm_ttm_helper ttm drm_kms_helper cec rc_core drm ahci libahci 
pata_via libata ehci_pci ehci_hcd psmouse r8169 usbcore scsi_mod crc32c_intel 
realtek i2c_i801 mdio_devres i2c_smbus libphy firewire_ohci
[   78.988301]  firewire_core lpc_ich usb_common scsi_common crc_itu_t button
[   78.988311] CR2: bd7fc110
[   78.988314] ---[ end trace  ]---
[   78.988316] RIP: 0010:__memset+0x24/0x30
[   78.988320] Code: cc cc cc cc cc cc 0f 1f 44 00 00 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6  48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3
[   78.988324] RSP: 0018:bd7fc0cc7d50 EFLAGS: 00010206
[   78.988327] RAX:  RBX: 9679eb788a88 RCX: 2000
[   78.988329] RDX:  RSI:  RDI: bd7fc110
[   78.988332] RBP: 8000 R08:  R09: bd7fc110
[   78.988334] R10: 0001 R11: bd7fc110 R12: 0010
[   78.988337] R13: 0010 R14: 9679ec8e4130 R15: 0010
[   78.988340] FS:  7f91fb65a740() GS:9679efdc00

Bug#1029968: Acknowledgement (bttv/v4l: WARNING: CPU: 6 PID: 6164 at mm/vmalloc.c:487 __vmap_pages_range_noflush+0x3e0/0x4d0)

2023-01-31 Thread Dr. David Alan Gilbert
bisected:
GOOD [37fcacb50be7071d146144a6c5c5bf0194b9a1cf] phy: PHY_FSL_LYNX_28G should depend on ARCH_LAYERSCAPE
BAD [f5ff79fddf0efecca538046b5cc20fb3ded2ec4f] dma-mapping: remove CONFIG_DMA_REMAP
GOOD [e62c17f0455a74b182ce6373e2777817256afaa1] MAINTAINERS: update maintainer list of DMA MAPPING BENCHMARK
GOOD [0fb3436b4b36cf69f4544385aa2bb8c5a4913509] sparc: Remove usage of the deprecated "pci-dma-compat.h" API
GOOD [fba09099c6e506608e05e08ac717bf34501f821b] media: v4l2-pci-skeleton: Remove usage of the deprecated "pci-dma-compat.h" API

dg@major:~/kernel/kernel-clone$ git bisect good
f5ff79fddf0efecca538046b5cc20fb3ded2ec4f is the first bad commit
commit f5ff79fddf0efecca538046b5cc20fb3ded2ec4f
Author: Christoph Hellwig 
Date:   Sat Feb 26 16:40:21 2022 +0100

dma-mapping: remove CONFIG_DMA_REMAP

That sounds like a believable cause given that it's IOMMU related
and device related.




Bug#1029968: Acknowledgement (bttv/v4l: WARNING: CPU: 6 PID: 6164 at mm/vmalloc.c:487 __vmap_pages_range_noflush+0x3e0/0x4d0)

2023-01-30 Thread Dr. David Alan Gilbert
Upstream 5.17 works
Upstream 5.18 fails

(with intel_iommu=on)

Let the bisect begin.
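
For anyone wanting to repeat this, a minimal sketch of how such a bisect is usually set up. It assumes a clone of the mainline kernel tree carrying the v5.17 and v5.18 release tags; the function name start_bttv_bisect is only illustrative and is not taken from this report.

```shell
# Sketch only: start a bisect between a known-good and a known-bad kernel tag.
# Assumes the current directory is a mainline kernel clone with both tags present.
start_bttv_bisect() {
    good=${1:-v5.17}   # last release that worked in this report
    bad=${2:-v5.18}    # first release that failed in this report

    git bisect start
    git bisect bad "$bad"
    git bisect good "$good"

    # git now checks out a commit roughly halfway between the two.
    # For each step: build, boot with intel_iommu=on, run the reproducer,
    # then record the result with "git bisect good" or "git bisect bad".
    # When finished, "git bisect reset" returns to the original branch.
}
```

Called with no arguments, the function uses the two releases named above; every step still needs a manual build and boot test, since the failure only shows up on the running kernel.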




Bug#1029968: Acknowledgement (bttv/v4l: WARNING: CPU: 6 PID: 6164 at mm/vmalloc.c:487 __vmap_pages_range_noflush+0x3e0/0x4d0)

2023-01-30 Thread Dr. David Alan Gilbert
This is IOMMU related.

Upstream 6.1 and 5.18 *do* exhibit the bug, but only with intel_iommu=on,
whereas Debian seems to default it to on.
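
A small sketch for checking whether a given boot passed the option explicitly. The helper name intel_iommu_setting is hypothetical. It only inspects a command-line string, so "unset" means the kernel's build-time default applies (presumably CONFIG_INTEL_IOMMU_DEFAULT_ON on the Debian kernels, which is an assumption and not something verified in this report).

```shell
# Report the intel_iommu= setting found in a kernel command-line string.
# Prints "on", "off", or "unset" (no explicit on/off; the build-time default applies).
intel_iommu_setting() {
    setting="unset"
    for arg in $1; do
        case "$arg" in
            intel_iommu=*)
                # The value may be a comma-separated list; look for on/off in it.
                old_ifs=$IFS
                IFS=,
                for opt in ${arg#intel_iommu=}; do
                    case "$opt" in
                        on|off) setting=$opt ;;
                    esac
                done
                IFS=$old_ifs
                ;;
        esac
    done
    printf '%s\n' "$setting"
}
```

On a running system it can be fed the live command line, e.g. intel_iommu_setting "$(cat /proc/cmdline)".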




Bug#1029968: Acknowledgement (bttv/v4l: WARNING: CPU: 6 PID: 6164 at mm/vmalloc.c:487 __vmap_pages_range_noflush+0x3e0/0x4d0)

2023-01-29 Thread Dr. David Alan Gilbert
I built upstream kernels 5.18.0 and 6.1.0 and both of them work for me,
which makes the problem much more painful to track down.




Bug#1029968: Acknowledgement (bttv/v4l: WARNING: CPU: 6 PID: 6164 at mm/vmalloc.c:487 __vmap_pages_range_noflush+0x3e0/0x4d0)

2023-01-29 Thread Dr. David Alan Gilbert
   WORKS 
https://snapshot.debian.org/archive/debian/20220601T031637Z/pool/main/l/linux-signed-amd64/linux-image-5.17.0-1-amd64_5.17.3-1_amd64.deb
   5.17.0-1-amd64 #1 SMP PREEMPT Debian 5.17.3-1


So I think it's time to move upstream and bisect between 5.17 and 5.18

Dave
