Re: [syzbot] [hfs?] WARNING in hfs_write_inode
On Thu, 20 Jul 2023 at 15:37, Matthew Wilcox wrote:
>
> I think you're missing the context. There are bugs in how this filesystem
> handles intentionally-corrupted filesystems. That's being reported as
> a critical bug because apparently some distributions automount HFS/HFS+
> filesystems presented to them on a USB key. Nobody is being paid to fix
> these bugs. Nobody is volunteering to fix these bugs out of the kindness
> of their heart. What choice do we have but to remove the filesystem,
> regardless of how many happy users it has?

You're being silly.

We have tons of sane options. The obvious one is "just don't mount
untrusted media".

Now, the kernel doesn't know which media is trusted or not, since the
kernel doesn't actually see things like /etc/mtab and friends. So we
in the kernel can't do that, but distros should have a very easy time
just fixing their crazy models.

Saying that the kernel should remove a completely fine filesystem just
because some crazy use-cases that nobody cares about are broken, now
*that* is just crazy.

Now, would it be good to have a maintainer for hfs? Obviously. But no,
we don't remove filesystems just because they don't have maintainers.

And no, we have not suddenly started saying "users don't matter".

Linus
Re: Boot regression in Linux v6.4-rc3
On Tue, May 30, 2023 at 4:21 PM Konstantin Ryabitsev wrote:
>
> We only add things to lore when someone asks, and nobody's asked. :) I guess
> I'll consider this an ask and put it on the radar.

Thanks. It would probably be good to see if there are any other
vger.kernel.org lists with any appreciable traffic that aren't on
lore.

Linus
Re: Boot regression in Linux v6.4-rc3
On Sat, May 27, 2023 at 11:41 AM Frank Scheiner wrote:
>
> Ok, I put the decoded console messages on [2].
>
> [2]: https://pastebin.com/dLYMijfS

Ugh. Apparently ia64 decoding isn't great. But at least it gives
multiple line numbers:

    load_module (kernel/module/main.c:2291 kernel/module/main.c:2412 kernel/module/main.c:2868)

except your kernel obviously has those test-patches, so I still don't
know exactly where they are.

But it looks like it is in move_module(). Strange. I don't know how it
gets to "__copy_user" from there...

[ Looks at the ia64 code ]

Oh. It turns out that it *says* __copy_user(), but the code is
actually shared with the regular memcpy() function, which does

    GLOBAL_ENTRY(memcpy)
            and     r28=0x7,in0
            and     r29=0x7,in1
            mov     f6=f0
            mov     retval=in0
            br.cond.sptk .common_code
            ;;

where that ".common_code" label is - surprise surprise - the common
copy code, and so when the oops reports that the problem happened in
__copy_user(), it actually is in this case just a normal memcpy.

Ok, so it's probably the

            memcpy(dest, (void *)shdr->sh_addr, shdr->sh_size);

in move_module() that takes a fault. And looking at the registers, the
destination is in r17/r18, and your dump has

    unable to handle kernel paging request at virtual address 1000
    ...
    r17 : 0fff r18 : 1000

so it's almost certainly that 'dest' that is bad.

Which I guess shouldn't surprise anybody. But that's where my
knowledge of ia64 and the new module loader layout ends.

Linus
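The point about the oops naming __copy_user() for a plain memcpy() can be illustrated with a small userspace sketch. It assumes the usual "nearest preceding symbol" lookup that symbolizers tend to use; the table, the addresses, and the names resolve() and resolves_to() are all hypothetical and are not taken from the ia64 tree or from the report.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical symbol table, sorted by address. All addresses are invented. */
struct sym {
	unsigned long addr;
	const char *name;
};

static const struct sym symtab[] = {
	{ 0x1000, "memcpy" },        /* short prologue, then a branch away */
	{ 0x1040, "__copy_user" },   /* assumed to hold the shared copy loop */
	{ 0x2000, "next_function" },
};

/* Return the name of the last symbol whose address is <= pc, or NULL. */
static const char *resolve(unsigned long pc)
{
	const char *name = NULL;
	size_t i;

	for (i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++) {
		/* The lookup relies on the table being sorted by address. */
		assert(i == 0 || symtab[i - 1].addr < symtab[i].addr);
		if (symtab[i].addr > pc)
			break;
		name = symtab[i].name;
	}
	return name;
}

/* Convenience check: does pc get attributed to the expected symbol? */
static int resolves_to(unsigned long pc, const char *expected)
{
	const char *name = resolve(pc);

	return name != NULL && strcmp(name, expected) == 0;
}
```

Under these assumptions, a fault inside the shared copy loop (say at 0x1200) is attributed to __copy_user even when the caller entered through memcpy at 0x1000, which matches the confusion described above.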
Re: Boot regression in Linux v6.4-rc3
On Sat, May 27, 2023 at 12:01 AM Frank Scheiner wrote:
>
> If it is of any help, my initial report is available for example via:
>
> https://marc.info/?l=linux-ia64=168509859125505=2
>
> ...the whole thread is currently at:
>
> https://marc.info/?t=16850986823=1=2

This does make it clear just how great a mailing list archive lore is.

Konstantin, is there any particular reason why
linux-i...@vger.kernel.org isn't in lore? Is it just a rational hatred
of all things itanium?

Anyway, the WARN_ON() is likely related, but the bug is clearly an
unexpected page fault in __copy_user() when called by load_module().

The ia64 oops output is nasty, presumably because ia64 aggressively
inlines things. It would help a lot if you enabled debug info (maybe
you already do?) and then run the oops through

    ./scripts/decode_stacktrace.sh

which will figure out line numbers, inlining etc.

Because I don't even see why it would call __copy_user() in the first
place. This is 'finit_module()' that loads the module data from a
file, not user space.

So I guess it must be the strndup_user() in

    mod->args = strndup_user(uargs, ~0UL >> 1);

but that doesn't look like it should even care about any module
layout. Plus I would have expected to see strndup_user() in the call
trace, but whatever.

End result: that ia64 trace is very hard to read, and _maybe_ running
it through the decode script might give more information about what it
is that triggers...

Linus
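As a side note on the quoted strndup_user() call: the length bound `~0UL >> 1` is all bits set, shifted right by one, which is the largest positive value of a signed long. A minimal userspace sketch, assuming long and unsigned long have the same width; the helper name max_args_len() is hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* The bound used in the quoted call: every bit set, then shifted right once. */
static unsigned long max_args_len(void)
{
	unsigned long n = ~0UL >> 1;

	/* Assumes long and unsigned long share a width, as on common ABIs. */
	assert(n == (unsigned long)LONG_MAX);
	return n;
}
```

In other words, the call effectively places no practical limit on the argument string beyond "fits in a signed long".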
Re: Boot regression in Linux v6.4-rc3
On Fri, May 26, 2023 at 2:59 PM Luis Chamberlain wrote:
>
> Not saying that debugging commit ac3b4328392344 ("module: replace
> module_layout with module_memory") is going to be impossible, quite
> the contrary I think it would be good to root cause it, if possible,
> as perhaps it may also be similar to some other future oddball arch
> bug later that may come up.

I don't have any context - the mailing lists in question that
apparently this came in on aren't in lore.

That said, that commit looks odd for the ia64 part. In particular, this part:

    -       if (mod->core_layout.size > MAX_LTOFF)
    +       struct module_memory *mod_mem;
    +
    +       mod_mem = &mod->mem[MOD_DATA];

in apply_relocate_add() (file: arch/ia64/kernel/module.c) seems suspect.

The previous place that used to look at "mod->core_layout.base"
converted that to "mod->mem[MOD_TEXT].base". As do other changes in
other architectures.

So that "MOD_DATA" looks *very* wrong. Shouldn't core_layout. be
translated to use "MOD_TEXT" instead?

Nothing else in the ia64 parts strikes me as odd, but that one looks
wrong to me. But this is my "monkey see, monkey do" pattern matching
reaction, not from any deeper understanding of the problem (I can't
even see the report) or really even the code.

Linus
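To make the MOD_TEXT versus MOD_DATA concern concrete, here is a simplified userspace sketch. Only the names module_memory, mem[], MOD_TEXT, MOD_DATA, base and size come from the quoted diff and text; the struct layout, the helpers in_region() and region_exceeds(), and any values passed to them are hypothetical stand-ins rather than the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the per-region bookkeeping named in the diff. */
enum mod_mem_type { MOD_TEXT, MOD_DATA, MOD_MEM_NUM_TYPES };

struct module_memory {
	unsigned long base;
	unsigned long size;
};

struct module {
	struct module_memory mem[MOD_MEM_NUM_TYPES];
};

/* True if addr falls inside the chosen region of the module. */
static bool in_region(const struct module *mod, enum mod_mem_type type,
		      unsigned long addr)
{
	const struct module_memory *m;

	assert((unsigned int)type < MOD_MEM_NUM_TYPES);
	m = &mod->mem[type];
	return addr >= m->base && addr - m->base < m->size;
}

/* Size check in the style of the quoted hunk, parameterised by region. */
static bool region_exceeds(const struct module *mod, enum mod_mem_type type,
			   unsigned long limit)
{
	assert((unsigned int)type < MOD_MEM_NUM_TYPES);
	return mod->mem[type].size > limit;
}
```

With a module whose text region is large and whose data region is small, the same check gives different answers depending on which index is used, so a mechanical conversion that picks MOD_DATA where the old code reasoned about the region starting at the text base would silently change behaviour. Whether that is what happens in the ia64 code is exactly the open question raised above.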