On Thu, 8 Feb 2024 at 11:55, Borislav Petkov <[email protected]> wrote:
>
> On Thu, Feb 08, 2024 at 08:47:37AM +0100, Marco Elver wrote:
> > That's a good question, and I don't have the answer to that - maybe we
> > need to ask Linus then.
>
> Right, before that, lemme put my user hat on.
>
> > We could argue that to improve memory safety of the Linux kernel more
> > rapidly, enablement of KFENCE by default (on the "big" architectures
> > like x86) might actually be a net benefit at ~zero performance
> > overhead and the cost of 2 MiB of RAM (default config).
>
> What about its benefit?
>
> I haven't seen a bug fix saying "found by KFENCE" or so but that doesn't
> mean a whole lot.

git log --grep 'BUG: KFENCE: '

There are more I'm aware of - also plenty I know of in downstream
kernels (https://arxiv.org/pdf/2311.09394.pdf - Section 5.7).

> The more important question is would I, as a user, have a way of
> reporting such issues, would those issues be taken seriously and so on.

This is a problem shared by all other diagnostic and error reports the
kernel produces.

> We have a whole manual about it:
>
> Documentation/admin-guide/reporting-issues.rst
>
> maybe the kfence splat would have a pointer to that?

Perhaps...

>
> Personally, I don't mind running it if it really is a ~zero overhead
> KASAN replacement. Maybe as a preliminary step we should enable it on
> devs machines who know how to report such things.

It's not a KASAN replacement, since it's sampling based. From the
Documentation:

"KFENCE is designed to be enabled in production kernels, and has near
zero performance overhead. Compared to KASAN, KFENCE trades
performance for precision. The main motivation behind KFENCE's design,
is that with enough total uptime KFENCE will detect bugs in code paths
not typically exercised by non-production test workloads. One way to
quickly achieve a large enough total uptime is when the tool is
deployed across a large fleet of machines."

Enabling it in as many kernels as possible will help towards the
"deployed across a large fleet of machines".

That being said, KFENCE is already deployed across O(millions) of
devices where the reporting story is also taken care of. Enabling it
in even more systems where the reporting story is not as clear may or
may not be helpful - it'd be an experiment.

> /me goes and enables it in a guest...
>
> [ 0.074294] kfence: initialized - using 2097152 bytes for 255 objects at
> 0xffff88807d600000-0xffff88807d800000
>
> Guest looks ok to me, no reports.
>
> What now? :-)

No reports are good. Doesn't mean absence of bugs though. :-)

Thanks,
-- Marco
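
For reference, the "2097152 bytes for 255 objects" in the boot message
quoted above matches the pool-size formula given in the KFENCE
documentation, (#objects + 1) * 2 * PAGE_SIZE: each object page is paired
with a guard page. The snippet below is only a sketch of that arithmetic,
assuming 4 KiB pages; the function name is made up for illustration and is
not a kernel interface.

```python
# Assumption: 4 KiB pages (the usual x86 base page size).
PAGE_SIZE = 4096


def kfence_pool_bytes(num_objects: int, page_size: int = PAGE_SIZE) -> int:
    """Pool size per the KFENCE docs: (num_objects + 1) * 2 * page_size.

    Hypothetical helper for illustration only, not a kernel API.
    """
    return (num_objects + 1) * 2 * page_size


# Address range copied from the boot message in the thread.
pool_start = 0xFFFF88807D600000
pool_end = 0xFFFF88807D800000

print(kfence_pool_bytes(255))   # size computed from the formula
print(pool_end - pool_start)    # size implied by the logged address range
```

Both values come out to 2 MiB, which is the "cost of 2 MiB of RAM
(default config)" mentioned earlier in the thread.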
