On Thu, 12 Jun 2025 at 05:47, Masami Hiramatsu <mhira...@kernel.org> wrote:
>
> On Wed, 11 Jun 2025 13:30:01 +0200
> Peter Zijlstra <pet...@infradead.org> wrote:
>
> > On Tue, Jun 10, 2025 at 11:47:48PM +0900, Masami Hiramatsu (Google) wrote:
> > > From: Masami Hiramatsu (Google) <mhira...@kernel.org>
> > >
> > > Invalidate the cache after replacing INT3 with the new instruction.
> > > This will prevent the other CPUs seeing the removed INT3 in their
> > > cache after serializing the pipeline.
> > >
> > > LKFT reported an oops by INT3 but there is no INT3 shown in the
> > > dumped code. This means the INT3 is removed after the CPU hits
> > > INT3.
> > >
> > > ## Test log
> > > ftrace-stress-test: <12>[ 21.971153] /usr/local/bin/kirk[277]:
> > > starting test ftrace-stress-test (ftrace_stress_test.sh 90)
> > > <4>[ 58.997439] Oops: int3: 0000 [#1] SMP PTI
> > > <4>[ 58.998089] CPU: 0 UID: 0 PID: 323 Comm: sh Not tainted
> > > 6.15.0-next-20250605 #1 PREEMPT(voluntary)
> > > <4>[ 58.998152] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
> > > BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> > > <4>[ 58.998260] RIP: 0010:_raw_spin_lock+0x5/0x50
> > > <4>[ 58.998563] Code: 5d e9 ff 12 00 00 66 66 2e 0f 1f 84 00 00 00
> > > 00 00 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3
> > > 0f 1e fa 0f <1f> 44 00 00 55 48 89 e5 53 48 89 fb bf 01 00 00 00 e8 15
> > > 12 e4 fe
> > >
> > > Maybe one possible scenario is to hit the int3 after the third step
> > > somehow (on I-cache).
> > >
> > > ------
> > > <CPU0>                                    <CPU1>
> > >                                           Start smp_text_poke_batch_finish().
> > >                                           Start the third step. (remove INT3)
> > >                                           on_each_cpu(do_sync_core)
> > > do_sync_core(do SERIALIZE)
> > >                                           Finish the third step.
> > > Hit INT3 (from I-cache)
> > >                                           Clear text_poke_array_refs[cpu0]
> > > Start smp_text_poke_int3_handler()
> > > Failed to get text_poke_array_refs[cpu0]
> > > Oops: int3
> > > ------
> > >
> > > SERIALIZE instruction flushes pipeline, thus the processor needs
> > > to reload the instruction. But it is not ensured to reload it from
> > > memory because SERIALIZE does not invalidate the cache.
> > >
> > > To prevent reloading replaced INT3, we need to invalidate the cache
> > > (flush TLB) in the third step, before the do_sync_core().
> >
> > This sounds all sorts of wrong. x86 is supposed to be cache-coherent. A
> > store should cause the invalidation per MESI and all that. This means
> > the only place where the old instruction can stick around is in the
> > uarch micro-ops cache and all that, and SERIALIZE will very much flush
> > those.
>
> OK, thanks for pointing it out!
>
> >
> > Also, TLB flush != I$ flush. There is clflush_cache_range() for this.
> > But still, this really should not be needed.
> >
> > Also, this is all qemu, and qemu is known to have gotten this terribly
> > wrong in the past.
>
> What about KVM? We need to ask Naresh how it is running on the machine.
> Naresh, can you tell us how the VM is running? Does that use KVM?
> And if so, how the kvm is configured (it may depend on the real hardware)?

We do not use KVM; we are running QEMU version 10.0.0.

>
> >
> > If you all cannot reproduce on real hardware, I'm considering this a
> > qemu bug.

It is reproducible intermittently on an x86_64 device and a qemu-x86
device, with and without compat mode.
The link below shows how intermittent it is on the linux-next tree.

 - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250606/testrun/28685600/suite/log-parser-test/test/oops-oops-int3-smp-pti/history/?page=2

- Naresh

>
> OK, if it is a qemu's bug, dropping [2/2], but I think we still need
> [1/2] to avoid kernel crash (with a warning message without dump).
>
> Thank you,
>
> --
> Masami Hiramatsu (Google) <mhira...@kernel.org>
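
For anyone re-checking the "no INT3 in the dumped code" observation from the quoted test log, here is a minimal Python sketch that parses the oops `Code:` line. It relies on two facts: the kernel marks the byte at RIP with `<..>`, and INT3 is a trap, so the saved RIP points one byte past the 0xcc opcode. The helper name `parse_code_line` is hypothetical and only for illustration; the byte string is copied from the log above.

```python
import re

# Bytes copied verbatim from the "Code:" line of the quoted oops.
CODE_LINE = (
    "5d e9 ff 12 00 00 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 "
    "90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f "
    "<1f> 44 00 00 55 48 89 e5 53 48 89 fb bf 01 00 00 00 e8 15 12 e4 fe"
)

INT3 = 0xCC                                   # one-byte breakpoint opcode
NOP5 = bytes([0x0F, 0x1F, 0x44, 0x00, 0x00])  # 5-byte NOP encoding


def parse_code_line(line):
    """Hypothetical helper: return (code bytes, index of the byte at RIP)."""
    values = []
    rip_index = None
    for i, tok in enumerate(line.split()):
        marked = re.fullmatch(r"<([0-9a-fA-F]{2})>", tok)
        if marked:
            rip_index = i          # the <..> token is the byte at RIP
            tok = marked.group(1)
        values.append(int(tok, 16))
    if rip_index is None:
        raise ValueError("no <..> marker found in Code: line")
    return bytes(values), rip_index


code, rip_index = parse_code_line(CODE_LINE)

# INT3 is a trap: the saved RIP points just past it, so the byte that
# trapped sits at rip_index - 1.
trap_byte = code[rip_index - 1]
site = code[rip_index - 1:rip_index + 4]

print(f"byte at trap site: {trap_byte:#04x}")
print(f"INT3 anywhere in dump: {INT3 in code}")
print(f"trap site is a 5-byte NOP: {site == NOP5}")
```

The dump was taken after the trap, so it shows the 5-byte NOP (following the `f3 0f 1e fa` ENDBR64 at function entry) where the INT3 must have been. This matches the description in the patch but does not say whether the stale INT3 came from hardware or from the emulator.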