On Thu, Nov 13, 2025 at 05:31:42PM +0800, Zhao Liu wrote:
> Date: Thu, 13 Nov 2025 17:31:42 +0800
> From: Zhao Liu <[email protected]>
> Subject: Re: [PATCH 21/22] rust/hpet: Replace BqlRefCell<HPETRegisters>
> with Mutex<HPETRegisters>
>
> > @@ -179,8 +180,8 @@ const fn deactivating_bit(old: u64, new: u64, shift: usize) -> bool {
> >  fn timer_handler(timer_cell: &BqlRefCell<HPETTimer>) {
> >      let mut t = timer_cell.borrow_mut();
> >      // SFAETY: state field is valid after timer initialization.
> > -    let regs = &mut unsafe { t.state.as_mut() }.regs.borrow_mut();
> > -    t.callback(regs)
> > +    let mut regs = unsafe { t.state.as_ref() }.regs.lock().unwrap();
> > +    t.callback(&mut regs)
> >  }
>
> callback()
> -> arm_timer(): access timer N register
> -> update_irq(): modify global register (int_status or "isr" in C code)
>
> So timer handler needs to lock Mutex. But this may cause deadlock:
>
> timer_handler -> lock BQL -> try to lock Mutex
> MMIO access -> lock Mutex -> try to lock BQL
>
> C HPET doesn't have such deadlock issue since it doesn't lock Mutex in
> timer handler.
>
> I think it seems necessary to lock Mutex in timer handler since there's
> no guarantee to avoid data race...

One possible way is to introduce a lockless timer callback, but on the
Rust side this requires extracting the timers from BqlRefCell and adding
an extra Mutex to protect the timer state.

A simpler way is to unlock the BQL before acquiring the Mutex in the
timer handler, which gives the MMIO path a chance to acquire the BQL.
The timer handler then takes the locks in the same order as MMIO (Mutex
first, then BQL), which fixes the locking order.

Code example:

diff --git a/rust/hw/timer/hpet/src/device.rs b/rust/hw/timer/hpet/src/device.rs
index f96dfe1ebd06..389eb9b49eb6 100644
--- a/rust/hw/timer/hpet/src/device.rs
+++ b/rust/hw/timer/hpet/src/device.rs
@@ -178,10 +178,35 @@ const fn deactivating_bit(old: u64, new: u64, shift: usize) -> bool {
 }
 
 fn timer_handler(timer_cell: &BqlRefCell<HPETTimer>) {
-    let mut t = timer_cell.borrow_mut();
-    // SFAETY: state field is valid after timer initialization.
-    let mut regs = unsafe { t.state.as_ref() }.regs.lock().unwrap();
-    t.callback(&mut regs)
+    let state_p = {
+        let t = timer_cell.borrow();
+        t.state
+    };
+
+    // Release the BQL first and acquire the Mutex instead. This avoids
+    // a deadlock, since lockless IO locks the Mutex first and then
+    // tries to acquire the BQL.
+    //
+    // SAFETY: the BQL-free section only locks the Mutex, nothing else.
+    unsafe {
+        bql::unlock();
+    }
+
+    // SAFETY: state_p is valid, and we only access the Mutex without
+    // touching other fields. The Mutex guarantees that register access
+    // is safe while the BQL is unlocked.
+    let mut regs = unsafe { state_p.as_ref() }.regs.lock().unwrap();
+
+    // After the Mutex is locked, lock the BQL again. This ensures that
+    // the timer handler and MMIO use the same locking order.
+    //
+    // SAFETY: the timer handler is expected to run in BQL context, and
+    // the consistent locking order eliminates the deadlock.
+    unsafe {
+        bql::lock();
+    }
+
+    timer_cell.borrow_mut().callback(&mut regs);
 }
 
 #[repr(C)]
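
To make the ordering argument easier to check outside QEMU, here is a
minimal standalone sketch of the same idea. It is only a model under
stated assumptions: the BQL is represented by an ordinary
std::sync::Mutex<()> (the real BQL is a global QEMU lock, not a Rust
Mutex), the registers are reduced to a single u64 counter, and the names
Locks, mmio_access, timer_handler and run_demo are made up for this
illustration. Both paths end up taking the locks in the order
"registers, then BQL", so neither can hold one lock while waiting for
the other in the opposite order.

```rust
use std::sync::{Arc, Mutex, MutexGuard};
use std::thread;

/// Hypothetical model of the two locks discussed above.
struct Locks {
    /// Stand-in for the BQL (assumption: modelled as a plain Mutex).
    bql: Mutex<()>,
    /// Stand-in for Mutex<HPETRegisters>, reduced to one counter.
    regs: Mutex<u64>,
}

/// MMIO path: lock the registers first, then take the BQL.
fn mmio_access(l: &Locks) {
    let mut regs = l.regs.lock().unwrap();
    let _bql = l.bql.lock().unwrap();
    *regs += 1;
}

/// Timer path: entered with the BQL held. Drop it, lock the registers,
/// then re-take the BQL, so the order matches the MMIO path.
fn timer_handler<'a>(l: &'a Locks, bql: MutexGuard<'a, ()>) -> MutexGuard<'a, ()> {
    // Release the BQL so that an MMIO access holding `regs` can finish.
    drop(bql);
    let mut regs = l.regs.lock().unwrap();
    // Same order as mmio_access(): registers first, BQL second.
    let bql = l.bql.lock().unwrap();
    *regs += 1;
    // The caller expects to still hold the BQL on return.
    bql
}

/// Run both paths concurrently and return the final counter value.
fn run_demo(iterations: u64) -> u64 {
    let locks = Arc::new(Locks {
        bql: Mutex::new(()),
        regs: Mutex::new(0),
    });

    let l = Arc::clone(&locks);
    let timer = thread::spawn(move || {
        for _ in 0..iterations {
            // The timer callback is invoked with the BQL already held.
            let bql = l.bql.lock().unwrap();
            let bql = timer_handler(&l, bql);
            drop(bql);
        }
    });

    let l = Arc::clone(&locks);
    let mmio = thread::spawn(move || {
        for _ in 0..iterations {
            mmio_access(&l);
        }
    });

    timer.join().unwrap();
    mmio.join().unwrap();

    let total = *locks.regs.lock().unwrap();
    total
}
```

Calling run_demo(1000) from a main() or a test should complete and
return 2000, since each path increments the counter once per iteration.
If timer_handler() kept the BQL while locking regs instead of dropping
it first, the two threads could block each other as described in the
quoted mail.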
Thanks,
Zhao