We have observed on a few machines with an rtc-cmos device that hpet_rtc_interrupt() is called before cmos_do_probe() has had a chance to call hpet_rtc_timer_init(). This has not been observed during a normal boot/reboot of these machines. It *sometimes* happens when the system boots into the kdump secondary kernel. In that situation, neither hpet_default_delta nor hpet_t1_cmp has been initialized by the time the interrupt is raised. As a result, the while loop on hpet_cnt_ahead() in hpet_rtc_timer_reinit() never completes, which leads to "NMI watchdog: Watchdog detected hard LOCKUP on cpu 0".
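For illustration only, the stuck loop described above can be modelled in userspace. This is a simplified sketch, not the kernel code: cnt_ahead() and catch_up() are hypothetical names, and the per-iteration tick and the iteration cap are assumptions added so that the model terminates. It only shows why a zero delta with a zero comparator cannot catch up with a counter that keeps moving.

```c
#include <assert.h>
#include <stdint.h>

/* Wraparound-aware comparison: nonzero when c1 is ahead of c2. */
static int cnt_ahead(uint32_t c1, uint32_t c2)
{
	return (int32_t)(c2 - c1) < 0;
}

/*
 * Hypothetical model of a catch-up loop: advance the comparator by
 * delta until it is ahead of the counter. The counter advances by
 * tick on every iteration to mimic a free-running timer.
 *
 * Returns the number of iterations used, or -1 if the comparator is
 * still not ahead after max_iter iterations. The cap exists only in
 * this model.
 */
static long catch_up(uint32_t cmp, uint32_t delta, uint32_t counter,
		     uint32_t tick, long max_iter)
{
	long i;

	assert(max_iter > 0);

	for (i = 1; i <= max_iter; i++) {
		cmp += delta;		/* no progress when delta == 0 */
		counter += tick;
		if (cnt_ahead(cmp, counter))
			return i;
	}
	return -1;
}
```

With these assumptions, catch_up(0, 0, 1000, 1, 100000) returns -1, which models the uninitialized case where both the delta and the comparator are still zero. catch_up(0, 4096, 1000, 1, 100000) returns 1, which models the case where the delta has been set before the loop runs.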
I still do not understand how an interrupt can be raised before the RTC is enabled. I am not familiar with this device, so I am sending this patch as an RFC to get feedback from the hpet/rtc-cmos developers. The patch initializes the software counters early so that the LOCKUP can be avoided. Even if it resolves the issue, I understand that the proposed patch may not be the best way to solve it.

Signed-off-by: Pratyush Anand <pan...@redhat.com>
---
 drivers/rtc/rtc-cmos.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/rtc/rtc-cmos.c b/drivers/rtc/rtc-cmos.c
index fbe9c72438e1..101dc948295f 100644
--- a/drivers/rtc/rtc-cmos.c
+++ b/drivers/rtc/rtc-cmos.c
@@ -129,6 +129,16 @@ static inline int hpet_rtc_dropped_irq(void)
 	return 0;
 }
 
+static inline int hpet_rtc_timer_counter_init(void)
+{
+	return 0;
+}
+
+static inline int hpet_rtc_timer_enable(void)
+{
+	return 0;
+}
+
 static inline int hpet_rtc_timer_init(void)
 {
 	return 0;
@@ -710,6 +720,7 @@ cmos_do_probe(struct device *dev, struct resource *ports, int rtc_irq)
 		goto cleanup1;
 	}
 
+	hpet_rtc_timer_counter_init();
 	if (is_valid_irq(rtc_irq)) {
 		irq_handler_t rtc_cmos_int_handler;
 
@@ -732,7 +743,7 @@ cmos_do_probe(struct device *dev, struct resource *ports, int rtc_irq)
 			goto cleanup1;
 		}
 	}
-	hpet_rtc_timer_init();
+	hpet_rtc_timer_enable();
 
 	/* export at least the first block of NVRAM */
 	nvram.size = address_space - NVRAM_OFFSET;
-- 
2.5.5