Re: __rtc_read_alarm missing month/year field bug?

2016-07-21 Thread Joe Lawrence
On 07/19/2016, Alexandre Belloni wrote:
>
> Well like said in my previous mail, I don't think the rollover is the
> issue here but I'm interested in knowing what conditions are leading
> to endless interrupts.

Hi Alexandre,

Unfortunately I've switched employers so I no longer have access to the
hardware, but if I remember correctly the scenario went like this:

  - boot machine with RTC alarm set in far future
  - run 'hwclock' user program
    - hwclock sync to clock tick
      - RTC_UIE_ON
      - RTC_UIE_OFF
  - kernel sees RTC alarm is outstanding
    - hpet_rtc_interrupt calls start streaming in
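
For readers unfamiliar with the RTC_UIE_ON / RTC_UIE_OFF steps in the list
above, here is a minimal user-space sketch of how a program can wait for the
next RTC update tick. It is not hwclock's actual code; the function name
wait_for_rtc_tick and the device path passed to it are assumptions made for
illustration. Only the RTC_UIE_ON and RTC_UIE_OFF ioctls from <linux/rtc.h>
and a blocking read() on the RTC device are used.

```c
#include <fcntl.h>
#include <linux/rtc.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: block until the next RTC update interrupt
 * (one per second) on the given RTC device node.
 * Returns 0 on success, -1 on failure.
 */
int wait_for_rtc_tick(const char *rtc_path)
{
	unsigned long data;
	int ret = -1;
	int fd = open(rtc_path, O_RDONLY);

	if (fd < 0)
		return -1;

	/* Turn update-ended interrupts on. */
	if (ioctl(fd, RTC_UIE_ON, 0) == 0) {
		/* read() blocks until the next update interrupt arrives. */
		if (read(fd, &data, sizeof(data)) == (ssize_t)sizeof(data))
			ret = 0;
		/* Always turn update interrupts back off again. */
		if (ioctl(fd, RTC_UIE_OFF, 0) != 0)
			ret = -1;
	}

	close(fd);
	return ret;
}
```

A caller would typically pass something like "/dev/rtc0" (an assumption; the
node name varies by system) and read the clock right after the call returns.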

In the case of this hardware platform, there was a problem with the CMOS
RTC read such that hpet_rtc_interrupt / rtc_cmos_read believed that the
RTC was busy updating. That code would delay long enough that the next
HPET RTC interrupt would come in right behind it and keep the CPU from
scheduling anything else.
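
The starvation described above can be put into a small back-of-the-envelope
model: once the time spent in the handler reaches the interrupt period, no
CPU time is left for anything else. The function name and its parameters are
hypothetical; no measured numbers from the affected platform are available.

```c
/*
 * Hypothetical model: fraction of CPU time (0.0 to 1.0) left for other
 * work when an interrupt handler that takes handler_us microseconds
 * fires every period_us microseconds.
 */
double cpu_share_left(double handler_us, double period_us)
{
	/* Handler takes as long as (or longer than) the period: CPU is choked. */
	if (period_us <= 0.0 || handler_us >= period_us)
		return 0.0;
	/* Handler costs nothing: everything is left over. */
	if (handler_us <= 0.0)
		return 1.0;
	return 1.0 - handler_us / period_us;
}
```

For example, a handler that spins for 5 us out of every 10 us leaves half the
CPU; one that spins for the whole period or longer leaves none.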

That platform-specific bug aside, I thought I would report this
strangeness in case it was a real bug -- looks like I didn't fully
consider the rollover case.

-- Joe


Re: __rtc_read_alarm missing month/year field bug?

2016-07-19 Thread Alexandre Belloni
On 27/06/2016 at 13:36:04 -0400, Joe Lawrence wrote :
> Ping?  This isn't a major problem, but can setup endless RTC interrupts
> under certain conditions on said hardware.
> 

Well, as I said in my previous mail, I don't think the rollover is the
issue here, but I'm interested in knowing what conditions lead to
endless interrupts.

-- 
Alexandre Belloni, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


Re: __rtc_read_alarm missing month/year field bug?

2016-07-19 Thread Alexandre Belloni
Hi,

On 20/06/2016 at 12:06:38 -0400, Joe Lawrence wrote :
> Hello Alessandro and Alexandre,
> 
> I noticed an interesting cmos_rtc.rtc.aie_timer on a Stratus machine
> running the 4.6 kernel, with an expiration time that puts the alarm way
> out into next year.  This is easily reproducible on this machine by
> setting a wakealarm sometime in the near future, then rebooting.
> 
> From a fresh boot:
> 
>   % cat /proc/driver/rtc
>   rtc_time: 17:55:10
>   rtc_date: 2016-06-09
>   alrm_time   : 14:04:37
>   alrm_date   : 2017-06-09 << 2017 ?

Well, alrm_time is before rtc_time, so the alarm has either already expired
or is set for the future. The code assumes it is set in the future (hence
the rollover).

What your patch will do in that case is effectively always remove the
rollover.
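
To make the difference concrete, here is a small stand-alone sketch of the two
loop shapes under discussion. It is not the kernel code: valid_tm() is a
simplified, hypothetical stand-in for rtc_valid_tm(), and the two function
names are made up for illustration. Only the loop structure is taken from the
thread.

```c
#include <stdbool.h>
#include <time.h>

/* Gregorian leap-year rule. */
static bool is_leap_year(int year)
{
	return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Hypothetical stand-in for rtc_valid_tm(): 0 if valid, -1 otherwise. */
static int valid_tm(const struct tm *t)
{
	static const int days[12] = {
		31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
	};
	int mdays;

	if (t->tm_mon < 0 || t->tm_mon > 11)
		return -1;
	mdays = days[t->tm_mon];
	if (t->tm_mon == 1 && is_leap_year(t->tm_year + 1900))
		mdays++;
	if (t->tm_mday < 1 || t->tm_mday > mdays)
		return -1;
	if (t->tm_hour < 0 || t->tm_hour > 23 ||
	    t->tm_min < 0 || t->tm_min > 59 ||
	    t->tm_sec < 0 || t->tm_sec > 59)
		return -1;
	return 0;
}

/* Existing shape: do/while always bumps the year at least once. */
int year_rollover_do_while(struct tm t)
{
	do {
		t.tm_year++;
	} while (!is_leap_year(t.tm_year + 1900) && valid_tm(&t) != 0);
	return t.tm_year;
}

/* Patched shape: the year is bumped only while the date is invalid. */
int year_rollover_while(struct tm t)
{
	while (!is_leap_year(t.tm_year + 1900) && valid_tm(&t) != 0)
		t.tm_year++;
	return t.tm_year;
}
```

With the alarm from the report (2016-06-09 14:04:37, tm_year = 116), the
do/while shape yields tm_year = 117 and the while shape leaves it at 116,
since the filled-in date is already valid. For a Feb 29 alarm filled in with
a non-leap year, both shapes advance to the next leap year.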

>   alarm_IRQ   : no
>   alrm_pending: no
>   update IRQ enabled  : no
>   periodic IRQ enabled: no
>   periodic IRQ frequency  : 1024
>   max user IRQ frequency  : 64
>   24hr: yes
>   periodic_IRQ: no
>   update_IRQ  : no
>   HPET_emulated   : yes
>   BCD : yes
>   DST_enable  : no
>   periodic_freq   : 1024
>   batt_status : okay
> 
> 
> I added some debugging code to the kernel, saw this on the next boot:
> 
>   __rtc_read_alarm: A - alarm->time.tm_year = -1, missing = 0
>   __rtc_read_alarm: B - alarm->time.tm_year = 116, missing = 3
>   __rtc_read_alarm: C - alarm->time.tm_year = 117
> 
> 
> Corresponding to these parts of __rtc_read_alarm:
> 
>   int __rtc_read_alarm(struct rtc_device *rtc, struct rtc_wkalrm *alarm)
>   ...
>           enum { none, day, month, year } missing = none;
>   ...
>           err = rtc_read_alarm_internal(rtc, alarm);
>   ...
>           /* Fill in the missing alarm fields using the timestamp; we
>            * know there's at least one since alarm->time is invalid.
>            */
>   ...
>   [A]
>           if (alarm->time.tm_year == -1) {
>                   alarm->time.tm_year = now.tm_year;
>                   if (missing == none)
>                           missing = year;
>           }
>   [B]
>   ...
>           switch (missing) {
>   ...
>           /* Year rollover ... easy except for leap years! */
>           case year:
>                   dev_dbg(&rtc->dev, "alarm rollover: %s\n", "year");
>                   do {
>                           alarm->time.tm_year++;
>                   } while (!is_leap_year(alarm->time.tm_year + 1900)
>                           && rtc_valid_tm(&alarm->time) != 0);
>   [C]             break;
> 
> 
> I noticed that the missing year and month cases increment their
> respective time units inside a do ... while (condition) loop, pushing
> the default 'filled-in' values to now + 1.
> 
> Should this 'roll-over' code check for a valid date before incrementing
> the alarm time?  (See attached patch.)  I think this might also apply to
> a missing month field as well.
> 
> (After the patch + reboot):
> 
>   % cat /proc/driver/rtc
>   rtc_time: 18:24:02
>   rtc_date: 2016-06-09
>   alrm_time   : 14:04:37
>   alrm_date   : 2016-06-09
>   alarm_IRQ   : no
>   alrm_pending: no
>   update IRQ enabled  : no
>   periodic IRQ enabled: no
>   periodic IRQ frequency  : 1024
>   max user IRQ frequency  : 64
>   24hr: yes
>   periodic_IRQ: no
>   update_IRQ  : no
>   HPET_emulated   : yes
>   BCD : yes
>   DST_enable  : no
>   periodic_freq   : 1024
>   batt_status : okay
> 
> -- >8 --
> 
> From d6feacf20b312c8ebfee902b8b84f68c1a82f035 Mon Sep 17 00:00:00 2001
> From: Joe Lawrence 
> Date: Thu, 9 Jun 2016 14:52:28 -0400
> Subject: [PATCH] rtc: check filled-in alarm values before incrementing
> 
> In __rtc_read_alarm, check filled-in alarm->time.tm_year values (those
> not returned by the RTC and defaulted to now.tm_year) before
> incrementing them in the rollover handling case.
> 
> Signed-off-by: Joe Lawrence 
> ---
>  drivers/rtc/interface.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
> index 9ef5f6f89f98..3098ce4167ef 100644
> --- a/drivers/rtc/interface.c
> +++ b/drivers/rtc/interface.c
> @@ -258,10 +258,10 @@ int __rtc_read_alarm(struct rtc_device *rtc, struct rtc_wkalrm *alarm)
>  	/* Year rollover ... easy except for leap years! */
>  	case year:
>  		dev_dbg(&rtc->dev, "alarm rollover: %s\n", "year");
> -		do {
> +		while (!is_leap_year(alarm->time.tm_year + 1900)
> +			&& rtc_valid_tm(&alarm->time) != 0) {
>  			alarm->time.tm_year++;
> -		} while (!is_leap_year(alarm->time.tm_year + 1900)
> -			&& rtc_valid_tm(&alarm->time) != 0);
> +		}
>  		break;
>  
>  	default:
> -- 
> 1.8.3.1
> 

-- 
Alexandre Belloni, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


Re: __rtc_read_alarm missing month/year field bug?

2016-06-27 Thread Joe Lawrence
On 06/20/2016 12:06 PM, Joe Lawrence wrote:
> Hello Alessandro and Alexandre,
> 
> I noticed an interesting cmos_rtc.rtc.aie_timer on a Stratus machine
> running the 4.6 kernel, with an expiration time that puts the alarm way
> out into next year.  This is easily reproducible on this machine by
> setting a wakealarm sometime in the near future, then rebooting.
> 
> From a fresh boot:
> 
>   % cat /proc/driver/rtc
>   rtc_time: 17:55:10
>   rtc_date: 2016-06-09
>   alrm_time   : 14:04:37
>   alrm_date   : 2017-06-09 << 2017 ?
>   alarm_IRQ   : no
>   alrm_pending: no
>   update IRQ enabled  : no
>   periodic IRQ enabled: no
>   periodic IRQ frequency  : 1024
>   max user IRQ frequency  : 64
>   24hr: yes
>   periodic_IRQ: no
>   update_IRQ  : no
>   HPET_emulated   : yes
>   BCD : yes
>   DST_enable  : no
>   periodic_freq   : 1024
>   batt_status : okay
> 
> 
> I added some debugging code to the kernel, saw this on the next boot:
> 
>   __rtc_read_alarm: A - alarm->time.tm_year = -1, missing = 0
>   __rtc_read_alarm: B - alarm->time.tm_year = 116, missing = 3
>   __rtc_read_alarm: C - alarm->time.tm_year = 117
> 
> 
> Corresponding to these parts of __rtc_read_alarm:
> 
>   int __rtc_read_alarm(struct rtc_device *rtc, struct rtc_wkalrm *alarm)
>   ...
>           enum { none, day, month, year } missing = none;
>   ...
>           err = rtc_read_alarm_internal(rtc, alarm);
>   ...
>           /* Fill in the missing alarm fields using the timestamp; we
>            * know there's at least one since alarm->time is invalid.
>            */
>   ...
>   [A]
>           if (alarm->time.tm_year == -1) {
>                   alarm->time.tm_year = now.tm_year;
>                   if (missing == none)
>                           missing = year;
>           }
>   [B]
>   ...
>           switch (missing) {
>   ...
>           /* Year rollover ... easy except for leap years! */
>           case year:
>                   dev_dbg(&rtc->dev, "alarm rollover: %s\n", "year");
>                   do {
>                           alarm->time.tm_year++;
>                   } while (!is_leap_year(alarm->time.tm_year + 1900)
>                           && rtc_valid_tm(&alarm->time) != 0);
>   [C]             break;
> 
> 
> I noticed that the missing year and month cases increment their
> respective time units inside a do ... while (condition) loop, pushing
> the default 'filled-in' values to now + 1.
> 
> Should this 'roll-over' code check for a valid date before incrementing
> the alarm time?  (See attached patch.)  I think this might also apply to
> a missing month field as well.
> 
> (After the patch + reboot):
> 
>   % cat /proc/driver/rtc
>   rtc_time: 18:24:02
>   rtc_date: 2016-06-09
>   alrm_time   : 14:04:37
>   alrm_date   : 2016-06-09
>   alarm_IRQ   : no
>   alrm_pending: no
>   update IRQ enabled  : no
>   periodic IRQ enabled: no
>   periodic IRQ frequency  : 1024
>   max user IRQ frequency  : 64
>   24hr: yes
>   periodic_IRQ: no
>   update_IRQ  : no
>   HPET_emulated   : yes
>   BCD : yes
>   DST_enable  : no
>   periodic_freq   : 1024
>   batt_status : okay
> 
> -- >8 --
> 
> From d6feacf20b312c8ebfee902b8b84f68c1a82f035 Mon Sep 17 00:00:00 2001
> From: Joe Lawrence 
> Date: Thu, 9 Jun 2016 14:52:28 -0400
> Subject: [PATCH] rtc: check filled-in alarm values before incrementing
> 
> In __rtc_read_alarm, check filled-in alarm->time.tm_year values (those
> not returned by the RTC and defaulted to now.tm_year) before
> incrementing them in the rollover handling case.
> 
> Signed-off-by: Joe Lawrence 
> ---
>  drivers/rtc/interface.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
> index 9ef5f6f89f98..3098ce4167ef 100644
> --- a/drivers/rtc/interface.c
> +++ b/drivers/rtc/interface.c
> @@ -258,10 +258,10 @@ int __rtc_read_alarm(struct rtc_device *rtc, struct rtc_wkalrm *alarm)
>  	/* Year rollover ... easy except for leap years! */
>  	case year:
>  		dev_dbg(&rtc->dev, "alarm rollover: %s\n", "year");
> -		do {
> +		while (!is_leap_year(alarm->time.tm_year + 1900)
> +			&& rtc_valid_tm(&alarm->time) != 0) {
>  			alarm->time.tm_year++;
> -		} while (!is_leap_year(alarm->time.tm_year + 1900)
> -			&& rtc_valid_tm(&alarm->time) != 0);
> +		}
>  		break;
>  
>  	default:
> 

Ping?  This isn't a major problem, but it can set up endless RTC interrupts
under certain conditions on said hardware.


-- Joe

