Re: Strange issue after early AP startup
On 01/23/17 06:27, Jia-Shiun Li wrote: On Sat, Jan 21, 2017 at 3:51 PM, Hans Petter Selasky wrote: FYI: https://svnweb.freebsd.org/changeset/base/312551 Hi hps, sorry I have to correct my test results. I found that I did not revert changes to kernel config, which commented out EARLY_AP_STARTUP option, before testing you patches. So they were tested without EARLY_AP_STARTUP. I tested again your patches with kernel config reverted, and they (and head as of r312620) did not solve the c2d system time lag issue. vmstat results attached. And also my apology for this stupid mistake. This might be another issue, not related to my patch. --HPS ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
On Sat, Jan 21, 2017 at 3:51 PM, Hans Petter Selasky wrote: > FYI: > > https://svnweb.freebsd.org/changeset/base/312551 > > Hi hps, sorry I have to correct my test results. I found that I did not revert changes to kernel config, which commented out EARLY_AP_STARTUP option, before testing you patches. So they were tested without EARLY_AP_STARTUP. I tested again your patches with kernel config reverted, and they (and head as of r312620) did not solve the c2d system time lag issue. vmstat results attached. And also my apology for this stupid mistake. -Jia-Shiun. interrupt total rate ???0 0 irq1: atkbd0 2 0 stray irq1 0 0 irq0: attimer0 0 0 stray irq0 0 0 irq3: 0 0 stray irq3 0 0 irq4: uart00 0 stray irq4 0 0 irq5: 0 0 stray irq5 0 0 irq6: 0 0 stray irq6 0 0 irq7: 0 0 stray irq7 0 0 irq8: atrtc0 0 0 stray irq8 0 0 irq9: acpi00 0 stray irq9 0 0 irq10: 0 0 stray irq100 0 irq11: 0 0 stray irq110 0 irq12: 0 0 stray irq120 0 irq13: 0 0 stray irq130 0 irq14: 0 0 stray irq140 0 irq15: 0 0 stray irq150 0 irq16: em0:irq0++ 18 0 stray irq160 0 irq17: 0 0 stray irq170 0 irq18: uhci2 ehci0+ 18 0 stray irq180 0 irq19: uhci4 0 0 stray irq190 0 irq20: hpet0 12503277 stray irq200 0 irq21: uhci1 0 0 stray irq210 0 irq22: 0 0 stray irq220 0 irq23: uhci3 ehci1 0 0 stray irq230 0 cpu0:timer 0 0 cpu1:timer 0 0 irq256: hdac0 97 2 stray irq256 0 0 irq257: pcib1 0 0 stray irq257 0 0 irq258: pcib2 0 0 stray irq258 0 0 irq259: pcib3 0 0 stray irq259 0 0 irq260: re0 9049201 stray irq260 0 0 irq261: ahci0:ch0 4186 93 stray irq261 0 0 irq262: ahci0:ch1 0 0 stray irq262 0 0 irq263: ahci0:ch2 0 0 stray irq263 0 0 irq264: ahci0:ch3 0 0 stray irq264 0 0 irq265: ahci0:ch4 0 0 stray irq265 0 0 irq266: ahci0:ch5 0 0 stray irq266 0 0 irq267: ahci0:60 0 stray irq267 0 0 irq268: ahci0:70 0 stray irq268 0 0 irq269: ahci0:80 0 stray irq269 0 0 irq270: ahci0:90 0 stray irq270 0 0 irq271: ahci0:10 0 0 stray irq271 0 0 irq272: ahci0:11 0 0 stray irq272 0 0 irq273: ahci0:12 0 0 stray i
Re: Strange issue after early AP startup
FYI: https://svnweb.freebsd.org/changeset/base/312551 --HPS ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
On Thu, Jan 19, 2017 at 4:23 PM, Hans Petter Selasky wrote: > > I can add prints/asserts to show that what happens is that > "state->nextcallopt > now" while "state->nextcall <= now". This situtation > is allowed to persist due to the way getnextcpuevent() is currently > implemented. > > Can the people CC'ed give the attached patch a spin and report back? > > As far as c2d system time is concerned, it works correctly for me w/ r312210. -Jia-Shiun. ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
On 01/18/17 09:00, Hans Petter Selasky wrote: On 01/18/17 02:18, John Baldwin wrote: You might still want to adjust 'nextevent' to schedule the next interrupt to be sooner than 'timerperiod' though. You could just set 'nextevent' to 'now' in that case instead of 'next'. Right, I'll give that a spin. Would have to be "now + 1" instead of "now", due to check before et_start() ? Hi John, Here is another variant of my patch which solves the EARLY AP startup problem with timers. What do you think? > diff --git a/sys/kern/kern_clocksource.c b/sys/kern/kern_clocksource.c > index 7f7769d..8bacff6 100644 > --- a/sys/kern/kern_clocksource.c > +++ b/sys/kern/kern_clocksource.c > @@ -207,7 +207,7 @@ handleevents(sbintime_t now, int fake) > } > } else > state->nextprof = state->nextstat; > - if (now >= state->nextcallopt) { > + if (now >= state->nextcallopt || now >= state->nextcall) { > state->nextcall = state->nextcallopt = SBT_MAX; > callout_process(now); > } I can add prints/asserts to show that what happens is that "state->nextcallopt > now" while "state->nextcall <= now". This situtation is allowed to persist due to the way getnextcpuevent() is currently implemented. Can the people CC'ed give the attached patch a spin and report back? --HPS diff --git a/sys/kern/kern_clocksource.c b/sys/kern/kern_clocksource.c index 7f7769d..8bacff6 100644 --- a/sys/kern/kern_clocksource.c +++ b/sys/kern/kern_clocksource.c @@ -207,7 +207,7 @@ handleevents(sbintime_t now, int fake) } } else state->nextprof = state->nextstat; - if (now >= state->nextcallopt) { + if (now >= state->nextcallopt || now >= state->nextcall) { state->nextcall = state->nextcallopt = SBT_MAX; callout_process(now); } ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
In message <1922021.4hjeqfj...@ralph.baldwin.cx>, John Baldwin writes: > On Tuesday, January 17, 2017 05:08:58 PM Cy Schubert wrote: > > In message <1492450.xzfnz8z...@ralph.baldwin.cx>, John Baldwin writes: > > > On Tuesday, January 17, 2017 12:53:19 PM Cy Schubert wrote: > > > > In message , Hans Pet > ter > > > > Sela > > > > sky writes: > > > > > Hi, > > > > > > > > > > When booting I observe an additional 30-second delay after this print > : > > > > > > > > > > > Timecounters tick every 1.000 msec > > > > > > > > > > ~30 second delay and boot continues like normal. > > > > > > > > > > Checking "vmstat -i" reveals that some timers have been running loose > . > > > > > > > > > > > cpu0:timer 44300442 > > > > > > cpu1:timer 40561404 > > > > > > cpu3:timer 48462822 483058 > > > > > > cpu2:timer 48477898 483209 > > > > > > > > > > Trying to add delays and/or prints around the Timecounters printout > > > > > makes the issue go away. Any ideas for debugging? > > > > > > > > > > Looks like a startup race to me. > > > > > > > > just picking a random email to reply to, I'm seeing a different issue w > ith > > > > early AP startup. It affects one of my four machines, my laptop. My thr > ee > > > > server systems downstairs have no problem however my laptop will reboot > > > > > repeatedly at: > > > > > > > > Jan 17 11:55:16 slippy kernel: cd0: Attempt to query device size failed > : > > > > NOT READY, Medium not present - tray closed > > > > > > So it panics and reboots after this? > > > > Yes, it goes into a panic/reboot loop for a few iterations until it > > successfully boots. Disabling early AP startup allows it to boot up without > > > the assumed race. > > Can you add DDB to the kernel config (and remove DDB_UNATTENDED) to get it > to break into DDB when it panics to get the panic message (and a stack trace > as well)? I found and fixed the problem. It was in some code I had added a long time ago but not committed yet to the bge driver to implement WOL. It was a lock assertion. -- Cheers, Cy Schubert FreeBSD UNIX: Web: http://www.FreeBSD.org The need of the many outweighs the greed of the few. ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
On Wednesday, January 18, 2017 09:00:52 AM Hans Petter Selasky wrote: > On 01/18/17 02:18, John Baldwin wrote: > > You might still want to adjust 'nextevent' to schedule the next interrupt > > to be sooner than 'timerperiod' though. You could just set 'nextevent' to > > 'now' in that case instead of 'next'. > > Right, I'll give that a spin. Would have to be "now + 1" instead of > "now", due to check before et_start() ? Ugh, ok. + 1 with sbintime_t is kind of odd which is why I would have liked to avoid it, but it seems it is required. -- John Baldwin ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
On 01/18/17 02:18, John Baldwin wrote: You might still want to adjust 'nextevent' to schedule the next interrupt to be sooner than 'timerperiod' though. You could just set 'nextevent' to 'now' in that case instead of 'next'. Right, I'll give that a spin. Would have to be "now + 1" instead of "now", due to check before et_start() ? --HPS ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
On 01/18/17 02:18, John Baldwin wrote: Note that 'nextevent' remains a full 'timerperiod' out (now + timerperiod) and so the first clock interrupt is still 'timerperiod' time away and any callouts are delayed by that amount of time. Also, I think you could set nextcallopt to 'now' rather than 'now + 1'. Hi, Does that mean the following piece of code is missing from getnextevent(): /* Handle callout events. */ if (event > state->nextcall) event = state->nextcall; Like getnextcpuevent() is doing? --HPS ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
Hi, On 01/18/17 02:18, John Baldwin wrote: Also, I think you could set nextcallopt to 'now' rather than 'now + 1'. There is a check in loadtimer() if next == now, and then the event timer is not started ?? } else { new = getnextevent(); eq = (new == *next); CTR4(KTR_SPARE2, "load at %d:next %d.%08x eq %d", curcpu, (int)(new >> 32), (u_int)(new & 0x), eq); if (!eq) { *next = new; et_start(timer, new - now, 0); } } --HPS ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
On Tuesday, January 17, 2017 05:08:58 PM Cy Schubert wrote: > In message <1492450.xzfnz8z...@ralph.baldwin.cx>, John Baldwin writes: > > On Tuesday, January 17, 2017 12:53:19 PM Cy Schubert wrote: > > > In message , Hans > > > Petter > > > Sela > > > sky writes: > > > > Hi, > > > > > > > > When booting I observe an additional 30-second delay after this print: > > > > > > > > > Timecounters tick every 1.000 msec > > > > > > > > ~30 second delay and boot continues like normal. > > > > > > > > Checking "vmstat -i" reveals that some timers have been running loose. > > > > > > > > > cpu0:timer 44300442 > > > > > cpu1:timer 40561404 > > > > > cpu3:timer 48462822 483058 > > > > > cpu2:timer 48477898 483209 > > > > > > > > Trying to add delays and/or prints around the Timecounters printout > > > > makes the issue go away. Any ideas for debugging? > > > > > > > > Looks like a startup race to me. > > > > > > just picking a random email to reply to, I'm seeing a different issue > > > with > > > early AP startup. It affects one of my four machines, my laptop. My three > > > server systems downstairs have no problem however my laptop will reboot > > > repeatedly at: > > > > > > Jan 17 11:55:16 slippy kernel: cd0: Attempt to query device size failed: > > > NOT READY, Medium not present - tray closed > > > > So it panics and reboots after this? > > Yes, it goes into a panic/reboot loop for a few iterations until it > successfully boots. Disabling early AP startup allows it to boot up without > the assumed race. Can you add DDB to the kernel config (and remove DDB_UNATTENDED) to get it to break into DDB when it panics to get the panic message (and a stack trace as well)? -- John Baldwin ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
On Tuesday, January 17, 2017 10:35:06 PM Hans Petter Selasky wrote: > On 01/17/17 22:28, Hans Petter Selasky wrote: > > + state->nextcall = SBT_MAX; > > + state->nextcallopt = now + 1; > > BTW: What locks are protecting the update of these fields? Can they be > written simultaneously by configtimer() and cpu_new_callout()? Both functions do ET_HW_LOCK() of DPCPU_PTR(timerstate). -- John Baldwin ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
On Tuesday, January 17, 2017 10:28:47 PM Hans Petter Selasky wrote: > On 01/17/17 20:46, Ian Lepore wrote: > >>> Does this matter for the first tick? How often is configtimer() called? > >> > > >> > As I said, it is called at runtime when profclock is started / stopped, > >> > not > >> > just at boot. Those changes at runtime probably have existing callouts > >> > active and your change will not process any callouts until the next > >> > hardclock > >> > tick fires (but only because you are setting nextcallopt to the bogus > >> > 'next' value). > > On some platforms, configtimer() can be called quite often. Power > > saving modes can change the frequency of the timer, and systems that > > suppport such dynamic frequency scaling call configtimer() > > (via cpu_et_frequency()) to handle the changes. > > Hi, > > I propose the following patch then: > > diff --git a/sys/kern/kern_clocksource.c b/sys/kern/kern_clocksource.c > index 7f7769d..5ae925b 100644 > --- a/sys/kern/kern_clocksource.c > +++ b/sys/kern/kern_clocksource.c > @@ -511,8 +511,13 @@ configtimer(int start) > state->nexthard = next; > state->nextstat = next; > state->nextprof = next; > - state->nextcall = next; > - state->nextcallopt = next; > + /* > +* Force callout_process() to be called > +* instantly, so that the correct value of > +* "nextcall" can be computed: > +*/ > + state->nextcall = SBT_MAX; > + state->nextcallopt = now + 1; > hardclock_sync(cpu); > } > busy = 0; > > > Then there is no problem having to wait for the next tick or anything, > like John Baldwin pointed out. Note that 'nextevent' remains a full 'timerperiod' out (now + timerperiod) and so the first clock interrupt is still 'timerperiod' time away and any callouts are delayed by that amount of time. Also, I think you could set nextcallopt to 'now' rather than 'now + 1'. You might still want to adjust 'nextevent' to schedule the next interrupt to be sooner than 'timerperiod' though. You could just set 'nextevent' to 'now' in that case instead of 'next'. -- John Baldwin ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
In message <1492450.xzfnz8z...@ralph.baldwin.cx>, John Baldwin writes: > On Tuesday, January 17, 2017 12:53:19 PM Cy Schubert wrote: > > In message , Hans Petter > > Sela > > sky writes: > > > Hi, > > > > > > When booting I observe an additional 30-second delay after this print: > > > > > > > Timecounters tick every 1.000 msec > > > > > > ~30 second delay and boot continues like normal. > > > > > > Checking "vmstat -i" reveals that some timers have been running loose. > > > > > > > cpu0:timer 44300442 > > > > cpu1:timer 40561404 > > > > cpu3:timer 48462822 483058 > > > > cpu2:timer 48477898 483209 > > > > > > Trying to add delays and/or prints around the Timecounters printout > > > makes the issue go away. Any ideas for debugging? > > > > > > Looks like a startup race to me. > > > > just picking a random email to reply to, I'm seeing a different issue with > > early AP startup. It affects one of my four machines, my laptop. My three > > server systems downstairs have no problem however my laptop will reboot > > repeatedly at: > > > > Jan 17 11:55:16 slippy kernel: cd0: Attempt to query device size failed: > > NOT READY, Medium not present - tray closed > > So it panics and reboots after this? Yes, it goes into a panic/reboot loop for a few iterations until it successfully boots. Disabling early AP startup allows it to boot up without the assumed race. -- Cheers, Cy Schubert FreeBSD UNIX: Web: http://www.FreeBSD.org The need of the many outweighs the greed of the few. ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
On Tuesday, January 17, 2017 12:53:19 PM Cy Schubert wrote: > In message , Hans Petter > Sela > sky writes: > > Hi, > > > > When booting I observe an additional 30-second delay after this print: > > > > > Timecounters tick every 1.000 msec > > > > ~30 second delay and boot continues like normal. > > > > Checking "vmstat -i" reveals that some timers have been running loose. > > > > > cpu0:timer 44300442 > > > cpu1:timer 40561404 > > > cpu3:timer 48462822 483058 > > > cpu2:timer 48477898 483209 > > > > Trying to add delays and/or prints around the Timecounters printout > > makes the issue go away. Any ideas for debugging? > > > > Looks like a startup race to me. > > just picking a random email to reply to, I'm seeing a different issue with > early AP startup. It affects one of my four machines, my laptop. My three > server systems downstairs have no problem however my laptop will reboot > repeatedly at: > > Jan 17 11:55:16 slippy kernel: cd0: Attempt to query device size failed: > NOT READY, Medium not present - tray closed So it panics and reboots after this? -- John Baldwin ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
On Tuesday, January 17, 2017 08:31:28 PM Hans Petter Selasky wrote: > Hi, > > On 01/17/17 20:00, John Baldwin wrote: > >> > >> Does this matter for the first tick? How often is configtimer() called? > > > > As I said, it is called at runtime when profclock is started / stopped, not > > just at boot. Those changes at runtime probably have existing callouts > > active and your change will not process any callouts until the next > > hardclock > > tick fires (but only because you are setting nextcallopt to the bogus > > 'next' value). > > > > >> > (One odd thing is that even in your case the first call to > >> handleevents(), > >> > the 'now => state->nextcallout' check in handleevents() should be true > >> > which resets both nextcall and nextcallopt and invokes > >> callout_process().) > >> > >> Let me take you through the failure path, by code inspection: > > > > I would really appreciate it if you could add traces to find out what > > actually > > happens rather than what seems to happen by looking at the code. :-/ > > The problem is that once you add some prints, the problem goes away. > Maybe I should try to set hz to 100 or 25 ??? Maybe use KTR instead and then have a kdb_enter() later in boot so you can use 'show ktr' in DDB. (KTR has less overhead than printfs so might not disrupt the timing as badly.) > > > > 0) cpu_initclocks_bsp() is called and init's nextcall and nexcallopt to > > SBT_MAX > >similar to your change. If no callout is scheduled before configtimer() > >then they remain set to SBT_MAX. Your current patch happens to trigger a > >(bogus) call to callout_process() on the first hardclock() because it > >sets nextcallopt to 'next' even though no callout is actually scheduled > > to > >fire at time 'next'. > > > >> 1) configtimer() is called and we init nextcall and nextcallopt: > >> > >> > next = now + timerperiod; > >> ... > >> > state->nextcall = next; > >> > state->nextcallopt = next; > > > > These both say "the next callout() should fire at 'next' which is the time > > of > > the next hardclock()", even though there may be no callouts scheduled (in > > which > > case both of these fields should be set to SBT_MAX from the call to > > cpu_initclocks_bsp(), or there may be callouts scheduled in which case > > 'nextcall' > > and 'nextcallopt' will reflect the time that those callouts are already > > scheduled for and this overwrites that). > > I see there are some callouts scheduled by SYSINITs, before the first > configtimer(), like NFS_TIMERINIT in nfs_init(). These are setup using > "dummy_timecounter" which means any nextcall values before the first > configtimer should be discarded. Hmm, I actually tested early callouts by having callouts scheduled for 1, 4, and 8 seconds right after callouts were initialized and they worked correctly (albeit using lapic timer as the eventtimer and TSC as the timecounter). By the time they were called though, sbinuptime() was not returning dummy values, but real ones (probably because TSC gets added as a timecounter in SI_SUB_CPU). The patch I used for testing is still in my work branch here: https://github.com/freebsd/freebsd/compare/master...bsdjhb:early_callout So in at least some cases you don't have to discard nextcall during boot. However, if TSC isn't available you might not get a timecounter until later in boot during device probe in which case you would get dummy timecounter, and then the nextcall/nextcallopt values aren't great. Hmmm, I wonder if just bumping nextcall to be 'now' in case it is less than 'now' would be sufficient. I think my previous patch still looped even though it might have set 'next_event' correctly because the 'nextcall' value was still too small. That is: Index: kern_clocksource.c === --- kern_clocksource.c (revision 312301) +++ kern_clocksource.c (working copy) @@ -498,12 +498,18 @@ configtimer(int start) CPU_FOREACH(cpu) { state = DPCPU_ID_PTR(cpu, timerstate); state->now = now; + printf("%s: CPU %d: now %jd nextcall %jd nextcallopt %jd next %jd\n", __func__, cpu, state->nextcall, state->nextcall, next); + if (state->nextcall < now) + state->nextcall = now; #ifndef EARLY_AP_STARTUP if (!smp_started && cpu != CPU_FIRST()) state->nextevent = SBT_MAX; else #endif + if (next < state->nextcall) state->nextevent = next; + else + state->nextevent = state->nextcall; if (periodic) state->nexttick = next; else @@ -511,8 +517,6 @@ configtimer(
Re: Strange issue after early AP startup
On 01/17/17 22:28, Hans Petter Selasky wrote: + state->nextcall = SBT_MAX; + state->nextcallopt = now + 1; BTW: What locks are protecting the update of these fields? Can they be written simultaneously by configtimer() and cpu_new_callout()? --HPS ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
On 01/17/17 20:46, Ian Lepore wrote: Does this matter for the first tick? How often is configtimer() called? > > As I said, it is called at runtime when profclock is started / stopped, not > just at boot. Those changes at runtime probably have existing callouts > active and your change will not process any callouts until the next hardclock > tick fires (but only because you are setting nextcallopt to the bogus > 'next' value). On some platforms, configtimer() can be called quite often. Power saving modes can change the frequency of the timer, and systems that suppport such dynamic frequency scaling call configtimer() (via cpu_et_frequency()) to handle the changes. Hi, I propose the following patch then: diff --git a/sys/kern/kern_clocksource.c b/sys/kern/kern_clocksource.c index 7f7769d..5ae925b 100644 --- a/sys/kern/kern_clocksource.c +++ b/sys/kern/kern_clocksource.c @@ -511,8 +511,13 @@ configtimer(int start) state->nexthard = next; state->nextstat = next; state->nextprof = next; - state->nextcall = next; - state->nextcallopt = next; + /* +* Force callout_process() to be called +* instantly, so that the correct value of +* "nextcall" can be computed: +*/ + state->nextcall = SBT_MAX; + state->nextcallopt = now + 1; hardclock_sync(cpu); } busy = 0; Then there is no problem having to wait for the next tick or anything, like John Baldwin pointed out. --HPS ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
In message , Hans Petter Sela sky writes: > Hi, > > When booting I observe an additional 30-second delay after this print: > > > Timecounters tick every 1.000 msec > > ~30 second delay and boot continues like normal. > > Checking "vmstat -i" reveals that some timers have been running loose. > > > cpu0:timer 44300442 > > cpu1:timer 40561404 > > cpu3:timer 48462822 483058 > > cpu2:timer 48477898 483209 > > Trying to add delays and/or prints around the Timecounters printout > makes the issue go away. Any ideas for debugging? > > Looks like a startup race to me. just picking a random email to reply to, I'm seeing a different issue with early AP startup. It affects one of my four machines, my laptop. My three server systems downstairs have no problem however my laptop will reboot repeatedly at: Jan 17 11:55:16 slippy kernel: cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed Then finally boot after a number of reboots (0-N), it finally boots. Disabling early AP start allows it to boot past that point first time. -- Cheers, Cy Schubert FreeBSD UNIX: Web: http://www.FreeBSD.org The need of the many outweighs the greed of the few. ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
On Tue, 2017-01-17 at 11:00 -0800, John Baldwin wrote: > > > You could > > > do that by setting it to 'cc_firstevent' of the associated CPU, but in > > > practice 'state->nextcall' should already be set to that (it is > > initalized > > > to SBT_MAX in cpu_initclocks_bsp() and is then only set to other > > values due > > > to cpu_new_callout()). Keep in mind that configtimer() is not just > > called > > > from boot, but is also invoked when starting/stopping the profiling > > timer. > > > > > > > > However, when setting 'nextevent' (which is used to schedule the next > > timer > > > interrupt), we should be honoring the existing 'nextcall' if it is sooner > > > than the next hardclock. > > > > Does this matter for the first tick? How often is configtimer() called? > > As I said, it is called at runtime when profclock is started / stopped, not > just at boot. Those changes at runtime probably have existing callouts > active and your change will not process any callouts until the next hardclock > tick fires (but only because you are setting nextcallopt to the bogus > 'next' value). On some platforms, configtimer() can be called quite often. Power saving modes can change the frequency of the timer, and systems that suppport such dynamic frequency scaling call configtimer() (via cpu_et_frequency()) to handle the changes. -- Ian ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
Hi, On 01/17/17 20:00, John Baldwin wrote: Does this matter for the first tick? How often is configtimer() called? As I said, it is called at runtime when profclock is started / stopped, not just at boot. Those changes at runtime probably have existing callouts active and your change will not process any callouts until the next hardclock tick fires (but only because you are setting nextcallopt to the bogus 'next' value). > (One odd thing is that even in your case the first call to handleevents(), > the 'now => state->nextcallout' check in handleevents() should be true > which resets both nextcall and nextcallopt and invokes callout_process().) Let me take you through the failure path, by code inspection: I would really appreciate it if you could add traces to find out what actually happens rather than what seems to happen by looking at the code. :-/ The problem is that once you add some prints, the problem goes away. Maybe I should try to set hz to 100 or 25 ??? 0) cpu_initclocks_bsp() is called and init's nextcall and nexcallopt to SBT_MAX similar to your change. If no callout is scheduled before configtimer() then they remain set to SBT_MAX. Your current patch happens to trigger a (bogus) call to callout_process() on the first hardclock() because it sets nextcallopt to 'next' even though no callout is actually scheduled to fire at time 'next'. 1) configtimer() is called and we init nextcall and nextcallopt: > next = now + timerperiod; ... > state->nextcall = next; > state->nextcallopt = next; These both say "the next callout() should fire at 'next' which is the time of the next hardclock()", even though there may be no callouts scheduled (in which case both of these fields should be set to SBT_MAX from the call to cpu_initclocks_bsp(), or there may be callouts scheduled in which case 'nextcall' and 'nextcallopt' will reflect the time that those callouts are already scheduled for and this overwrites that). I see there are some callouts scheduled by SYSINITs, before the first configtimer(), like NFS_TIMERINIT in nfs_init(). These are setup using "dummy_timecounter" which means any nextcall values before the first configtimer should be discarded. 2) Any callout_reset() calls cpu_new_callout(): > */ > state->nextcallopt = bt_opt; > if (bt >= state->nextcall) We follow this path, because "bt" is surely based on sbinuptime() and is greater or equal to state->nextcall. Note that state->nextcallopt is updated to bt_opt, which is in the future. Note, my patch should _leave_ nextcall at SBT_MAX (from cpu_initclocks_bsp()) unless there was already an earlier call to callout_reset(). Yes, there are calls to callout_reset(). See for example NFS_TIMERINIT, like mentioned above. > IOW, it should be a NOP for the purposes of this branch compared with your change. (You could add a warning to print out if 'nextcall' != SBT_MAX during boot and see if it fires for example.) > goto done; > state->nextcall = bt; 3) getnextcpuevent(0) is called by the fast timercb() to setup the next event: > state = DPCPU_PTR(timerstate); > /* Handle hardclock() events, skipping some if CPU is idle. */ > event = state->nexthard; ... > /* Handle callout events. */ > if (event > state->nextcall) We then go looping into this path, because state->nextcall is still equal to "next" as in step 1) which is now in the past, until "now >= state->nextcallopt" inside handleevents(), which clears this condition. > event = state->nextcall; ... > return (event); I'm curious if there is a callout_reset() that has set 'nextcall' to a time that is effectively before 'now'. Maybe add a printf like this: Index: kern_clocksource.c === --- kern_clocksource.c (revision 312301) +++ kern_clocksource.c (working copy) @@ -498,12 +498,18 @@ configtimer(int start) CPU_FOREACH(cpu) { state = DPCPU_ID_PTR(cpu, timerstate); state->now = now; + printf("%s: CPU %d: now %jd nextcall %jd nextcallopt %jd next %jd\n", __func__, cpu, state->nextcall, state->nextcall, next); #ifndef EARLY_AP_STARTUP if (!smp_started && cpu != CPU_FIRST()) state->nextevent = SBT_MAX; else #endif In particular what I am worried about with your patch is that for post-boot calls to configtimer() you will delay any previously-scheduled callouts until the next hardclock. I understand. Would a solution be to refactor callout_process(), to accept the PCPU_GET(CPUID) as an argument and be executed for all CPUs by configtimer(), instead of trying to guess state->nextcall and state->nextcallopt in configtimer() ?
Re: Strange issue after early AP startup
On Tuesday, January 17, 2017 07:04:15 PM Hans Petter Selasky wrote: > On 01/17/17 16:50, John Baldwin wrote: > > On Monday, January 16, 2017 10:10:16 PM Hans Petter Selasky wrote: > >> On 01/16/17 20:31, John Baldwin wrote: > >>> On Monday, January 16, 2017 04:51:42 PM Hans Petter Selasky wrote: > Hi, > > When booting I observe an additional 30-second delay after this print: > > > Timecounters tick every 1.000 msec > > ~30 second delay and boot continues like normal. > > Checking "vmstat -i" reveals that some timers have been running loose. > > > cpu0:timer 44300442 > > cpu1:timer 40561404 > > cpu3:timer 48462822 483058 > > cpu2:timer 48477898 483209 > > Trying to add delays and/or prints around the Timecounters printout > makes the issue go away. Any ideas for debugging? > >>> > >>> I have generally used KTR tracing to trace what is happening during > >>> boot to debug EARLY_AP_STARTUP issues. > >>> > >> > >> Hi John, > >> > >> What happens is that getnextcpuevent(0) keeps on returning > >> "state->nextcall" which is in the past for CPU #2 and #3 on my box. > >> > >> In "cpu_new_callout()" there is a check if "bt >= state->nextcall", > >> which I suspect is true, so "state->nextcall" never gets set to real > >> minimum sbintime. > >> > >> The attached patch fixes the problem for me, but I'm not 100% sure > if it > >> is correct. > > > > Hi, > > > I think we want to be honoring any currently scheduled callouts. > > The problem here is that we might be changing the clocksource, then > sbinuptime() will change too, so I think the value should be reset by > configtimer() and then corrected at the next call to callout_process(). > > > You could > > do that by setting it to 'cc_firstevent' of the associated CPU, but in > > practice 'state->nextcall' should already be set to that (it is > initalized > > to SBT_MAX in cpu_initclocks_bsp() and is then only set to other > values due > > to cpu_new_callout()). Keep in mind that configtimer() is not just > called > > from boot, but is also invoked when starting/stopping the profiling > timer. > > > > > However, when setting 'nextevent' (which is used to schedule the next > timer > > interrupt), we should be honoring the existing 'nextcall' if it is sooner > > than the next hardclock. > > Does this matter for the first tick? How often is configtimer() called? As I said, it is called at runtime when profclock is started / stopped, not just at boot. Those changes at runtime probably have existing callouts active and your change will not process any callouts until the next hardclock tick fires (but only because you are setting nextcallopt to the bogus 'next' value). > > (One odd thing is that even in your case the first call to > handleevents(), > > the 'now => state->nextcallout' check in handleevents() should be true > > which resets both nextcall and nextcallopt and invokes > callout_process().) > > Let me take you through the failure path, by code inspection: I would really appreciate it if you could add traces to find out what actually happens rather than what seems to happen by looking at the code. :-/ 0) cpu_initclocks_bsp() is called and init's nextcall and nexcallopt to SBT_MAX similar to your change. If no callout is scheduled before configtimer() then they remain set to SBT_MAX. Your current patch happens to trigger a (bogus) call to callout_process() on the first hardclock() because it sets nextcallopt to 'next' even though no callout is actually scheduled to fire at time 'next'. > 1) configtimer() is called and we init nextcall and nextcallopt: > > > next = now + timerperiod; > ... > > state->nextcall = next; > > state->nextcallopt = next; These both say "the next callout() should fire at 'next' which is the time of the next hardclock()", even though there may be no callouts scheduled (in which case both of these fields should be set to SBT_MAX from the call to cpu_initclocks_bsp(), or there may be callouts scheduled in which case 'nextcall' and 'nextcallopt' will reflect the time that those callouts are already scheduled for and this overwrites that). > 2) Any callout_reset() calls cpu_new_callout(): > > > */ > > state->nextcallopt = bt_opt; > > if (bt >= state->nextcall) > We follow this path, because "bt" is surely based on sbinuptime() and is > greater or equal to state->nextcall. Note that state->nextcallopt is > updated to bt_opt, which is in the future. Note, my patch should _leave_ nextcall at SBT_MAX (from cpu_initclocks_bsp()) unless there was already an earlier call to callout_reset(). IOW, it should be a NOP for the purposes of this branch compared with your change. (
Re: Strange issue after early AP startup
On 01/17/17 16:50, John Baldwin wrote: > On Monday, January 16, 2017 10:10:16 PM Hans Petter Selasky wrote: >> On 01/16/17 20:31, John Baldwin wrote: >>> On Monday, January 16, 2017 04:51:42 PM Hans Petter Selasky wrote: Hi, When booting I observe an additional 30-second delay after this print: > Timecounters tick every 1.000 msec ~30 second delay and boot continues like normal. Checking "vmstat -i" reveals that some timers have been running loose. > cpu0:timer 44300442 > cpu1:timer 40561404 > cpu3:timer 48462822 483058 > cpu2:timer 48477898 483209 Trying to add delays and/or prints around the Timecounters printout makes the issue go away. Any ideas for debugging? >>> >>> I have generally used KTR tracing to trace what is happening during >>> boot to debug EARLY_AP_STARTUP issues. >>> >> >> Hi John, >> >> What happens is that getnextcpuevent(0) keeps on returning >> "state->nextcall" which is in the past for CPU #2 and #3 on my box. >> >> In "cpu_new_callout()" there is a check if "bt >= state->nextcall", >> which I suspect is true, so "state->nextcall" never gets set to real >> minimum sbintime. >> >> The attached patch fixes the problem for me, but I'm not 100% sure if it >> is correct. > Hi, > I think we want to be honoring any currently scheduled callouts. The problem here is that we might be changing the clocksource, then sbinuptime() will change too, so I think the value should be reset by configtimer() and then corrected at the next call to callout_process(). > You could > do that by setting it to 'cc_firstevent' of the associated CPU, but in > practice 'state->nextcall' should already be set to that (it is initalized > to SBT_MAX in cpu_initclocks_bsp() and is then only set to other values due > to cpu_new_callout()). Keep in mind that configtimer() is not just called > from boot, but is also invoked when starting/stopping the profiling timer. > > However, when setting 'nextevent' (which is used to schedule the next timer > interrupt), we should be honoring the existing 'nextcall' if it is sooner > than the next hardclock. Does this matter for the first tick? How often is configtimer() called? > (One odd thing is that even in your case the first call to handleevents(), > the 'now => state->nextcallout' check in handleevents() should be true > which resets both nextcall and nextcallopt and invokes callout_process().) Let me take you through the failure path, by code inspection: 1) configtimer() is called and we init nextcall and nextcallopt: > next = now + timerperiod; ... > state->nextcall = next; > state->nextcallopt = next; 2) Any callout_reset() calls cpu_new_callout(): > */ > state->nextcallopt = bt_opt; > if (bt >= state->nextcall) We follow this path, because "bt" is surely based on sbinuptime() and is greater or equal to state->nextcall. Note that state->nextcallopt is updated to bt_opt, which is in the future. > goto done; > state->nextcall = bt; 3) getnextcpuevent(0) is called by the fast timercb() to setup the next event: > state = DPCPU_PTR(timerstate); > /* Handle hardclock() events, skipping some if CPU is idle. */ > event = state->nexthard; ... > /* Handle callout events. */ > if (event > state->nextcall) We then go looping into this path, because state->nextcall is still equal to "next" as in step 1) which is now in the past, until "now >= state->nextcallopt" inside handleevents(), which clears this condition. > event = state->nextcall; ... > return (event); --HPS ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
On 01/17/17 16:50, John Baldwin wrote: Index: kern_clocksource.c === --- kern_clocksource.c (revision 312301) +++ kern_clocksource.c (working copy) @@ -503,7 +503,12 @@ configtimer(int start) state->nextevent = SBT_MAX; else #endif + if (next < state->nextcall) state->nextevent = next; + else if (state->nextcall < now) + state->nextevent = now; + else + state->nextevent = state->nextcall; if (periodic) state->nexttick = next; else @@ -511,8 +516,6 @@ configtimer(int start) state->nexthard = next; state->nextstat = next; state->nextprof = next; - state->nextcall = next; - state->nextcallopt = next; hardclock_sync(cpu); } busy = 0; This patch makes it worse. Now I don't even reach the login prompt. --HPS ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
On 01/17/17 16:50, John Baldwin wrote: (One odd thing is that even in your case the first call to handleevents(), the 'now => state->nextcallout' check in handleevents() should be true which resets both nextcall and nextcallopt and invokes callout_process().) Hi, I suspect the cpu_new_callout() function is changing this condition after a callout_reset() call, before handleevents() gets a chance to run. I'll give your patch a spin right away and let you know how it goes. --HPS ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
On Monday, January 16, 2017 10:10:16 PM Hans Petter Selasky wrote: > On 01/16/17 20:31, John Baldwin wrote: > > On Monday, January 16, 2017 04:51:42 PM Hans Petter Selasky wrote: > >> Hi, > >> > >> When booting I observe an additional 30-second delay after this print: > >> > >>> Timecounters tick every 1.000 msec > >> > >> ~30 second delay and boot continues like normal. > >> > >> Checking "vmstat -i" reveals that some timers have been running loose. > >> > >>> cpu0:timer 44300442 > >>> cpu1:timer 40561404 > >>> cpu3:timer 48462822 483058 > >>> cpu2:timer 48477898 483209 > >> > >> Trying to add delays and/or prints around the Timecounters printout > >> makes the issue go away. Any ideas for debugging? > > > > I have generally used KTR tracing to trace what is happening during > > boot to debug EARLY_AP_STARTUP issues. > > > > Hi John, > > What happens is that getnextcpuevent(0) keeps on returning > "state->nextcall" which is in the past for CPU #2 and #3 on my box. > > In "cpu_new_callout()" there is a check if "bt >= state->nextcall", > which I suspect is true, so "state->nextcall" never gets set to real > minimum sbintime. > > The attached patch fixes the problem for me, but I'm not 100% sure if it > is correct. I think we want to be honoring any currently scheduled callouts. You could do that by setting it to 'cc_firstevent' of the associated CPU, but in practice 'state->nextcall' should already be set to that (it is initalized to SBT_MAX in cpu_initclocks_bsp() and is then only set to other values due to cpu_new_callout()). Keep in mind that configtimer() is not just called from boot, but is also invoked when starting/stopping the profiling timer. However, when setting 'nextevent' (which is used to schedule the next timer interrupt), we should be honoring the existing 'nextcall' if it is sooner than the next hardclock. (One odd thing is that even in your case the first call to handleevents(), the 'now => state->nextcallout' check in handleevents() should be true which resets both nextcall and nextcallopt and invokes callout_process().) Here is a suggestion that attempts what I described in the first paragraph. If you still get hangs it would be good to break into DDB and capture the output of 'show clocksource'. Index: kern_clocksource.c === --- kern_clocksource.c (revision 312301) +++ kern_clocksource.c (working copy) @@ -503,7 +503,12 @@ configtimer(int start) state->nextevent = SBT_MAX; else #endif + if (next < state->nextcall) state->nextevent = next; + else if (state->nextcall < now) + state->nextevent = now; + else + state->nextevent = state->nextcall; if (periodic) state->nexttick = next; else @@ -511,8 +516,6 @@ configtimer(int start) state->nexthard = next; state->nextstat = next; state->nextprof = next; - state->nextcall = next; - state->nextcallopt = next; hardclock_sync(cpu); } busy = 0; -- John Baldwin ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
On 01/16/17 20:31, John Baldwin wrote: On Monday, January 16, 2017 04:51:42 PM Hans Petter Selasky wrote: Hi, When booting I observe an additional 30-second delay after this print: Timecounters tick every 1.000 msec ~30 second delay and boot continues like normal. Checking "vmstat -i" reveals that some timers have been running loose. cpu0:timer 44300442 cpu1:timer 40561404 cpu3:timer 48462822 483058 cpu2:timer 48477898 483209 Trying to add delays and/or prints around the Timecounters printout makes the issue go away. Any ideas for debugging? I have generally used KTR tracing to trace what is happening during boot to debug EARLY_AP_STARTUP issues. Hi John, What happens is that getnextcpuevent(0) keeps on returning "state->nextcall" which is in the past for CPU #2 and #3 on my box. In "cpu_new_callout()" there is a check if "bt >= state->nextcall", which I suspect is true, so "state->nextcall" never gets set to real minimum sbintime. The attached patch fixes the problem for me, but I'm not 100% sure if it is correct. --HPS diff --git a/sys/kern/kern_clocksource.c b/sys/kern/kern_clocksource.c index 7f7769d..454a130 100644 --- a/sys/kern/kern_clocksource.c +++ b/sys/kern/kern_clocksource.c @@ -511,7 +511,7 @@ configtimer(int start) state->nexthard = next; state->nextstat = next; state->nextprof = next; - state->nextcall = next; + state->nextcall = SBT_MAX; state->nextcallopt = next; hardclock_sync(cpu); } ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: Strange issue after early AP startup
On Monday, January 16, 2017 04:51:42 PM Hans Petter Selasky wrote: > Hi, > > When booting I observe an additional 30-second delay after this print: > > > Timecounters tick every 1.000 msec > > ~30 second delay and boot continues like normal. > > Checking "vmstat -i" reveals that some timers have been running loose. > > > cpu0:timer 44300442 > > cpu1:timer 40561404 > > cpu3:timer 48462822 483058 > > cpu2:timer 48477898 483209 > > Trying to add delays and/or prints around the Timecounters printout > makes the issue go away. Any ideas for debugging? I have generally used KTR tracing to trace what is happening during boot to debug EARLY_AP_STARTUP issues. -- John Baldwin ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Strange issue after early AP startup
Hi, When booting I observe an additional 30-second delay after this print: Timecounters tick every 1.000 msec ~30 second delay and boot continues like normal. Checking "vmstat -i" reveals that some timers have been running loose. cpu0:timer 44300442 cpu1:timer 40561404 cpu3:timer 48462822 483058 cpu2:timer 48477898 483209 Trying to add delays and/or prints around the Timecounters printout makes the issue go away. Any ideas for debugging? Looks like a startup race to me. --HPS ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"