Re: Strange issue after early AP startup
Hi,

On 01/18/17 02:18, John Baldwin wrote:
> Also, I think you could set nextcallopt to 'now' rather than 'now + 1'.

There is a check in loadtimer() if next == now, and then the event timer is not started ??

	} else {
		new = getnextevent();
		eq = (new == *next);
		CTR4(KTR_SPARE2, "load at %d:next %d.%08x eq %d",
		    curcpu, (int)(new >> 32), (u_int)(new & 0x), eq);
		if (!eq) {
			*next = new;
			et_start(timer, new - now, 0);
		}
	}

--HPS
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
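As a reading aid for the loadtimer() fragment quoted above, here is a minimal user-space sketch of the one-shot branch. It is a simplified model, not the kernel code: `struct fake_timer` and `load_oneshot()` are hypothetical names, and the real et_start() call is replaced by a counter. As far as the quoted fragment shows, the comparison is between the newly computed event time and the previously programmed `*next`, not between `next` and `now`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t sbintime_t;	/* stand-in for the kernel's sbintime_t */

/* Hypothetical stand-in for the event timer; it only records arm requests. */
struct fake_timer {
	int		starts;		/* number of times the timer was armed */
	sbintime_t	last_delta;	/* interval passed to the last arm */
};

/*
 * Model of the quoted one-shot branch.  "new_event" plays the role of the
 * value returned by getnextevent(); "*next" is the event time that was
 * programmed last.  Returns true if the timer was (re)armed.
 */
static bool
load_oneshot(struct fake_timer *t, sbintime_t now, sbintime_t new_event,
    sbintime_t *next)
{
	bool eq;

	assert(t != NULL && next != NULL);
	eq = (new_event == *next);
	if (!eq) {
		/* The event time changed: remember it and re-arm the timer. */
		*next = new_event;
		t->starts++;
		t->last_delta = new_event - now;
	}
	return (!eq);
}
```

In this model the timer is skipped only when the requested event time equals the one already programmed; a request with a different time is armed with the interval `new_event - now`.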
ISO image: where is the CLANG compiler?
I ran into a very nasty situation where I need to save/restore/reinstall a recent CURRENT that crashed during installworld. While /usr/obj and /usr/src as well as /etc are intact (residing on a Samsung 850 PRO SSD with UFS and journaling), /boot/kernel vanished and most binaries in /bin and /sbin have zero size.

I tried to rescue the system using the most recent CURRENT image for USB drives found on the snapshot server. I booted it successfully and then mounted the failed filesystems in the proper places (/usr/obj and /usr/src onto the USB system's /usr/obj and /usr/src respectively, the rest under /mnt). I then tried to perform a make installworld with DESTDIR=/mnt set. But I fail: the minimalistic USB image does not have any of the CLANG/LLVM stuff required for the rescue!

Where the hell did this stuff go? Has it been ripped out due to the ancient 1 GB flash size?

Help is needed. I've already posted a message to CURRENT, but I guess I always hit the wrong subject line. It seems that the key to my salvation is a flash drive with a recent CURRENT containing a cc compiler - otherwise /usr/obj is useless.

Kind regards,
Oliver
Re: recent change to vim defaults?
On 17/01/2017 12:07 AM, ohauer wrote:
> I suspect you mean the /usr/local/etc/vim/vimrc and gvimrc files.
> That was the first place I've tried to overwrite it, but without luck
> (even with set mouse=) but it works in ~/.vimrc

what to put IN the file?

> --
> olli
>
> -- send with broken GMX mailer client, sorry for tofu and html scrap
>
> On 15/01/2017, 22:48 Benjamin Kaduk wrote:
> > On Mon, Jan 16, 2017 at 12:03:08AM +0800, Julian Elischer wrote:
> > > I noticed that suddenly vim is grabbing mouse movements, which makes
> > > life really hard.
> > >
> > > Was there a specific revision that brought in this change, and can it
> > > be removed?
> >
> > I remember seeing something go by during an upgrade somewhat recently
> > about there now being a defaults file that gets used when a user does
> > not specify a .vimrc. Unfortunately, I don't remember whether I saw that
> > notice on a FreeBSD machine or a Debian one, and haven't been able to
> > find the notice I remember through searching some likely places.
> >
> > Just to check: do you have a .vimrc file in place already?

not yet. when I work out what to put into it I will make it.

> > -Ben
Re: Strange issue after early AP startup
On Tuesday, January 17, 2017 05:08:58 PM Cy Schubert wrote:
> In message <1492450.xzfnz8z...@ralph.baldwin.cx>, John Baldwin writes:
> > On Tuesday, January 17, 2017 12:53:19 PM Cy Schubert wrote:
> > > In message , Hans Petter Selasky writes:
> > > > Hi,
> > > >
> > > > When booting I observe an additional 30-second delay after this print:
> > > >
> > > > > Timecounters tick every 1.000 msec
> > > >
> > > > ~30 second delay and boot continues like normal.
> > > >
> > > > Checking "vmstat -i" reveals that some timers have been running loose.
> > > >
> > > > > cpu0:timer 44300442
> > > > > cpu1:timer 40561404
> > > > > cpu3:timer 48462822 483058
> > > > > cpu2:timer 48477898 483209
> > > >
> > > > Trying to add delays and/or prints around the Timecounters printout
> > > > makes the issue go away. Any ideas for debugging?
> > > >
> > > > Looks like a startup race to me.
> > >
> > > just picking a random email to reply to, I'm seeing a different issue with
> > > early AP startup. It affects one of my four machines, my laptop. My three
> > > server systems downstairs have no problem however my laptop will reboot
> > > repeatedly at:
> > >
> > > Jan 17 11:55:16 slippy kernel: cd0: Attempt to query device size failed:
> > > NOT READY, Medium not present - tray closed
> >
> > So it panics and reboots after this?
>
> Yes, it goes into a panic/reboot loop for a few iterations until it
> successfully boots. Disabling early AP startup allows it to boot up without
> the assumed race.

Can you add DDB to the kernel config (and remove DDB_UNATTENDED) to get it to break into DDB when it panics to get the panic message (and a stack trace as well)?

--
John Baldwin
Re: Strange issue after early AP startup
On Tuesday, January 17, 2017 10:35:06 PM Hans Petter Selasky wrote:
> On 01/17/17 22:28, Hans Petter Selasky wrote:
> > +			state->nextcall = SBT_MAX;
> > +			state->nextcallopt = now + 1;
>
> BTW: What locks are protecting the update of these fields? Can they be
> written simultaneously by configtimer() and cpu_new_callout()?

Both functions do ET_HW_LOCK() of DPCPU_PTR(timerstate).

--
John Baldwin
Re: Strange issue after early AP startup
On Tuesday, January 17, 2017 10:28:47 PM Hans Petter Selasky wrote:
> On 01/17/17 20:46, Ian Lepore wrote:
> > > > Does this matter for the first tick? How often is configtimer() called?
> > >
> > > As I said, it is called at runtime when profclock is started / stopped, not
> > > just at boot. Those changes at runtime probably have existing callouts
> > > active and your change will not process any callouts until the next hardclock
> > > tick fires (but only because you are setting nextcallopt to the bogus
> > > 'next' value).
> >
> > On some platforms, configtimer() can be called quite often. Power
> > saving modes can change the frequency of the timer, and systems that
> > support such dynamic frequency scaling call configtimer()
> > (via cpu_et_frequency()) to handle the changes.
>
> Hi,
>
> I propose the following patch then:
>
> diff --git a/sys/kern/kern_clocksource.c b/sys/kern/kern_clocksource.c
> index 7f7769d..5ae925b 100644
> --- a/sys/kern/kern_clocksource.c
> +++ b/sys/kern/kern_clocksource.c
> @@ -511,8 +511,13 @@ configtimer(int start)
>  			state->nexthard = next;
>  			state->nextstat = next;
>  			state->nextprof = next;
> -			state->nextcall = next;
> -			state->nextcallopt = next;
> +			/*
> +			 * Force callout_process() to be called
> +			 * instantly, so that the correct value of
> +			 * "nextcall" can be computed:
> +			 */
> +			state->nextcall = SBT_MAX;
> +			state->nextcallopt = now + 1;
>  			hardclock_sync(cpu);
>  		}
>  		busy = 0;
>
> Then there is no problem having to wait for the next tick or anything,
> like John Baldwin pointed out.

Note that 'nextevent' remains a full 'timerperiod' out (now + timerperiod) and so the first clock interrupt is still 'timerperiod' time away and any callouts are delayed by that amount of time.

Also, I think you could set nextcallopt to 'now' rather than 'now + 1'. You might still want to adjust 'nextevent' to schedule the next interrupt to be sooner than 'timerperiod' though. You could just set 'nextevent' to 'now' in that case instead of 'next'.

--
John Baldwin
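The point about 'nextevent' can be illustrated with a small user-space model. This is a sketch under stated assumptions, not kernel code: `struct timer_model`, `model_configtimer()` and `first_callout_scan()` are hypothetical names, and the model assumes that callouts are only looked at when a timer interrupt fires at 'nextevent' and the current time has reached 'nextcallopt'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int64_t sbintime_t;	/* stand-in for the kernel's sbintime_t */
#define	SBT_MAX	INT64_MAX	/* stand-in for the kernel constant */

/* Hypothetical, reduced per-CPU timer state. */
struct timer_model {
	sbintime_t nextevent;	/* when the next timer interrupt fires */
	sbintime_t nextcall;	/* latest acceptable time of the next callout */
	sbintime_t nextcallopt;	/* optimal time of the next callout */
};

/*
 * Model of the tail of configtimer() with nextcallopt set to 'now'.
 * event_now selects the suggestion to schedule the interrupt at 'now'
 * instead of a full timerperiod later.
 */
static void
model_configtimer(struct timer_model *s, sbintime_t now,
    sbintime_t timerperiod, bool event_now)
{
	assert(timerperiod > 0);
	s->nextevent = event_now ? now : now + timerperiod;
	s->nextcall = SBT_MAX;
	s->nextcallopt = now;
}

/*
 * Earliest time the callout wheel is scanned in this model: an interrupt
 * has to fire first, and the time must have reached nextcallopt.
 */
static sbintime_t
first_callout_scan(const struct timer_model *s)
{
	return (s->nextevent > s->nextcallopt ? s->nextevent : s->nextcallopt);
}
```

With now = 1000 and timerperiod = 500, the model scans callouts at 1500 when 'nextevent' is left at 'next', and at 1000 when 'nextevent' is set to 'now'.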
Re: Strange issue after early AP startup
In message <1492450.xzfnz8z...@ralph.baldwin.cx>, John Baldwin writes:
> On Tuesday, January 17, 2017 12:53:19 PM Cy Schubert wrote:
> > In message , Hans Petter Selasky writes:
> > > Hi,
> > >
> > > When booting I observe an additional 30-second delay after this print:
> > >
> > > > Timecounters tick every 1.000 msec
> > >
> > > ~30 second delay and boot continues like normal.
> > >
> > > Checking "vmstat -i" reveals that some timers have been running loose.
> > >
> > > > cpu0:timer 44300442
> > > > cpu1:timer 40561404
> > > > cpu3:timer 48462822 483058
> > > > cpu2:timer 48477898 483209
> > >
> > > Trying to add delays and/or prints around the Timecounters printout
> > > makes the issue go away. Any ideas for debugging?
> > >
> > > Looks like a startup race to me.
> >
> > just picking a random email to reply to, I'm seeing a different issue with
> > early AP startup. It affects one of my four machines, my laptop. My three
> > server systems downstairs have no problem however my laptop will reboot
> > repeatedly at:
> >
> > Jan 17 11:55:16 slippy kernel: cd0: Attempt to query device size failed:
> > NOT READY, Medium not present - tray closed
>
> So it panics and reboots after this?

Yes, it goes into a panic/reboot loop for a few iterations until it successfully boots. Disabling early AP startup allows it to boot up without the assumed race.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: http://www.FreeBSD.org

	The need of the many outweighs the greed of the few.
Re: NFS 4.1
The vmware client will not work with the FreeBSD server at this time. It does a ReclaimComplete with the file system boolean set "true". This isn't supported by the FreeBSD server at this time. (vmware is the only client that does this, as far as I know.)

The fix is probably simple, but I don't have access to vmware and those that reported it haven't been able to give me the information I need.

If you are willing to test a couple of simple patches for the server in order to resolve this, email and I'll send them to you, rick

From: owner-freebsd-curr...@freebsd.org on behalf of Michael Ware
Sent: Tuesday, January 17, 2017 1:16:06 PM
To: Russell L. Carter
Cc: freebsd-current@freebsd.org
Subject: Re: NFS 4.1

Thanks for the reply Russell,
I'm looking to set up a 4.1 server in order to host vmware images. I have set up an exports file but I get an error stating NFS 4 is not supported when trying to attach it in VM storage. Is there any documentation for setting this up?
Thanks
Mike

On Tue, Jan 17, 2017 at 10:02 AM, Russell L. Carter wrote:
> On 01/17/17 10:38, Michael Ware wrote:
>
> > Good day,
> > Does anyone know if NFS 4.1 (not 4.0) is available in FreeBSD 11? I have
> > not been able to find any documentation around this.
> > Thanks
>
> Yes, though I'm not sure what specific feature you're looking for.
> FreeBSD interoperates with my linux NFS 4.1 servers and clients
> just fine.
>
> man nfsv4
>
> $ cat ~/bin/knuth-mount
> #! /bin/sh
>
> # man mount_nfs
> MOUNT="mount_nfs -o nfsv4,minorversion=1"
> #MOUNT="mount_nfs -o nfsv3"
>
> NFS_SERVER_HOST=knuth
> $MOUNT $NFS_SERVER_HOST:/export/packages /mnt/$NFS_SERVER_HOST/packages
> $MOUNT $NFS_SERVER_HOST:/usr/src /usr/src
> $MOUNT $NFS_SERVER_HOST:/usr/obj /usr/obj
>
> HTH,
> Russell

--
Michael Ware
UCSC Baskin Engineering
Unix, Network and Security
406-210-4725
Re: Strange issue after early AP startup
On Tuesday, January 17, 2017 12:53:19 PM Cy Schubert wrote:
> In message , Hans Petter Selasky writes:
> > Hi,
> >
> > When booting I observe an additional 30-second delay after this print:
> >
> > > Timecounters tick every 1.000 msec
> >
> > ~30 second delay and boot continues like normal.
> >
> > Checking "vmstat -i" reveals that some timers have been running loose.
> >
> > > cpu0:timer 44300442
> > > cpu1:timer 40561404
> > > cpu3:timer 48462822 483058
> > > cpu2:timer 48477898 483209
> >
> > Trying to add delays and/or prints around the Timecounters printout
> > makes the issue go away. Any ideas for debugging?
> >
> > Looks like a startup race to me.
>
> just picking a random email to reply to, I'm seeing a different issue with
> early AP startup. It affects one of my four machines, my laptop. My three
> server systems downstairs have no problem however my laptop will reboot
> repeatedly at:
>
> Jan 17 11:55:16 slippy kernel: cd0: Attempt to query device size failed:
> NOT READY, Medium not present - tray closed

So it panics and reboots after this?

--
John Baldwin
Re: Strange issue after early AP startup
On Tuesday, January 17, 2017 08:31:28 PM Hans Petter Selasky wrote:
> Hi,
>
> On 01/17/17 20:00, John Baldwin wrote:
> > > Does this matter for the first tick? How often is configtimer() called?
> >
> > As I said, it is called at runtime when profclock is started / stopped, not
> > just at boot. Those changes at runtime probably have existing callouts
> > active and your change will not process any callouts until the next hardclock
> > tick fires (but only because you are setting nextcallopt to the bogus
> > 'next' value).
> >
> > > > (One odd thing is that even in your case the first call to handleevents(),
> > > > the 'now => state->nextcallout' check in handleevents() should be true
> > > > which resets both nextcall and nextcallopt and invokes callout_process().)
> > >
> > > Let me take you through the failure path, by code inspection:
> >
> > I would really appreciate it if you could add traces to find out what actually
> > happens rather than what seems to happen by looking at the code. :-/
>
> The problem is that once you add some prints, the problem goes away.
> Maybe I should try to set hz to 100 or 25 ???

Maybe use KTR instead and then have a kdb_enter() later in boot so you can use 'show ktr' in DDB. (KTR has less overhead than printfs so might not disrupt the timing as badly.)

> > 0) cpu_initclocks_bsp() is called and init's nextcall and nextcallopt to SBT_MAX
> >    similar to your change. If no callout is scheduled before configtimer()
> >    then they remain set to SBT_MAX. Your current patch happens to trigger a
> >    (bogus) call to callout_process() on the first hardclock() because it
> >    sets nextcallopt to 'next' even though no callout is actually scheduled to
> >    fire at time 'next'.
> >
> > > 1) configtimer() is called and we init nextcall and nextcallopt:
> > >
> > > > next = now + timerperiod;
> > > ...
> > > > state->nextcall = next;
> > > > state->nextcallopt = next;
> >
> > These both say "the next callout() should fire at 'next' which is the time of
> > the next hardclock()", even though there may be no callouts scheduled (in which
> > case both of these fields should be set to SBT_MAX from the call to
> > cpu_initclocks_bsp(), or there may be callouts scheduled in which case 'nextcall'
> > and 'nextcallopt' will reflect the time that those callouts are already
> > scheduled for and this overwrites that).
>
> I see there are some callouts scheduled by SYSINITs, before the first
> configtimer(), like NFS_TIMERINIT in nfs_init(). These are setup using
> "dummy_timecounter" which means any nextcall values before the first
> configtimer should be discarded.

Hmm, I actually tested early callouts by having callouts scheduled for 1, 4, and 8 seconds right after callouts were initialized and they worked correctly (albeit using lapic timer as the eventtimer and TSC as the timecounter). By the time they were called though, sbinuptime() was not returning dummy values, but real ones (probably because TSC gets added as a timecounter in SI_SUB_CPU). The patch I used for testing is still in my work branch here:

https://github.com/freebsd/freebsd/compare/master...bsdjhb:early_callout

So in at least some cases you don't have to discard nextcall during boot. However, if TSC isn't available you might not get a timecounter until later in boot during device probe in which case you would get dummy timecounter, and then the nextcall/nextcallopt values aren't great.

Hmmm, I wonder if just bumping nextcall to be 'now' in case it is less than 'now' would be sufficient. I think my previous patch still looped even though it might have set 'next_event' correctly because the 'nextcall' value was still too small.
That is:

Index: kern_clocksource.c
===
--- kern_clocksource.c	(revision 312301)
+++ kern_clocksource.c	(working copy)
@@ -498,12 +498,18 @@ configtimer(int start)
 		CPU_FOREACH(cpu) {
 			state = DPCPU_ID_PTR(cpu, timerstate);
 			state->now = now;
+			printf("%s: CPU %d: now %jd nextcall %jd nextcallopt %jd next %jd\n", __func__, cpu, (intmax_t)now, (intmax_t)state->nextcall, (intmax_t)state->nextcallopt, (intmax_t)next);
+			if (state->nextcall < now)
+				state->nextcall = now;
 #ifndef EARLY_AP_STARTUP
 			if (!smp_started && cpu != CPU_FIRST())
 				state->nextevent = SBT_MAX;
 			else
 #endif
+			if (next < state->nextcall)
 				state->nextevent = next;
+			else
+				state->nextevent = state->nextcall;
 			if (periodic)
 				state->nexttick = next;
 			else
@@ -511,8 +517,6 @@ configtimer(
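The idea behind the hunk above (bump a stale 'nextcall' up to 'now', then schedule the interrupt for whichever of 'next' and 'nextcall' comes first) can be sketched in isolation. This is a minimal user-space model with a hypothetical helper name, `pick_nextevent()`, not the patch itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t sbintime_t;	/* stand-in for the kernel's sbintime_t */
#define	SBT_MAX	INT64_MAX	/* stand-in for the kernel constant */

/*
 * Clamp a stale nextcall (for example one computed while only the dummy
 * timecounter was available) to 'now', then return the earlier of the
 * next hardclock time and the next callout time.
 */
static sbintime_t
pick_nextevent(sbintime_t now, sbintime_t next, sbintime_t *nextcall)
{
	assert(nextcall != NULL);
	if (*nextcall < now)
		*nextcall = now;
	return (next < *nextcall ? next : *nextcall);
}
```

In this model a 'nextcall' that lies in the past yields an event at 'now', a 'nextcall' of SBT_MAX (no callout pending) yields the regular 'next', and a pending callout before 'next' pulls the event forward to that callout's time.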
Help! Howto installworld crashed system with USB image?
Within the past several hours, FreeBSD CURRENT crashed due to serious bugs, and some of our boxes hang with the incomplete EARLY_AP_STARTUP workaround.

During a recompilation and installworld/installkernel, one of my workstations suddenly crashed and spontaneously rebooted. After that, the loader complained about "not kernel" and left me alone at the "OK " prompt of the bootloader.

After a quick investigation I realized that, except for /bin/sh, the binaries on the SSD are zero-sized (Samsung 850 PRO; the crashed kernel had the NANDFS option enabled as well as device nandfs, if this is of interest, but I doubt it).

Since the rest of the SSD is so far intact, including /usr/src and /usr/obj, with only the binaries and (probably, not confirmed) libraries corrupt, I tried a rescue using the most recent 12-CURRENT FreeBSD USB image:

FreeBSD-12.0-CURRENT-amd64-20170105-r311461-memstick.img

But I'm lost here! For convenience I mounted the SSD's usr/obj and usr/src onto /usr/obj and /usr/src of the USB-booted filesystem. Everything else on the SSD is mounted under /mnt.

I thought I could simply "bootstrap" an installworld with the toolchain resident in /usr/obj, but I fail in a painful way.

cd /usr/src; make DESTDIR=/mnt installworld installkernel

bails out with some mysterious error telling me to set COMPILER_TYPE=, so I set this variable to cc. The result: I figured out that the USB image is one of the useless minimalistic ones with no compiler aboard. Fine. No rescue, no cc, no nothing.

I desperately need some advice on how I can perform installworld and installkernel. I have a customized /etc/src.conf and /etc/make.conf, so I guess I have to set ETCDIR=/mnt/etc also. Since I use a different name for my kernel (not GENERIC), I guess I also need to set KERNCONF and KERNEL, although KERNCONF is questionable, since I already have a kernel ready to install. But how can I make the installation procedure use everything from /usr/obj, including the compiler?

Something has changed for the worse in FreeBSD! I remember that I had a similar situation a while ago last year on 10- or 11-CURRENT, where a crash destroyed libraries and I was able to rescue the system via the USB image and installworld. Either some great mind erased the necessary compiler from the (too) minimalistic image, or something new has been introduced to perform a rescue/standalone-bootstrap installation.

Either way, I would be really happy if someone could give me a hint on how to rescue the broken system.

Thanks in advance,
Oliver

p.s. I've already written another mail to the list with a less clear subject. I hope this subject makes it clearer, and now that the anger has gone away, I think I can express the situation more clearly.
Re: Strange issue after early AP startup
On 01/17/17 22:28, Hans Petter Selasky wrote:
> +			state->nextcall = SBT_MAX;
> +			state->nextcallopt = now + 1;

BTW: What locks are protecting the update of these fields? Can they be written simultaneously by configtimer() and cpu_new_callout()?

--HPS
Re: Strange issue after early AP startup
On 01/17/17 20:46, Ian Lepore wrote:
> > > Does this matter for the first tick? How often is configtimer() called?
> >
> > As I said, it is called at runtime when profclock is started / stopped, not
> > just at boot. Those changes at runtime probably have existing callouts
> > active and your change will not process any callouts until the next hardclock
> > tick fires (but only because you are setting nextcallopt to the bogus
> > 'next' value).
>
> On some platforms, configtimer() can be called quite often. Power
> saving modes can change the frequency of the timer, and systems that
> support such dynamic frequency scaling call configtimer()
> (via cpu_et_frequency()) to handle the changes.

Hi,

I propose the following patch then:

diff --git a/sys/kern/kern_clocksource.c b/sys/kern/kern_clocksource.c
index 7f7769d..5ae925b 100644
--- a/sys/kern/kern_clocksource.c
+++ b/sys/kern/kern_clocksource.c
@@ -511,8 +511,13 @@ configtimer(int start)
 			state->nexthard = next;
 			state->nextstat = next;
 			state->nextprof = next;
-			state->nextcall = next;
-			state->nextcallopt = next;
+			/*
+			 * Force callout_process() to be called
+			 * instantly, so that the correct value of
+			 * "nextcall" can be computed:
+			 */
+			state->nextcall = SBT_MAX;
+			state->nextcallopt = now + 1;
 			hardclock_sync(cpu);
 		}
 		busy = 0;

Then there is no problem having to wait for the next tick or anything, like John Baldwin pointed out.

--HPS
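To make the intent of the patch above easier to follow, here is a rough user-space model of how the two fields interact with the callout check described elsewhere in this thread (once 'now' reaches 'nextcallopt', both fields are reset to SBT_MAX and callout_process() runs). The names `struct callout_model`, `model_patch()` and `model_handleevents()` are hypothetical, and the check is a simplification of handleevents(), not a copy of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int64_t sbintime_t;	/* stand-in for the kernel's sbintime_t */
#define	SBT_MAX	INT64_MAX	/* stand-in for the kernel constant */

/* Hypothetical, reduced per-CPU state. */
struct callout_model {
	sbintime_t nextcall;
	sbintime_t nextcallopt;
	int	   scans;	/* stands in for calls to callout_process() */
};

/* What the proposed configtimer() change does to the two fields. */
static void
model_patch(struct callout_model *s, sbintime_t now)
{
	assert(now < SBT_MAX);	/* keep 'now + 1' from overflowing */
	s->nextcall = SBT_MAX;
	s->nextcallopt = now + 1;
}

/*
 * Simplified callout part of handleevents(): once 'now' reaches
 * nextcallopt, reset both fields and scan the callouts.  Returns true
 * if a scan happened.
 */
static bool
model_handleevents(struct callout_model *s, sbintime_t now)
{
	if (now >= s->nextcallopt) {
		s->nextcall = SBT_MAX;
		s->nextcallopt = SBT_MAX;
		s->scans++;
		return (true);
	}
	return (false);
}
```

In this model the first check made at any time later than the configtimer() call triggers exactly one scan, after which both fields stay at SBT_MAX until something else (in the kernel, cpu_new_callout()) lowers them again.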
Re: Strange issue after early AP startup
In message , Hans Petter Selasky writes:
> Hi,
>
> When booting I observe an additional 30-second delay after this print:
>
> > Timecounters tick every 1.000 msec
>
> ~30 second delay and boot continues like normal.
>
> Checking "vmstat -i" reveals that some timers have been running loose.
>
> > cpu0:timer 44300442
> > cpu1:timer 40561404
> > cpu3:timer 48462822 483058
> > cpu2:timer 48477898 483209
>
> Trying to add delays and/or prints around the Timecounters printout
> makes the issue go away. Any ideas for debugging?
>
> Looks like a startup race to me.

just picking a random email to reply to, I'm seeing a different issue with early AP startup. It affects one of my four machines, my laptop. My three server systems downstairs have no problem however my laptop will reboot repeatedly at:

Jan 17 11:55:16 slippy kernel: cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed

After a number of reboots (0-N), it finally boots. Disabling early AP startup allows it to boot past that point the first time.

--
Cheers,
Cy Schubert
FreeBSD UNIX: Web: http://www.FreeBSD.org

	The need of the many outweighs the greed of the few.
r312349 crashes: null filesizes: rescue with usb stick impossible due to lack of CC
FreeBSD crashes (r312349): during installworld and installkernel, the running system crashed and left me with a nullified Samsung 850 PRO SSD, meaning that almost every binary in /bin, /sbin, /usr/bin and /usr/sbin has a file size of zero, except /bin/sh. Booting is impossible.

Since it is unknown what triggered the crash (I had just tried to keep my box from crashing by adding this faulty EARLY_AP_STARTUP option to my customized kernel), I intend to rescue the system by installing the binaries again via make installworld, using a USB image from the FreeBSD site:

FreeBSD-12.0-CURRENT-amd64-20170105-r311461-memstick.img

I boot this USB image and mount all of the filesystems of the truncated system's SSD under /mnt (/mnt/usr, /mnt/usr/local etc.), except /usr/src and /usr/obj, which I mount onto the USB system's /usr/src and /usr/obj. Then I try to perform the installworld again via

cd /usr/src
make DESTDIR=/mnt installkernel installworld

I fail: it complains about the lack of a compiler and forces me to set COMPILER_TYPE, but this is useless, since this minimalistic USB image (and the CD image) obviously does not contain any compiler.

So, I'm stranded!

I found several websites on which exactly such a rescue procedure is explained, and I remember that I rescued an 11-CURRENT system the same way last year. At this moment, I cannot fathom what kind of mind is behind the reduction of the images and the removal of essential parts. How am I supposed to rescue a system the way I am trying to? I have an intact /usr/obj and /usr/src, and I have this crap USB image which is obviously incapable of performing this kind of rescue.

The system in question does not have a DVD or CD drive; it is USB only.

Any help appreciated. Please point me to the proper webpage on the FreeBSD site that explains how to rescue a system the way I intend to (if there is one). I couldn't find any suitable notices.

Many thanks in advance,
Oliver
Re: r312348: igb broken: reporting wrong linkspeed!
> On Jan 17, 2017, at 11:54, Hartmann, O. wrote:
>
> 12-CURRENT (FreeBSD 12.0-CURRENT #74 r312348: Tue Jan 17 19:54:58 CET
> 2017 amd64) reports the wrong linkspeed on a dualport Intel i350 NIC:
>
> igb0: flags=8843 metric 0 mtu 1500
>         options=653dbb
>         ether xx:xx:xx:xx:xx:xx
>         inet 192.168.0.111 netmask 0xff00 broadcast 192.168.0.255
>         nd6 options=29
>         media: Ethernet autoselect (100baseTX )
>         status: active
>
> The switch the NIC is connected to reports 1 GBit. I checked with two
> switches, FreeBSD reports bullshit on that subject.
>
> I also realised severe problems of this Intel i350 dual NIC cards with
> FreeBSD (we use this NIC type as a standard and so we have plenty, all
> with the same issue). When the NIC negotiates its linkspeed, it very
> often falls back to 100 MBit. This behaviour is not predictable, but it
> occurs with a SoHo smart managed Netgear GS110TBv2 and some of our
> Cisco Catalyst switches at work (some 35XX and 29XX, I do not know the
> exact type).

Hi,

One of the workarounds for igb wasn't ported to the new driver--I remember an issue like this being solved sometime in the 2015-2016 timeframe (I'm leaning towards 2016).

Thanks,
-Ngie
r312348: igb broken: reporting wrong linkspeed!
12-CURRENT (FreeBSD 12.0-CURRENT #74 r312348: Tue Jan 17 19:54:58 CET 2017 amd64) reports the wrong linkspeed on a dual-port Intel i350 NIC:

igb0: flags=8843 metric 0 mtu 1500
        options=653dbb
        ether xx:xx:xx:xx:xx:xx
        inet 192.168.0.111 netmask 0xff00 broadcast 192.168.0.255
        nd6 options=29
        media: Ethernet autoselect (100baseTX )
        status: active

The switch the NIC is connected to reports 1 GBit. I checked with two switches; FreeBSD reports bullshit on that subject.

I have also noticed severe problems with these Intel i350 dual-port NICs under FreeBSD (we use this NIC type as a standard and so we have plenty, all with the same issue). When the NIC negotiates its linkspeed, it very often falls back to 100 MBit. This behaviour is not predictable, but it occurs with a SoHo smart managed Netgear GS110TBv2 and some of our Cisco Catalyst switches at work (some 35XX and 29XX, I do not know the exact type).

Regards,
oh
Re: Strange issue after early AP startup
On Tue, 2017-01-17 at 11:00 -0800, John Baldwin wrote:
> > > You could do that by setting it to 'cc_firstevent' of the associated CPU,
> > > but in practice 'state->nextcall' should already be set to that (it is
> > > initialized to SBT_MAX in cpu_initclocks_bsp() and is then only set to
> > > other values due to cpu_new_callout()). Keep in mind that configtimer()
> > > is not just called from boot, but is also invoked when starting/stopping
> > > the profiling timer.
> > >
> > > However, when setting 'nextevent' (which is used to schedule the next timer
> > > interrupt), we should be honoring the existing 'nextcall' if it is sooner
> > > than the next hardclock.
> >
> > Does this matter for the first tick? How often is configtimer() called?
>
> As I said, it is called at runtime when profclock is started / stopped, not
> just at boot. Those changes at runtime probably have existing callouts
> active and your change will not process any callouts until the next hardclock
> tick fires (but only because you are setting nextcallopt to the bogus
> 'next' value).

On some platforms, configtimer() can be called quite often. Power saving modes can change the frequency of the timer, and systems that support such dynamic frequency scaling call configtimer() (via cpu_et_frequency()) to handle the changes.

-- Ian
Re: Strange issue after early AP startup
Hi,

On 01/17/17 20:00, John Baldwin wrote:
> > Does this matter for the first tick? How often is configtimer() called?
>
> As I said, it is called at runtime when profclock is started / stopped, not
> just at boot. Those changes at runtime probably have existing callouts
> active and your change will not process any callouts until the next hardclock
> tick fires (but only because you are setting nextcallopt to the bogus
> 'next' value).
>
> > > (One odd thing is that even in your case the first call to handleevents(),
> > > the 'now => state->nextcallout' check in handleevents() should be true
> > > which resets both nextcall and nextcallopt and invokes callout_process().)
> >
> > Let me take you through the failure path, by code inspection:
>
> I would really appreciate it if you could add traces to find out what actually
> happens rather than what seems to happen by looking at the code. :-/

The problem is that once you add some prints, the problem goes away. Maybe I should try to set hz to 100 or 25 ???

> 0) cpu_initclocks_bsp() is called and init's nextcall and nextcallopt to SBT_MAX
>    similar to your change. If no callout is scheduled before configtimer()
>    then they remain set to SBT_MAX. Your current patch happens to trigger a
>    (bogus) call to callout_process() on the first hardclock() because it
>    sets nextcallopt to 'next' even though no callout is actually scheduled to
>    fire at time 'next'.
>
> > 1) configtimer() is called and we init nextcall and nextcallopt:
> >
> > > next = now + timerperiod;
> > ...
> > > state->nextcall = next;
> > > state->nextcallopt = next;
>
> These both say "the next callout() should fire at 'next' which is the time of
> the next hardclock()", even though there may be no callouts scheduled (in which
> case both of these fields should be set to SBT_MAX from the call to
> cpu_initclocks_bsp(), or there may be callouts scheduled in which case 'nextcall'
> and 'nextcallopt' will reflect the time that those callouts are already
> scheduled for and this overwrites that).

I see there are some callouts scheduled by SYSINITs, before the first configtimer(), like NFS_TIMERINIT in nfs_init(). These are setup using "dummy_timecounter" which means any nextcall values before the first configtimer should be discarded.

> > 2) Any callout_reset() calls cpu_new_callout():
> >
> > > */
> > > state->nextcallopt = bt_opt;
> > > if (bt >= state->nextcall)
> >
> > We follow this path, because "bt" is surely based on sbinuptime() and is
> > greater or equal to state->nextcall. Note that state->nextcallopt is
> > updated to bt_opt, which is in the future.
>
> Note, my patch should _leave_ nextcall at SBT_MAX (from cpu_initclocks_bsp())
> unless there was already an earlier call to callout_reset().

Yes, there are calls to callout_reset(). See for example NFS_TIMERINIT, like mentioned above.

> IOW, it should be a NOP for the purposes of this branch compared with your
> change. (You could add a warning to print out if 'nextcall' != SBT_MAX
> during boot and see if it fires for example.)
>
> > > goto done;
> > > state->nextcall = bt;
> >
> > 3) getnextcpuevent(0) is called by the fast timercb() to setup the next
> > event:
> >
> > > state = DPCPU_PTR(timerstate);
> > > /* Handle hardclock() events, skipping some if CPU is idle. */
> > > event = state->nexthard;
> > ...
> > > /* Handle callout events. */
> > > if (event > state->nextcall)
> >
> > We then go looping into this path, because state->nextcall is still equal
> > to "next" as in step 1) which is now in the past, until
> > "now >= state->nextcallopt" inside handleevents(), which clears this
> > condition.
> >
> > > event = state->nextcall;
> > ...
> > > return (event);
>
> I'm curious if there is a callout_reset() that has set 'nextcall' to a time
> that is effectively before 'now'.
Maybe add a printf like this: Index: kern_clocksource.c === --- kern_clocksource.c (revision 312301) +++ kern_clocksource.c (working copy) @@ -498,12 +498,18 @@ configtimer(int start) CPU_FOREACH(cpu) { state = DPCPU_ID_PTR(cpu, timerstate); state->now = now; + printf("%s: CPU %d: now %jd nextcall %jd nextcallopt %jd next %jd\n", __func__, cpu, state->nextcall, state->nextcall, next); #ifndef EARLY_AP_STARTUP if (!smp_started && cpu != CPU_FIRST()) state->nextevent = SBT_MAX; else #endif In particular what I am worried about with your patch is that for post-boot calls to configtimer() you will delay any previously-scheduled callouts until the next hardclock. I understand. Would a solution be to refactor callout_process(), to accept the PCPU_GET(CPUID) as an argument and be executed for all CPUs by configtimer(), instead of trying to guess state->nextcall and state->nextcallopt in configtimer() ?
Re: Strange issue after early AP startup
On Tuesday, January 17, 2017 07:04:15 PM Hans Petter Selasky wrote: > On 01/17/17 16:50, John Baldwin wrote: > > On Monday, January 16, 2017 10:10:16 PM Hans Petter Selasky wrote: > >> On 01/16/17 20:31, John Baldwin wrote: > >>> On Monday, January 16, 2017 04:51:42 PM Hans Petter Selasky wrote: > Hi, > > When booting I observe an additional 30-second delay after this print: > > > Timecounters tick every 1.000 msec > > ~30 second delay and boot continues like normal. > > Checking "vmstat -i" reveals that some timers have been running loose. > > > cpu0:timer 44300442 > > cpu1:timer 40561404 > > cpu3:timer 48462822 483058 > > cpu2:timer 48477898 483209 > > Trying to add delays and/or prints around the Timecounters printout > makes the issue go away. Any ideas for debugging? > >>> > >>> I have generally used KTR tracing to trace what is happening during > >>> boot to debug EARLY_AP_STARTUP issues. > >>> > >> > >> Hi John, > >> > >> What happens is that getnextcpuevent(0) keeps on returning > >> "state->nextcall" which is in the past for CPU #2 and #3 on my box. > >> > >> In "cpu_new_callout()" there is a check if "bt >= state->nextcall", > >> which I suspect is true, so "state->nextcall" never gets set to real > >> minimum sbintime. > >> > >> The attached patch fixes the problem for me, but I'm not 100% sure > if it > >> is correct. > > > > Hi, > > > I think we want to be honoring any currently scheduled callouts. > > The problem here is that we might be changing the clocksource, then > sbinuptime() will change too, so I think the value should be reset by > configtimer() and then corrected at the next call to callout_process(). > > > You could > > do that by setting it to 'cc_firstevent' of the associated CPU, but in > > practice 'state->nextcall' should already be set to that (it is > initalized > > to SBT_MAX in cpu_initclocks_bsp() and is then only set to other > values due > > to cpu_new_callout()). 
Keep in mind that configtimer() is not just > called > > from boot, but is also invoked when starting/stopping the profiling > timer. > > > > > However, when setting 'nextevent' (which is used to schedule the next > timer > > interrupt), we should be honoring the existing 'nextcall' if it is sooner > > than the next hardclock. > > Does this matter for the first tick? How often is configtimer() called? As I said, it is called at runtime when profclock is started / stopped, not just at boot. Those changes at runtime probably have existing callouts active and your change will not process any callouts until the next hardclock tick fires (but only because you are setting nextcallopt to the bogus 'next' value). > > (One odd thing is that even in your case the first call to > handleevents(), > > the 'now => state->nextcallout' check in handleevents() should be true > > which resets both nextcall and nextcallopt and invokes > callout_process().) > > Let me take you through the failure path, by code inspection: I would really appreciate it if you could add traces to find out what actually happens rather than what seems to happen by looking at the code. :-/ 0) cpu_initclocks_bsp() is called and init's nextcall and nexcallopt to SBT_MAX similar to your change. If no callout is scheduled before configtimer() then they remain set to SBT_MAX. Your current patch happens to trigger a (bogus) call to callout_process() on the first hardclock() because it sets nextcallopt to 'next' even though no callout is actually scheduled to fire at time 'next'. > 1) configtimer() is called and we init nextcall and nextcallopt: > > > next = now + timerperiod; > ... 
> > state->nextcall = next; > > state->nextcallopt = next; These both say "the next callout() should fire at 'next' which is the time of the next hardclock()", even though there may be no callouts scheduled (in which case both of these fields should be set to SBT_MAX from the call to cpu_initclocks_bsp(), or there may be callouts scheduled in which case 'nextcall' and 'nextcallopt' will reflect the time that those callouts are already scheduled for and this overwrites that). > 2) Any callout_reset() calls cpu_new_callout(): > > > */ > > state->nextcallopt = bt_opt; > > if (bt >= state->nextcall) > We follow this path, because "bt" is surely based on sbinuptime() and is > greater or equal to state->nextcall. Note that state->nextcallopt is > updated to bt_opt, which is in the future. Note, my patch should _leave_ nextcall at SBT_MAX (from cpu_initclocks_bsp()) unless there was already an earlier call to callout_reset(). IOW, it should be a NOP for the purposes of this branch compared with your change. (
Re: NFS 4.1
Thanks for the reply Russell,
I'm looking to set up a 4.1 server in order to host VMware images. I have set
up an exports file, but I get an error stating NFS 4 is not supported when
trying to attach it in VM storage. Is there any documentation for setting
this up?
Thanks
Mike

On Tue, Jan 17, 2017 at 10:02 AM, Russell L. Carter wrote:
> On 01/17/17 10:38, Michael Ware wrote:
>
>> Good day,
>> Does anyone know if NFS 4.1 (not 4.0) is available in FreeBSD 11? I have
>> not been able to find any documentation around this.
>> Thanks
>>
>
> Yes, though I'm not sure what specific feature you're looking for.
> FreeBSD interoperates with my linux NFS 4.1 servers and clients
> just fine.
>
> man nfsv4
>
> $ cat ~/bin/knuth-mount
> #! /bin/sh
>
> # man mount_nfs
> MOUNT="mount_nfs -o nfsv4,minorversion=1"
> #MOUNT="mount_nfs -o nfsv3"
>
> NFS_SERVER_HOST=knuth
> $MOUNT $NFS_SERVER_HOST:/export/packages /mnt/$NFS_SERVER_HOST/packages
> $MOUNT $NFS_SERVER_HOST:/usr/src /usr/src
> $MOUNT $NFS_SERVER_HOST:/usr/obj /usr/obj
>
> HTH,
> Russell

--
Michael Ware
UCSC Baskin Engineering
Unix, Network and Security
406-210-4725
Re: TSC as timecounter makes system lag
On Tue, Jan 17, 2017 at 10:05 PM, Hans Petter Selasky wrote: > I've seen something similar. Does the attached patch make any difference? > > Can you dump: > > vmstat -i > > Just after boot w/ and w/o the attached patch, when the keystroke did not > repeat smoothly. > > Your patch fixes this issue. It is now working as expected. vmstat output attached. Running w/ kernel r312210. Thanks, -Jia-Shiun. interrupt total rate ???0 0 irq1: atkbd0 2 0 stray irq1 0 0 irq0: attimer0 0 0 stray irq0 0 0 irq3: 0 0 stray irq3 0 0 irq4: uart00 0 stray irq4 0 0 irq5: 0 0 stray irq5 0 0 irq6: 0 0 stray irq6 0 0 irq7: 0 0 stray irq7 0 0 irq8: atrtc0 0 0 stray irq8 0 0 irq9: acpi00 0 stray irq9 0 0 irq10: 0 0 stray irq100 0 irq11: 0 0 stray irq110 0 irq12: 0 0 stray irq120 0 irq13: 0 0 stray irq130 0 irq14: 0 0 stray irq140 0 irq15: 0 0 stray irq150 0 irq16: em0:irq0++ 16 0 stray irq160 0 irq17: 0 0 stray irq170 0 irq18: uhci2 ehci0+ 18 0 stray irq180 0 irq19: uhci4 0 0 stray irq190 0 irq20: hpet0 28522442 stray irq200 0 irq21: uhci1 0 0 stray irq210 0 irq22: 0 0 stray irq220 0 irq23: uhci3 ehci1 0 0 stray irq230 0 cpu0:timer 0 0 irq256: hdac0110 2 stray irq256 0 0 irq257: pcib1 0 0 stray irq257 0 0 irq258: pcib2 0 0 stray irq258 0 0 irq259: pcib3 0 0 stray irq259 0 0 irq260: re0 6587102 stray irq260 0 0 irq261: ahci0:ch0 4062 63 stray irq261 0 0 irq262: ahci0:ch1 0 0 stray irq262 0 0 irq263: ahci0:ch2 0 0 stray irq263 0 0 irq264: ahci0:ch3 0 0 stray irq264 0 0 irq265: ahci0:ch4 0 0 stray irq265 0 0 irq266: ahci0:ch5 0 0 stray irq266 0 0 irq267: ahci0:60 0 stray irq267 0 0 irq268: ahci0:70 0 stray irq268 0 0 irq269: ahci0:80 0 stray irq269 0 0 irq270: ahci0:90 0 stray irq270 0 0 irq271: ahci0:10 0 0 stray irq271 0 0 irq272: ahci0:11 0 0 stray irq272 0 0 irq273: ahci0:12 0 0 stray irq273 0 0 irq274: ahci0:13 0 0 stray irq274 0 0 irq275: ahci0:14 0 0 stray irq275
Re: Strange issue after early AP startup
On 01/17/17 16:50, John Baldwin wrote:
> On Monday, January 16, 2017 10:10:16 PM Hans Petter Selasky wrote:
>> On 01/16/17 20:31, John Baldwin wrote:
>>> On Monday, January 16, 2017 04:51:42 PM Hans Petter Selasky wrote:
>>>> Hi,
>>>>
>>>> When booting I observe an additional 30-second delay after this print:
>>>>
>>>>> Timecounters tick every 1.000 msec
>>>>
>>>> ~30 second delay and boot continues like normal.
>>>>
>>>> Checking "vmstat -i" reveals that some timers have been running loose.
>>>>
>>>>> cpu0:timer 44300442
>>>>> cpu1:timer 40561404
>>>>> cpu3:timer 48462822 483058
>>>>> cpu2:timer 48477898 483209
>>>>
>>>> Trying to add delays and/or prints around the Timecounters printout
>>>> makes the issue go away. Any ideas for debugging?
>>>
>>> I have generally used KTR tracing to trace what is happening during
>>> boot to debug EARLY_AP_STARTUP issues.
>>>
>>
>> Hi John,
>>
>> What happens is that getnextcpuevent(0) keeps on returning
>> "state->nextcall" which is in the past for CPU #2 and #3 on my box.
>>
>> In "cpu_new_callout()" there is a check if "bt >= state->nextcall",
>> which I suspect is true, so "state->nextcall" never gets set to real
>> minimum sbintime.
>>
>> The attached patch fixes the problem for me, but I'm not 100% sure if it
>> is correct.
>

Hi,

> I think we want to be honoring any currently scheduled callouts.

The problem here is that we might be changing the clocksource, in which case
sbinuptime() will change too, so I think the value should be reset by
configtimer() and then corrected at the next call to callout_process().

> You could
> do that by setting it to 'cc_firstevent' of the associated CPU, but in
> practice 'state->nextcall' should already be set to that (it is initialized
> to SBT_MAX in cpu_initclocks_bsp() and is then only set to other values due
> to cpu_new_callout()). Keep in mind that configtimer() is not just called
> from boot, but is also invoked when starting/stopping the profiling timer.
>
> However, when setting 'nextevent' (which is used to schedule the next timer
> interrupt), we should be honoring the existing 'nextcall' if it is sooner
> than the next hardclock.

Does this matter for the first tick? How often is configtimer() called?

> (One odd thing is that even in your case the first call to handleevents(),
> the 'now >= state->nextcallout' check in handleevents() should be true
> which resets both nextcall and nextcallopt and invokes callout_process().)

Let me take you through the failure path, by code inspection:

1) configtimer() is called and we init nextcall and nextcallopt:

> next = now + timerperiod;
...
> state->nextcall = next;
> state->nextcallopt = next;

2) Any callout_reset() calls cpu_new_callout():

> */
> state->nextcallopt = bt_opt;
> if (bt >= state->nextcall)

We follow this path, because "bt" is surely based on sbinuptime() and is
greater than or equal to state->nextcall. Note that state->nextcallopt is
updated to bt_opt, which is in the future.

> goto done;
> state->nextcall = bt;

3) getnextcpuevent(0) is called by the fast timercb() to set up the next event:

> state = DPCPU_PTR(timerstate);
> /* Handle hardclock() events, skipping some if CPU is idle. */
> event = state->nexthard;
...
> /* Handle callout events. */
> if (event > state->nextcall)

We then go looping into this path, because state->nextcall is still equal to
"next" as in step 1), which is now in the past, until "now >=
state->nextcallopt" inside handleevents(), which clears this condition.

> event = state->nextcall;
...
> return (event);

--HPS
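The three steps above can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the kernel code: `struct timer_model` and the `model_*` functions are hypothetical names, `sbintime_t` and `SBT_MAX` are local stand-ins for the kernel definitions, and the reset condition in `model_handleevents()` follows the description in the message ("now >= state->nextcallopt").

```c
#include <stdint.h>

typedef int64_t sbintime_t;          /* stand-in for the kernel type */
#define SBT_MAX INT64_MAX            /* stand-in for the kernel constant */

/* Hypothetical, reduced copy of the per-CPU timer state. */
struct timer_model {
    sbintime_t nexthard;     /* time of the next hardclock() */
    sbintime_t nextcall;     /* time of the next callout */
    sbintime_t nextcallopt;  /* latest acceptable time for that callout */
};

/* Step 1: configtimer() points both callout fields at 'next'. */
static void
model_configtimer(struct timer_model *st, sbintime_t now, sbintime_t period)
{
    sbintime_t next = now + period;

    st->nexthard = next;
    st->nextcall = next;
    st->nextcallopt = next;
}

/* Step 2: the early-out leaves nextcall alone when bt is not earlier. */
static void
model_new_callout(struct timer_model *st, sbintime_t bt, sbintime_t bt_opt)
{
    st->nextcallopt = bt_opt;
    if (bt >= st->nextcall)
        return;
    st->nextcall = bt;
}

/* Step 3: the next event is the earlier of hardclock and callout. */
static sbintime_t
model_next_event(const struct timer_model *st)
{
    sbintime_t event = st->nexthard;

    if (event > st->nextcall)
        event = st->nextcall;
    return (event);
}

/* Timer interrupt: advance hardclock; reset callouts once nextcallopt is due. */
static void
model_handleevents(struct timer_model *st, sbintime_t now, sbintime_t period)
{
    while (st->nexthard <= now)
        st->nexthard += period;
    if (now >= st->nextcallopt)
        st->nextcall = st->nextcallopt = SBT_MAX;
}

/* Nonzero while the timer would be programmed for a time already past. */
static int
model_event_in_past(const struct timer_model *st, sbintime_t now)
{
    return (model_next_event(st) < now);
}
```

With a period of 1000, a callout at 5000 (optional time 6000) leaves nextcall at 1000, so at time 1500 the model still reports an event in the past; the stale value only disappears once time reaches 6000.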
Re: NFS 4.1
On 01/17/17 10:38, Michael Ware wrote:
> Good day,
> Does anyone know if NFS 4.1 (not 4.0) is available in FreeBSD 11? I have
> not been able to find any documentation around this.
> Thanks

Yes, though I'm not sure what specific feature you're looking for.
FreeBSD interoperates with my linux NFS 4.1 servers and clients
just fine.

man nfsv4

$ cat ~/bin/knuth-mount
#! /bin/sh

# man mount_nfs
MOUNT="mount_nfs -o nfsv4,minorversion=1"
#MOUNT="mount_nfs -o nfsv3"

NFS_SERVER_HOST=knuth
$MOUNT $NFS_SERVER_HOST:/export/packages /mnt/$NFS_SERVER_HOST/packages
$MOUNT $NFS_SERVER_HOST:/usr/src /usr/src
$MOUNT $NFS_SERVER_HOST:/usr/obj /usr/obj

HTH,
Russell
Re: Strange issue after early AP startup
On 01/17/17 16:50, John Baldwin wrote:
> Index: kern_clocksource.c
> ===
> --- kern_clocksource.c (revision 312301)
> +++ kern_clocksource.c (working copy)
> @@ -503,7 +503,12 @@ configtimer(int start)
>               state->nextevent = SBT_MAX;
>           else
>  #endif
> +         if (next < state->nextcall)
>               state->nextevent = next;
> +         else if (state->nextcall < now)
> +             state->nextevent = now;
> +         else
> +             state->nextevent = state->nextcall;
>           if (periodic)
>               state->nexttick = next;
>           else
> @@ -511,8 +516,6 @@ configtimer(int start)
>           state->nexthard = next;
>           state->nextstat = next;
>           state->nextprof = next;
> -         state->nextcall = next;
> -         state->nextcallopt = next;
>           hardclock_sync(cpu);
>       }
>       busy = 0;

This patch makes it worse. Now I don't even reach the login prompt.

--HPS
NFS 4.1
Good day,
Does anyone know if NFS 4.1 (not 4.0) is available in FreeBSD 11? I have
not been able to find any documentation around this.
Thanks

-- Mike
Re: Strange issue after early AP startup
On 01/17/17 16:50, John Baldwin wrote:
> (One odd thing is that even in your case the first call to handleevents(),
> the 'now >= state->nextcallout' check in handleevents() should be true
> which resets both nextcall and nextcallopt and invokes callout_process().)

Hi,

I suspect the cpu_new_callout() function is changing this condition after a
callout_reset() call, before handleevents() gets a chance to run.

I'll give your patch a spin right away and let you know how it goes.

--HPS
Re: Strange issue after early AP startup
On Monday, January 16, 2017 10:10:16 PM Hans Petter Selasky wrote:
> On 01/16/17 20:31, John Baldwin wrote:
> > On Monday, January 16, 2017 04:51:42 PM Hans Petter Selasky wrote:
> >> Hi,
> >>
> >> When booting I observe an additional 30-second delay after this print:
> >>
> >>> Timecounters tick every 1.000 msec
> >>
> >> ~30 second delay and boot continues like normal.
> >>
> >> Checking "vmstat -i" reveals that some timers have been running loose.
> >>
> >>> cpu0:timer 44300442
> >>> cpu1:timer 40561404
> >>> cpu3:timer 48462822 483058
> >>> cpu2:timer 48477898 483209
> >>
> >> Trying to add delays and/or prints around the Timecounters printout
> >> makes the issue go away. Any ideas for debugging?
> >
> > I have generally used KTR tracing to trace what is happening during
> > boot to debug EARLY_AP_STARTUP issues.
> >
>
> Hi John,
>
> What happens is that getnextcpuevent(0) keeps on returning
> "state->nextcall" which is in the past for CPU #2 and #3 on my box.
>
> In "cpu_new_callout()" there is a check if "bt >= state->nextcall",
> which I suspect is true, so "state->nextcall" never gets set to real
> minimum sbintime.
>
> The attached patch fixes the problem for me, but I'm not 100% sure if it
> is correct.

I think we want to be honoring any currently scheduled callouts. You could
do that by setting it to 'cc_firstevent' of the associated CPU, but in
practice 'state->nextcall' should already be set to that (it is initialized
to SBT_MAX in cpu_initclocks_bsp() and is then only set to other values due
to cpu_new_callout()). Keep in mind that configtimer() is not just called
from boot, but is also invoked when starting/stopping the profiling timer.

However, when setting 'nextevent' (which is used to schedule the next timer
interrupt), we should be honoring the existing 'nextcall' if it is sooner
than the next hardclock.

(One odd thing is that even in your case the first call to handleevents(),
the 'now >= state->nextcallout' check in handleevents() should be true
which resets both nextcall and nextcallopt and invokes callout_process().)

Here is a suggestion that attempts what I described in the first paragraph.
If you still get hangs it would be good to break into DDB and capture the
output of 'show clocksource'.

Index: kern_clocksource.c
===
--- kern_clocksource.c (revision 312301)
+++ kern_clocksource.c (working copy)
@@ -503,7 +503,12 @@ configtimer(int start)
              state->nextevent = SBT_MAX;
          else
 #endif
+         if (next < state->nextcall)
              state->nextevent = next;
+         else if (state->nextcall < now)
+             state->nextevent = now;
+         else
+             state->nextevent = state->nextcall;
          if (periodic)
              state->nexttick = next;
          else
@@ -511,8 +516,6 @@ configtimer(int start)
          state->nexthard = next;
          state->nextstat = next;
          state->nextprof = next;
-         state->nextcall = next;
-         state->nextcallopt = next;
          hardclock_sync(cpu);
      }
      busy = 0;

-- John Baldwin
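The three-way choice that the patch above makes for 'nextevent' can be written as a standalone function for experimenting outside the kernel. This is a minimal sketch under stated assumptions: `choose_nextevent()` is a hypothetical helper that does not exist in kern_clocksource.c, and `sbintime_t` and `SBT_MAX` are local stand-ins for the kernel definitions.

```c
#include <stdint.h>

typedef int64_t sbintime_t;   /* stand-in for the kernel type */
#define SBT_MAX INT64_MAX     /* stand-in for the kernel constant */

/*
 * Hypothetical helper mirroring the branches added by the patch:
 * choose the time of the next timer interrupt from the next hardclock
 * ('next') and the earliest pending callout ('nextcall').
 */
static sbintime_t
choose_nextevent(sbintime_t now, sbintime_t next, sbintime_t nextcall)
{
    if (next < nextcall)
        return (next);      /* the hardclock comes first */
    if (nextcall < now)
        return (now);       /* the callout is already overdue */
    return (nextcall);      /* a pending callout comes first */
}
```

For example, with now = 100 and next = 1100, an unset callout (SBT_MAX) yields 1100, a callout at 500 yields 500, and an overdue callout at 50 is clamped to 100.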
Re: Panic on boot current amd64
Am Tue, 17 Jan 2017 06:45:42 -0700 Sean Bruno schrieb: > On 01/17/17 02:10, O. Hartmann wrote: > > Am Mon, 16 Jan 2017 10:33:35 -0800 > > Manfred Antar schrieb: > > > >> From current today after changes to /sys/sys/gtaskqueue.h (r312293) I get > >> panic on > >> boot. reverting to r312235 boot ok > >> > >> random: harvesting attach, 8 bytes (4 bits) from uhub9 > >> ugen1.3: at usbus1 > >> kernel trap 12 with interrupts disabled > >> > >> > >> Fatal trap 12: page fault while in kernel mode > >> cpuid = 2; apic id = 02 > >> fault virtual address = 0x64 > >> fault code = supervisor read data, page not present > >> instruction pointer= 0x20:0x80660449 > >> stack pointer = 0x28:0xfe0466aa9010 > >> frame pointer = 0x28:0xfe0466aa9030 > >> code segment = base 0x0, limit 0xf, type 0x1b > >>= DPL 0, pres 1, long 1, def32 0, gran 1 > >> processor eflags = resume, IOPL = 0 > >> current process= 60445 (ifconfig) > >> [ thread pid 60445 tid 100131 ] > >> Stopped at grouptaskqueue_enqueue+0x19:cmpl$0,0x64(%rbx) > >> db> bt > >> Tracing pid 60445 tid 100131 td 0xf800088df500 > >> grouptaskqueue_enqueue() at grouptaskqueue_enqueue+0x19/frame > >> 0xfe0466aa9030 > >> em_intr() at em_intr+0x8c/frame 0xfe0466aa9060 > >> iflib_fast_intr() at iflib_fast_intr+0x2c/frame 0xfe0466aa9080 > >> intr_event_handle() at intr_event_handle+0x9b/frame 0xfe0466aa90d0 > >> intr_execute_handlers() at intr_execute_handlers+0x48/frame > >> 0xfe0466aa9100 > >> lapic_handle_intr() at lapic_handle_intr+0x3f/frame 0xfe0466aa9120 > >> Xapic_isr1() at Xapic_isr1+0xb7/frame 0xfe0466aa9120 > >> --- interrupt, rip = 0x809639ad, rsp = 0xfe0466aa91f0, rbp = > >> 0xfe0466aa9200 --- spinlock_exit() at spinlock_exit+0x2d/frame > >> 0xfe0466aa9200 > >> smp_rendezvous_cpus() at smp_rendezvous_cpus+0x272/frame 0xfe0466aa9270 > >> smp_rendezvous() at smp_rendezvous+0x40/frame 0xfe0466aa92a0 > >> counter_u64_alloc() at counter_u64_alloc+0x3e/frame 0xfe0466aa92c0 > >> rtentry_zinit() at rtentry_zinit+0x11/frame 
0xfe0466aa92e0 > >> keg_alloc_slab() at keg_alloc_slab+0x1e3/frame 0xfe0466aa9350 > >> keg_fetch_slab() at keg_fetch_slab+0x16e/frame 0xfe0466aa93a0 > >> zone_fetch_slab() at zone_fetch_slab+0x9e/frame 0xfe0466aa93e0 > >> zone_import() at zone_import+0x52/frame 0xfe0466aa9430 > >> uma_zalloc_arg() at uma_zalloc_arg+0x450/frame 0xfe0466aa94a0 > >> rtrequest1_fib() at rtrequest1_fib+0xfc/frame 0xfe0466aa95c0 > >> rtinit() at rtinit+0x390/frame 0xfe0466aa9740 > >> in_addprefix() at in_addprefix+0xef/frame 0xfe0466aa97b0 > >> in_control() at in_control+0x9dc/frame 0xfe0466aa9850 > >> ifioctl() at ifioctl+0xdcc/frame 0xfe0466aa98d0 > >> kern_ioctl() at kern_ioctl+0x274/frame 0xfe0466aa9950 > >> sys_ioctl() at sys_ioctl+0x13c/frame 0xfe0466aa9a20 > >> amd64_syscall() at amd64_syscall+0x488/frame 0xfe0466aa9bb0 > >> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe0466aa9bb0 > >> --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x4b194a, rsp = > >> 0x7fffe478, > >> rbp = 0x7fffe4d0 --- > >> db> > >> > >> > >> > >> > >> ___ > >> freebsd-current@freebsd.org mailing list > >> https://lists.freebsd.org/mailman/listinfo/freebsd-current > >> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org" > >> > > > > > > Has this been fixed? I bugs mee, too. > > I went back to r312235, too, which doesn't coredump. > > > > Regards, > > > > oh > > > > For the time being, I'm suggesting people add EARLY_AP_START to their > kernel configs as is in GENERIC. > > We are still debugging this. > > sean > EARLY_AP_START was like a death-sentence in earlier days to my configs on two systems in my SoHo, so I avoided adding it. -- O. Hartmann Ich widerspreche der Nutzung oder Übermittlung meiner Daten für Werbezwecke oder für die Markt- oder Meinungsforschung (§ 28 Abs. 4 BDSG). pgp6_T56kiISo.pgp Description: OpenPGP digital signature
Re: TSC as timecounter makes system lag
On 01/16/17 15:34, Jia-Shiun Li wrote:
> Yes. I noticed this because systat refreshes looked slower, and keystroke
> did not repeat smoothly for 30/s.

I've seen something similar. Does the attached patch make any difference?

Can you dump:

vmstat -i

Just after boot w/ and w/o the attached patch, when the keystroke did not
repeat smoothly.

--HPS

diff --git a/sys/kern/kern_clocksource.c b/sys/kern/kern_clocksource.c
index 7f7769d..454a130 100644
--- a/sys/kern/kern_clocksource.c
+++ b/sys/kern/kern_clocksource.c
@@ -511,7 +511,7 @@ configtimer(int start)
          state->nexthard = next;
          state->nextstat = next;
          state->nextprof = next;
-         state->nextcall = next;
+         state->nextcall = SBT_MAX;
          state->nextcallopt = next;
          hardclock_sync(cpu);
      }
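To see why the one-line change above might matter, the early-out check quoted elsewhere in this thread from cpu_new_callout() can be modelled in isolation. This is a rough sketch, not the kernel function: `struct callout_times` and `record_callout()` are hypothetical names, and `sbintime_t` and `SBT_MAX` are local stand-ins for the kernel definitions.

```c
#include <stdint.h>

typedef int64_t sbintime_t;   /* stand-in for the kernel type */
#define SBT_MAX INT64_MAX     /* stand-in for the kernel constant */

/* Hypothetical pair of fields taken from the per-CPU timer state. */
struct callout_times {
    sbintime_t nextcall;
    sbintime_t nextcallopt;
};

/*
 * Reduced form of the check quoted from cpu_new_callout(): nextcall is
 * only lowered when the new callout time is earlier than the stored one.
 */
static void
record_callout(struct callout_times *ct, sbintime_t bt, sbintime_t bt_opt)
{
    ct->nextcallopt = bt_opt;
    if (bt >= ct->nextcall)
        return;             /* stored value wins, even if it is stale */
    ct->nextcall = bt;
}
```

Starting from nextcall = next (say 1000), a callout at 5000 leaves nextcall at 1000; starting from nextcall = SBT_MAX, the same callout lowers nextcall to 5000, which is the behaviour the patch appears to be aiming for.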
Re: Panic on boot current amd64
On 01/17/17 02:10, O. Hartmann wrote: > Am Mon, 16 Jan 2017 10:33:35 -0800 > Manfred Antar schrieb: > >> From current today after changes to /sys/sys/gtaskqueue.h (r312293) I get >> panic on boot. >> reverting to r312235 boot ok >> >> random: harvesting attach, 8 bytes (4 bits) from uhub9 >> ugen1.3: at usbus1 >> kernel trap 12 with interrupts disabled >> >> >> Fatal trap 12: page fault while in kernel mode >> cpuid = 2; apic id = 02 >> fault virtual address= 0x64 >> fault code = supervisor read data, page not present >> instruction pointer = 0x20:0x80660449 >> stack pointer= 0x28:0xfe0466aa9010 >> frame pointer= 0x28:0xfe0466aa9030 >> code segment = base 0x0, limit 0xf, type 0x1b >> = DPL 0, pres 1, long 1, def32 0, gran 1 >> processor eflags = resume, IOPL = 0 >> current process = 60445 (ifconfig) >> [ thread pid 60445 tid 100131 ] >> Stopped at grouptaskqueue_enqueue+0x19:cmpl$0,0x64(%rbx) >> db> bt >> Tracing pid 60445 tid 100131 td 0xf800088df500 >> grouptaskqueue_enqueue() at grouptaskqueue_enqueue+0x19/frame >> 0xfe0466aa9030 >> em_intr() at em_intr+0x8c/frame 0xfe0466aa9060 >> iflib_fast_intr() at iflib_fast_intr+0x2c/frame 0xfe0466aa9080 >> intr_event_handle() at intr_event_handle+0x9b/frame 0xfe0466aa90d0 >> intr_execute_handlers() at intr_execute_handlers+0x48/frame >> 0xfe0466aa9100 >> lapic_handle_intr() at lapic_handle_intr+0x3f/frame 0xfe0466aa9120 >> Xapic_isr1() at Xapic_isr1+0xb7/frame 0xfe0466aa9120 >> --- interrupt, rip = 0x809639ad, rsp = 0xfe0466aa91f0, rbp = >> 0xfe0466aa9200 --- spinlock_exit() at spinlock_exit+0x2d/frame >> 0xfe0466aa9200 >> smp_rendezvous_cpus() at smp_rendezvous_cpus+0x272/frame 0xfe0466aa9270 >> smp_rendezvous() at smp_rendezvous+0x40/frame 0xfe0466aa92a0 >> counter_u64_alloc() at counter_u64_alloc+0x3e/frame 0xfe0466aa92c0 >> rtentry_zinit() at rtentry_zinit+0x11/frame 0xfe0466aa92e0 >> keg_alloc_slab() at keg_alloc_slab+0x1e3/frame 0xfe0466aa9350 >> keg_fetch_slab() at keg_fetch_slab+0x16e/frame 0xfe0466aa93a0 >> 
zone_fetch_slab() at zone_fetch_slab+0x9e/frame 0xfe0466aa93e0 >> zone_import() at zone_import+0x52/frame 0xfe0466aa9430 >> uma_zalloc_arg() at uma_zalloc_arg+0x450/frame 0xfe0466aa94a0 >> rtrequest1_fib() at rtrequest1_fib+0xfc/frame 0xfe0466aa95c0 >> rtinit() at rtinit+0x390/frame 0xfe0466aa9740 >> in_addprefix() at in_addprefix+0xef/frame 0xfe0466aa97b0 >> in_control() at in_control+0x9dc/frame 0xfe0466aa9850 >> ifioctl() at ifioctl+0xdcc/frame 0xfe0466aa98d0 >> kern_ioctl() at kern_ioctl+0x274/frame 0xfe0466aa9950 >> sys_ioctl() at sys_ioctl+0x13c/frame 0xfe0466aa9a20 >> amd64_syscall() at amd64_syscall+0x488/frame 0xfe0466aa9bb0 >> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe0466aa9bb0 >> --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x4b194a, rsp = >> 0x7fffe478, rbp = >> 0x7fffe4d0 --- >> db> >> >> >> >> >> ___ >> freebsd-current@freebsd.org mailing list >> https://lists.freebsd.org/mailman/listinfo/freebsd-current >> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org" > > > Has this been fixed? I bugs mee, too. > I went back to r312235, too, which doesn't coredump. > > Regards, > > oh > For the time being, I'm suggesting people add EARLY_AP_START to their kernel configs as is in GENERIC. We are still debugging this. sean signature.asc Description: OpenPGP digital signature
Re: r311568 makes freerdp very slow
On Fri, January 13, 2017 22:46, Jakob Alvermark wrote:
> On Fri, January 13, 2017 19:44, John Baldwin wrote:
>> On Friday, January 13, 2017 09:58:01 AM Jakob Alvermark wrote:
>>> On Thu, January 12, 2017 19:26, John Baldwin wrote:
>>>> On Thursday, January 12, 2017 12:42:11 PM Shawn Webb wrote:
>>>>> On Thu, Jan 12, 2017 at 06:05:08PM +0100, Jakob Alvermark wrote:
>>>>>> Hi,
>>>>>>
>>>>>> r311568 Set MORETOCOME for AIO write requests on a socket.
>>>>>>
>>>>>> After this commit freerdp is very slow.
>>>>>>
>>>>>> Before the password prompt would appear immediately when
>>>>>> connecting to a server. Now it takes 5-10 seconds. After
>>>>>> entering the password, another 5-10 seconds until I am
>>>>>> connected. Once connected, there is a considerable lag.
>>>>>>
>>>>>> What could be the problem?
>>>>>
>>>>> I don't know what the problem is, but I am seeing the same
>>>>> symptom.
>>>>
>>>> Can you get a ktrace of the freerdp process during this? The commit
>>>> should only be setting MORETOCOME if multiple aio_write requests are
>>>> queued to the same socket (so that TCP can batch them into a single
>>>> packet). However, it should not affect an application just calling
>>>> aio_write() on a socket once.
>>>>
>>>> -- John Baldwin
>>>
>>> Hi John,
>>>
>>> I got the ktrace, what do I do with it?
>>
>> kdump will generate a text representation, perhaps using 'kdump -s' to
>> not include dumps of raw I/O data. If you can put the output of kdump
>> at a URL I can fetch from then I can look at it.
>
> OK, here it is: http://filebin.ca/38mkuLau9Yqu/ktrace.out.xfreerdp.txt
>
> Thanks,
>
> Jakob

Hi,

Did you get any chance to look at this?

Thanks,
Jakob
Re: Panic on boot current amd64
On Mon, 16 Jan 2017 10:33:35 -0800, Manfred Antar wrote:

> From current today after changes to /sys/sys/gtaskqueue.h (r312293) I get
> panic on boot.
> reverting to r312235 boot ok
>
> random: harvesting attach, 8 bytes (4 bits) from uhub9
> ugen1.3: at usbus1
> kernel trap 12 with interrupts disabled
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 2; apic id = 02
> fault virtual address   = 0x64
> fault code              = supervisor read data, page not present
> instruction pointer     = 0x20:0x80660449
> stack pointer           = 0x28:0xfe0466aa9010
> frame pointer           = 0x28:0xfe0466aa9030
> code segment            = base 0x0, limit 0xf, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = resume, IOPL = 0
> current process         = 60445 (ifconfig)
> [ thread pid 60445 tid 100131 ]
> Stopped at      grouptaskqueue_enqueue+0x19:    cmpl    $0,0x64(%rbx)
> db> bt
> Tracing pid 60445 tid 100131 td 0xf800088df500
> grouptaskqueue_enqueue() at grouptaskqueue_enqueue+0x19/frame 0xfe0466aa9030
> em_intr() at em_intr+0x8c/frame 0xfe0466aa9060
> iflib_fast_intr() at iflib_fast_intr+0x2c/frame 0xfe0466aa9080
> intr_event_handle() at intr_event_handle+0x9b/frame 0xfe0466aa90d0
> intr_execute_handlers() at intr_execute_handlers+0x48/frame 0xfe0466aa9100
> lapic_handle_intr() at lapic_handle_intr+0x3f/frame 0xfe0466aa9120
> Xapic_isr1() at Xapic_isr1+0xb7/frame 0xfe0466aa9120
> --- interrupt, rip = 0x809639ad, rsp = 0xfe0466aa91f0, rbp = 0xfe0466aa9200 ---
> spinlock_exit() at spinlock_exit+0x2d/frame 0xfe0466aa9200
> smp_rendezvous_cpus() at smp_rendezvous_cpus+0x272/frame 0xfe0466aa9270
> smp_rendezvous() at smp_rendezvous+0x40/frame 0xfe0466aa92a0
> counter_u64_alloc() at counter_u64_alloc+0x3e/frame 0xfe0466aa92c0
> rtentry_zinit() at rtentry_zinit+0x11/frame 0xfe0466aa92e0
> keg_alloc_slab() at keg_alloc_slab+0x1e3/frame 0xfe0466aa9350
> keg_fetch_slab() at keg_fetch_slab+0x16e/frame 0xfe0466aa93a0
> zone_fetch_slab() at zone_fetch_slab+0x9e/frame 0xfe0466aa93e0
> zone_import() at zone_import+0x52/frame 0xfe0466aa9430
> uma_zalloc_arg() at uma_zalloc_arg+0x450/frame 0xfe0466aa94a0
> rtrequest1_fib() at rtrequest1_fib+0xfc/frame 0xfe0466aa95c0
> rtinit() at rtinit+0x390/frame 0xfe0466aa9740
> in_addprefix() at in_addprefix+0xef/frame 0xfe0466aa97b0
> in_control() at in_control+0x9dc/frame 0xfe0466aa9850
> ifioctl() at ifioctl+0xdcc/frame 0xfe0466aa98d0
> kern_ioctl() at kern_ioctl+0x274/frame 0xfe0466aa9950
> sys_ioctl() at sys_ioctl+0x13c/frame 0xfe0466aa9a20
> amd64_syscall() at amd64_syscall+0x488/frame 0xfe0466aa9bb0
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe0466aa9bb0
> --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x4b194a, rsp = 0x7fffe478, rbp = 0x7fffe4d0 ---
> db>

Me too here, with a customized kernel.

-- 
O. Hartmann

I object to the use or transfer of my data for advertising purposes or for
market or opinion research (§ 28(4) BDSG).
Re: Panic on boot current amd64
On Mon, 16 Jan 2017 10:33:35 -0800, Manfred Antar wrote:

> From current today after changes to /sys/sys/gtaskqueue.h (r312293) I get
> panic on boot.
> reverting to r312235 boot ok
>
> random: harvesting attach, 8 bytes (4 bits) from uhub9
> ugen1.3: at usbus1
> kernel trap 12 with interrupts disabled
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 2; apic id = 02
> fault virtual address   = 0x64
> fault code              = supervisor read data, page not present
> instruction pointer     = 0x20:0x80660449
> stack pointer           = 0x28:0xfe0466aa9010
> frame pointer           = 0x28:0xfe0466aa9030
> code segment            = base 0x0, limit 0xf, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = resume, IOPL = 0
> current process         = 60445 (ifconfig)
> [ thread pid 60445 tid 100131 ]
> Stopped at      grouptaskqueue_enqueue+0x19:    cmpl    $0,0x64(%rbx)
> db> bt
> Tracing pid 60445 tid 100131 td 0xf800088df500
> grouptaskqueue_enqueue() at grouptaskqueue_enqueue+0x19/frame 0xfe0466aa9030
> em_intr() at em_intr+0x8c/frame 0xfe0466aa9060
> iflib_fast_intr() at iflib_fast_intr+0x2c/frame 0xfe0466aa9080
> intr_event_handle() at intr_event_handle+0x9b/frame 0xfe0466aa90d0
> intr_execute_handlers() at intr_execute_handlers+0x48/frame 0xfe0466aa9100
> lapic_handle_intr() at lapic_handle_intr+0x3f/frame 0xfe0466aa9120
> Xapic_isr1() at Xapic_isr1+0xb7/frame 0xfe0466aa9120
> --- interrupt, rip = 0x809639ad, rsp = 0xfe0466aa91f0, rbp = 0xfe0466aa9200 ---
> spinlock_exit() at spinlock_exit+0x2d/frame 0xfe0466aa9200
> smp_rendezvous_cpus() at smp_rendezvous_cpus+0x272/frame 0xfe0466aa9270
> smp_rendezvous() at smp_rendezvous+0x40/frame 0xfe0466aa92a0
> counter_u64_alloc() at counter_u64_alloc+0x3e/frame 0xfe0466aa92c0
> rtentry_zinit() at rtentry_zinit+0x11/frame 0xfe0466aa92e0
> keg_alloc_slab() at keg_alloc_slab+0x1e3/frame 0xfe0466aa9350
> keg_fetch_slab() at keg_fetch_slab+0x16e/frame 0xfe0466aa93a0
> zone_fetch_slab() at zone_fetch_slab+0x9e/frame 0xfe0466aa93e0
> zone_import() at zone_import+0x52/frame 0xfe0466aa9430
> uma_zalloc_arg() at uma_zalloc_arg+0x450/frame 0xfe0466aa94a0
> rtrequest1_fib() at rtrequest1_fib+0xfc/frame 0xfe0466aa95c0
> rtinit() at rtinit+0x390/frame 0xfe0466aa9740
> in_addprefix() at in_addprefix+0xef/frame 0xfe0466aa97b0
> in_control() at in_control+0x9dc/frame 0xfe0466aa9850
> ifioctl() at ifioctl+0xdcc/frame 0xfe0466aa98d0
> kern_ioctl() at kern_ioctl+0x274/frame 0xfe0466aa9950
> sys_ioctl() at sys_ioctl+0x13c/frame 0xfe0466aa9a20
> amd64_syscall() at amd64_syscall+0x488/frame 0xfe0466aa9bb0
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe0466aa9bb0
> --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x4b194a, rsp = 0x7fffe478, rbp = 0x7fffe4d0 ---
> db>

Has this been fixed? It bugs me, too. I also went back to r312235, which
doesn't coredump.

Regards,

oh

-- 
O. Hartmann

I object to the use or transfer of my data for advertising purposes or for
market or opinion research (§ 28(4) BDSG).