buildworld is broken ?
Hi,

subj

head, amd64
Revision: 245588

protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmopcode.c
cc -O2 -pipe -DACPI_ASL_COMPILER -I. -I/usr/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99 -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c
cc1: warnings being treated as errors
/usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c: In function 'AcpiDmIsResourceTemplate':
/usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:419: warning: dereferencing type-punned pointer will break strict-aliasing rules
*** [dmresrc.o] Error code 1

Stop in /usr/src/usr.sbin/acpi/iasl.
*** [all] Error code 1

Stop in /usr/src/usr.sbin/acpi.
*** [all] Error code 1

Stop in /usr/src/usr.sbin.
*** [usr.sbin.all__D] Error code 1

Stop in /usr/src.
*** [everything] Error code 1

Stop in /usr/src.
*** [buildworld] Error code 1

--
wbr, tiger
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: panic after r244584
On Fri, Jan 18, 2013 at 09:36:00AM +0200, Vitalij Satanivskij wrote:
V> Hello.
V>
V> After upgrading server from old hardware/software to freebsd current (## SVN ## Exported commit - http://svnweb.freebsd.org/changeset/base/245479),
V> system hung's with message -
V> panic: make_dev_alias_v: bad si_name (error=22 si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7)

EINVAL (22) is caused by the space character in the si_name:

si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7
                                                      ^

I think Alexander (in Cc) has an idea of why that happened and how it should be fixed.

--
Totus tuus, Glebius.
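The failure mode described above can be illustrated with a minimal userland sketch. This is not the kernel's code: `sketch_check_devname` is a hypothetical name, and the rule it applies (reject empty names and names containing whitespace with EINVAL) is only an assumption based on what this thread says about the r244584 check.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical validator (an assumption for illustration, not the
 * code from kern_conf.c): returns 0 if the name is acceptable and
 * EINVAL if it is empty or contains any whitespace character.
 */
int
sketch_check_devname(const char *name)
{
	const unsigned char *p;

	assert(name != NULL);
	if (*name == '\0')
		return (EINVAL);
	for (p = (const unsigned char *)name; *p != '\0'; p++) {
		if (isspace(*p))
			return (EINVAL);
	}
	return (0);
}
```

Under these assumptions, the si_name from the panic message is rejected because of the space in "Slot 01", while the same path with "Slot_01" would pass.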
Re: panic after r244584
Alexander Motin wrote:
AM> On 18.01.2013 11:44, Gleb Smirnoff wrote:
AM> > On Fri, Jan 18, 2013 at 09:36:00AM +0200, Vitalij Satanivskij wrote:
AM> > V> After upgrading server from old hardware/software to freebsd current (## SVN ## Exported commit - http://svnweb.freebsd.org/changeset/base/245479),
AM> > V> system hung's with message -
AM> > V> panic: make_dev_alias_v: bad si_name (error=22 si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7)
AM> >
AM> > EINVAL (22) is caused by space character in the si_name:
AM> >
AM> > si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7
AM> >
AM> > I think Alexander (in Cc) has idea on why did that happen and how
AM> > should that be fixed.
AM>
AM> The panic is triggered by the check added by the recent r244584 change.
AM> The space in device name came from the enclosure device, and I guess it
AM> may be quite often situation. Using human readable name supposed to help
AM> system administrators, but with spaces banned that may be a problem.
AM>

That name was not created by a human; it was generated (I think) by the system.

Maybe the problem is not in r244584 at all, but in incorrect generation of the si_name?

More info: the drive (actually drives; all 36 have the same problem) is inserted in the backplane of a Supermicro chassis with an LSI CORP SAS2X36 0417 on board.

All of them are attached to an LSI SAS 9211-4i controller in HBA mode.
Re: panic after r244584
On 18.01.2013 13:39, Vitalij Satanivskij wrote:
> Alexander Motin wrote:
> AM> On 18.01.2013 11:44, Gleb Smirnoff wrote:
> AM> > On Fri, Jan 18, 2013 at 09:36:00AM +0200, Vitalij Satanivskij wrote:
> AM> > V> After upgrading server from old hardware/software to freebsd current (## SVN ## Exported commit - http://svnweb.freebsd.org/changeset/base/245479),
> AM> > V> system hung's with message -
> AM> > V> panic: make_dev_alias_v: bad si_name (error=22 si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7)
> AM> >
> AM> > EINVAL (22) is caused by space character in the si_name:
> AM> >
> AM> > si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7
> AM> >
> AM> > I think Alexander (in Cc) has idea on why did that happen and how
> AM> > should that be fixed.
> AM>
> AM> The panic is triggered by the check added by the recent r244584 change.
> AM> The space in device name came from the enclosure device, and I guess it
> AM> may be quite often situation. Using human readable name supposed to help
> AM> system administrators, but with spaces banned that may be a problem.
>
> That's was not created by human, it was generated (I think so) by system.

These strings are flashed into the enclosure firmware by the manufacturer.

> May be problem not in r244584 at all but in incorect generation of the si_name ?

Maybe. But before r244584 it didn't cause panics and most people were happy, except devctl consumers, who can't parse these events properly.

> More info drive (actualy drives, all 36 have same problem) inserted in backplane on supermicro chasis with LSI CORP SAS2X36 0417 on board.
>
> All of them attached to lsi sas 9211-4i controler in HBA mode.

--
Alexander Motin
Re: buildworld is broken ?
On Fri, Jan 18, 2013 at 11:30:17AM +0300, Sergey V. Dyatko wrote:
S> subj
S> head, amd64 Revision: 245588

Works for me:

Revision: 245593
Last Changed Rev: 245584
Last Changed Date: 2013-01-18 06:36:06 +0400 (Fri, 18 Jan 2013)

Also, there are no tinderbox complaints on the mailing list.

--
Totus tuus, Glebius.
Re: panic after r244584
On 18.01.2013 11:44, Gleb Smirnoff wrote:
> On Fri, Jan 18, 2013 at 09:36:00AM +0200, Vitalij Satanivskij wrote:
> V> After upgrading server from old hardware/software to freebsd current (## SVN ## Exported commit - http://svnweb.freebsd.org/changeset/base/245479),
> V> system hung's with message -
> V> panic: make_dev_alias_v: bad si_name (error=22 si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7)
>
> EINVAL (22) is caused by space character in the si_name:
>
> si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7
>
> I think Alexander (in Cc) has idea on why did that happen and how should that be fixed.

The panic is triggered by the check added by the recent r244584 change. The space in the device name came from the enclosure device, and I guess that may be quite a common situation. Using a human-readable name is supposed to help system administrators, but with spaces banned that may be a problem.

--
Alexander Motin
[head tinderbox] failure on ia64/ia64
TB --- 2013-01-18 10:27:26 - tinderbox 2.10 running on freebsd-current.sentex.ca
TB --- 2013-01-18 10:27:26 - FreeBSD freebsd-current.sentex.ca 8.3-PRERELEASE FreeBSD 8.3-PRERELEASE #0: Mon Mar 26 13:54:12 EDT 2012 d...@freebsd-current.sentex.ca:/usr/obj/usr/src/sys/GENERIC amd64
TB --- 2013-01-18 10:27:26 - starting HEAD tinderbox run for ia64/ia64
TB --- 2013-01-18 10:27:26 - cleaning the object tree
TB --- 2013-01-18 10:27:26 - /usr/local/bin/svn stat /src
TB --- 2013-01-18 10:27:29 - At svn revision 245589
TB --- 2013-01-18 10:27:30 - building world
TB --- 2013-01-18 10:27:30 - CROSS_BUILD_TESTING=YES
TB --- 2013-01-18 10:27:30 - MAKEOBJDIRPREFIX=/obj
TB --- 2013-01-18 10:27:30 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
TB --- 2013-01-18 10:27:30 - SRCCONF=/dev/null
TB --- 2013-01-18 10:27:30 - TARGET=ia64
TB --- 2013-01-18 10:27:30 - TARGET_ARCH=ia64
TB --- 2013-01-18 10:27:30 - TZ=UTC
TB --- 2013-01-18 10:27:30 - __MAKE_CONF=/dev/null
TB --- 2013-01-18 10:27:30 - cd /src
TB --- 2013-01-18 10:27:30 - /usr/bin/make -B buildworld
Building an up-to-date make(1)
World build started on Fri Jan 18 10:27:34 UTC 2013
Rebuilding the temporary build tree
stage 1.1: legacy release compatibility shims
stage 1.2: bootstrap tools
stage 2.1: cleaning up the object tree
stage 2.2: rebuilding the object tree
stage 2.3: build tools
stage 3: cross tools
stage 4.1: building includes
stage 4.2: building libraries
stage 4.3: make dependencies
stage 4.4: building everything
[...]
cc -O2 -pipe -DACPI_ASL_COMPILER -I. -I/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmbuffer.c
cc -O2 -pipe -DACPI_ASL_COMPILER -I. -I/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmdeferred.c
cc -O2 -pipe -DACPI_ASL_COMPILER -I. -I/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmnames.c
cc -O2 -pipe -DACPI_ASL_COMPILER -I. -I/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmopcode.c
cc -O2 -pipe -DACPI_ASL_COMPILER -I. -I/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c
cc1: warnings being treated as errors
/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c: In function 'AcpiDmIsResourceTemplate':
/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:419: warning: dereferencing type-punned pointer will break strict-aliasing rules
*** [dmresrc.o] Error code 1

Stop in /src/usr.sbin/acpi/iasl.
*** [all] Error code 1

Stop in /src/usr.sbin/acpi.
*** [all] Error code 1

Stop in /src/usr.sbin.
*** [usr.sbin.all__D] Error code 1

Stop in /src.
*** [everything] Error code 1

Stop in /src.
*** Error code 1

Stop in /src.
TB --- 2013-01-18 11:55:37 - WARNING: /usr/bin/make returned exit code 1
TB --- 2013-01-18 11:55:37 - ERROR: failed to build world
TB --- 2013-01-18 11:55:37 - 4029.16 user 947.42 system 5290.92 real

http://tinderbox.freebsd.org/tinderbox-head-ss-build-HEAD-ia64-ia64.full
Re: buildworld is broken ?
On Fri, 18 Jan 2013 15:47:13 +0400
Gleb Smirnoff gleb...@freebsd.org wrote:

> On Fri, Jan 18, 2013 at 11:30:17AM +0300, Sergey V. Dyatko wrote:
> S> subj
> S> head, amd64 Revision: 245588
>
> Works for me:
>
> Revision: 245593
> Last Changed Rev: 245584
> Last Changed Date: 2013-01-18 06:36:06 +0400 (Fri, 18 Jan 2013)

strange :(

laptop# cd /usr/src/usr.sbin/acpi/iasl
laptop# make
cc -O2 -pipe -DACPI_ASL_COMPILER -I. -I/usr/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99 -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c
cc1: warnings being treated as errors
/usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c: In function 'AcpiDmIsResourceTemplate':
/usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:419: warning: dereferencing type-punned pointer will break strict-aliasing rules
*** [dmresrc.o] Error code 1

Stop in /usr/src/usr.sbin/acpi/iasl.

> Also, there is not tinderbox complaints on the mailing list.

Subject: [head tinderbox] failure on ia64/ia64
Date: Fri, 18 Jan 2013 11:55:37 GMT

--
wbr, tiger
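For readers unfamiliar with the warning in the log: the thread does not show the expression at dmresrc.c:419, so the following is only a generic sketch of the class of problem, not the ACPICA code or its eventual fix. `sketch_float_bits` is a hypothetical helper; it assumes a 32-bit IEEE 754 `float`. It reads an object's bytes through `memcpy` instead of dereferencing a cast pointer, which is one common way to avoid the type-punning that GCC complains about when strict aliasing is in effect at -O2.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * The pattern that typically draws "dereferencing type-punned pointer
 * will break strict-aliasing rules" looks like this (illustrative only):
 *
 *     uint32_t bits = *(uint32_t *)&f;
 *
 * Hypothetical alternative: copy the bytes instead of aliasing them.
 */
uint32_t
sketch_float_bits(float f)
{
	uint32_t bits;

	/* Assumption of this sketch: float is 32 bits wide. */
	assert(sizeof(f) == sizeof(bits));
	memcpy(&bits, &f, sizeof(bits));
	return (bits);
}
```

Compilers generally turn the `memcpy` into a plain register move, so this form usually costs nothing at run time.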
Re: [RFC/RFT] calloutng
On Thu, 17 Jan 2013, Ian Lepore wrote:

> On Mon, 2013-01-14 at 11:38 +1100, Bruce Evans wrote:
>> Er, timecounters are called with a spin mutex held in existing code:
>> though it is dangerous to do so, timecounters are called from fast
>> interrupt handlers for very timekeeping-critical purposes:
>> - to implement the TIOCTIMESTAMP ioctl (except this is broken in
>>   -current). This was a primitive version of pps timestamping.
>> - for pps timestamping. The interrupt handler (which should be a fast
>>   interrupt handler to minimize latency) calls pps_capture() which
>>   calls tc_get_timecount() and does other lock-free accesses to the
>>   timecounter state. This still works in -current (at least there
>>   is still code for it).
>
> Unfortunately, calling pps_capture() in the primary interrupt context is
> no longer an option with the stock pps driver. Ever since the ppbus
> rewrite all ppbus children must use threaded handlers. I tried to fix
> that a couple different ways, and both ended up with crazy-complex code

Hmm, I didn't notice that ppc supported pps (I try not to look at it since it is ugly :-), and don't know of any version of it that uses non-threaded handlers (except in FreeBSD-4 and before, where normal interrupt handlers were non-threaded, so ppc had their high latency but not the even higher latency and overheads of threaded handlers).

OTOH, my x86 RTC interrupt handler is threaded and supports pps, and I haven't noticed any latency problems with this. It just can't possibly give the ~1 usec jitter that FreeBSD-[3-4] could give ~15 years ago using a fast interrupt handler (there must be only 1 device using a fast interrupt handler, with this dedicated to pps, else the multiple fast interrupt handlers will give latency much larger than ~1 usec to each other). I don't actually use this for anything except testing whether the RTC can be used for a poor man's pps.

> scattered around the ppbus family just to support the rarely-used pps
> capture. It would have been easier to do if filter and threaded
> interrupt handlers had the same function signature.
>
> I ended up writting a separate driver that can be used instead of ppc +
> ppbus + pps, since anyone who cares about precise pps capture is
> unlikely to be sharing the port with a printer or plip device or some
> such.

Probably all pps handlers should be special. On x86 with reasonable timecounter hardware, say a TSC, it takes about 10 instructions for an entire pps interrupt handler:

XintrN:
        pushl   %eax
        pushl   %edx
        rdtsc
        # Need some ugliness for EOI here or later.
        ss:movl %eax,ppscap     # Hopefully lock-free via time-domain locking.
        ss:movl %edx,ppscap+4
        popl    %edx
        popl    %eax
        iret

After capturing the timecounter hardware value here, you convert it to a pps event at leisure. But since this only happens once per second, it wouldn't be very inefficient to turn the interrupt handler into a slow high-latency one, even a threaded one, to handle the pps event and/or other devices attached to the interrupt.

OTOH, all drivers that call pps_capture() from their interrupt handler then immediately call pps_event(). This has always been very broken, and became even more broken with SMPng. pps_event() does many more timecounter and pps accesses whose locking is unclear at best, and in some configurations it calls hardpps(), which is only locked by Giant, despite comments in kern_ntptime.c still saying that it (and many other functions in kern_ntptime.c) must be called at splclock() or higher. splclock() is of course now null, but the locking requirements in kern_ntptime.c haven't changed much. kern_ntptime.c always needed to be locked by the equivalent of a spin mutex, which is stronger locking than was given by splclock(). pps_event() would have to acquire the spin mutex before calling hardpps(), although this is bad for fast interrupt handlers. The correct implementation is probably to only do the capture part from fast interrupt handlers.

> In my rewritten dedicated pps driver I call pps_capture() from the
> filter handler and pps_event() from the threaded handler. I never found

That seems right.

> any good documentation on the low-level details of this stuff, and there
> isn't enough good example code to work from. My hazy memory is that I

There seem to be no good examples.

> ended up studying the pps_capture() and pps_event() code enough to infer
> that their design intent seems to be to allow you to capture with no
> locking and do the event processing later in some sort of deferred or
> threaded context.

That seems to be the design, but there are no examples of separating the event from the capture. I think the correct locking is:
- capture in a fast interrupt handler, into a per-device state that is
  locked by whatever locks all of the state accessed by the fast
  interrupt handler
- switch to a less critical context later:
  - lock this step
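The split discussed in this message (a minimal capture step in the fast handler, with the expensive event processing deferred) can be sketched in plain userland C. Everything here is hypothetical: `struct pps_sketch`, `sketch_capture()` and `sketch_event()` are illustration names, not the kernel's `struct pps_state`, `pps_capture()` or `pps_event()`, and the sketch is single-threaded, so it deliberately leaves out the locking that the message says is the hard part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device state; not the kernel's struct pps_state. */
struct pps_sketch {
	uint64_t raw_count;	/* raw counter value saved at capture time */
	int	 pending;	/* nonzero while a capture awaits processing */
};

/*
 * Capture step: the only work a fast (filter) handler would do.
 * It stores the raw counter value and marks it pending.
 */
void
sketch_capture(struct pps_sketch *s, uint64_t raw_count)
{
	assert(s != NULL);
	s->raw_count = raw_count;
	s->pending = 1;
}

/*
 * Event step: run later from a less critical context.
 * Converts the raw count to nanoseconds; returns 1 if an event was
 * processed and 0 if nothing was pending.  Assumes freq_hz is small
 * enough (below about 1.8e10) that rem * 1e9 fits in 64 bits.
 */
int
sketch_event(struct pps_sketch *s, uint64_t freq_hz, uint64_t *ns_out)
{
	uint64_t secs, rem;

	assert(s != NULL && ns_out != NULL && freq_hz > 0);
	if (!s->pending)
		return (0);
	secs = s->raw_count / freq_hz;
	rem = s->raw_count % freq_hz;
	*ns_out = secs * UINT64_C(1000000000) +
	    rem * UINT64_C(1000000000) / freq_hz;
	s->pending = 0;
	return (1);
}
```

In a real driver the two steps would run in different contexts, so the shared state would need whatever protection the fast handler's other state uses; the sketch only shows how little the capture side has to do.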
Re: panic after r244584
On 2013-01-18, Alexander Motin wrote:
> > AM> V> panic: make_dev_alias_v: bad si_name (error=22 si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7)
> > AM> The panic is triggered by the check added by the recent r244584 change.
> > AM> The space in device name came from the enclosure device, and I guess it
> > AM> may be quite often situation. Using human readable name supposed to help
> > AM> system administrators, but with spaces banned that may be a problem.
> >
> > That's was not created by human, it was generated (I think so) by system.
>
> These strings are flashed into enclosure firmware by manufacturer.

You can't rely on any string being safe to use as a device name, even if spaces were allowed. Consider, for example, duplicate names and "../".

Where are these names generated? The original report didn't contain a backtrace.

--
Jaakko
Re: panic after r244584
On 18.01.2013 15:19, Jaakko Heinonen wrote:
> On 2013-01-18, Alexander Motin wrote:
> > > AM> V> panic: make_dev_alias_v: bad si_name (error=22 si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7)
> > > AM> The panic is triggered by the check added by the recent r244584 change.
> > > AM> The space in device name came from the enclosure device, and I guess it
> > > AM> may be quite often situation. Using human readable name supposed to help
> > > AM> system administrators, but with spaces banned that may be a problem.
> > >
> > > That's was not created by human, it was generated (I think so) by system.
> >
> > These strings are flashed into enclosure firmware by manufacturer.
>
> You can't rely on that any string can be safely used as a device name even if spaces were allowed. Consider for example duplicate names and ../.
>
> Where these names are generated? The original report didn't contain a backtrace.

At cam/scsi/ses_set_physpath.c ses_set_physpath(). Duplicate names are impossible there, as previous name components are unique. Special characters haven't been seen yet, but I think they are theoretically possible. It would be interesting to know what Solaris does in such cases: does it mangle them somehow or remove them completely?

--
Alexander Motin
Re: panic after r244584
Jaakko Heinonen wrote:
JH> On 2013-01-18, Alexander Motin wrote:
JH> > > AM> V> panic: make_dev_alias_v: bad si_name (error=22 si_name=enc@n5003048000bab37d/tpe0/slot@1/elmdesc@Slot 01/pass7)
JH>
JH> > > AM> The panic is triggered by the check added by the recent r244584 change.
JH> > > AM> The space in device name came from the enclosure device, and I guess it
JH> > > AM> may be quite often situation. Using human readable name supposed to help
JH> > > AM> system administrators, but with spaces banned that may be a problem.
JH> > >
JH> > > That's was not created by human, it was generated (I think so) by system.
JH> >
JH> > These strings are flashed into enclosure firmware by manufacturer.
JH>
JH> You can't rely on that any string can be safely used as a device name
JH> even if spaces were allowed. Consider for example duplicate names and
JH> ../.
JH>
JH> Where these names are generated? The original report didn't contain a
JH> backtrace.

Yes. There is no backtrace, because all debugging was switched off in the kernel.

For now I can't use that server for testing, but there are other servers waiting for upgrade.

I will try to reproduce the problem with the kernel debugger enabled.
Re: panic after r244584
Maybe just sanitize elmpriv->descr?

Something like changing whitespace to _, or just deleting it?
Re: My panic in amd64/pmap
on 17/01/2013 21:50 Larry Rosenman said the following:
> I've now seen this panic:
> pmap_insert_pt_page: pindex already inserted
>
> on 9.1-RELEASE, 9.1-STABLE, and 10.0-CURRENT
>
> I've got vmcore's from the 9.1-STABLE and 10.0-CURRENT VM's available as well as sources.
>
> I have the core.txt.* files available at:
> http://www.lerctr.org/~ler/core.txt.0 (10.0)
> http://www.lerctr.org/~ler/core.txt.2 (9.1-S)
>
> I'm not sure what other debug info you need. I can provide SSH access to both VM's as well as the host.
>
> These are all in VirtualBox 4.2.6 VM's
>
> Any help would be appreciated.

Hmm, I wonder if VirtualBox is hitting the same popcnt bug that was fixed in qemu...
Could you please try a patch from here
http://thread.gmane.org/gmane.comp.emulators.qemu/174532/focus=174567 ?
It should be applied to src/recompiler/target-i386/translate.c, make sure that it goes to a section marked as 'case 0x1b8: /* SSE4.2 popcnt */'.

--
Andriy Gapon
Re: My panic in amd64/pmap
On 2013-01-18 08:17, Andriy Gapon wrote:
> on 17/01/2013 21:50 Larry Rosenman said the following:
> > I've now seen this panic:
> > pmap_insert_pt_page: pindex already inserted
> >
> > on 9.1-RELEASE, 9.1-STABLE, and 10.0-CURRENT
> >
> > I've got vmcore's from the 9.1-STABLE and 10.0-CURRENT VM's available as well as sources.
> >
> > I have the core.txt.* files available at:
> > http://www.lerctr.org/~ler/core.txt.0 (10.0)
> > http://www.lerctr.org/~ler/core.txt.2 (9.1-S)
> >
> > I'm not sure what other debug info you need. I can provide SSH access to both VM's as well as the host.
> >
> > These are all in VirtualBox 4.2.6 VM's
> >
> > Any help would be appreciated.
>
> Hmm, I wonder if VirtualBox is hitting the same popcnt bug that was fixed in qemu...
> Could you please try a patch from here
> http://thread.gmane.org/gmane.comp.emulators.qemu/174532/focus=174567 ?
> It should be applied to src/recompiler/target-i386/translate.c, make sure that it goes to a section marked as 'case 0x1b8: /* SSE4.2 popcnt */'.

Should this be on the host or the guest?

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 (c) E-Mail: l...@lerctr.org
US Mail: 430 Valona Loop, Round Rock, TX 78681-3893
Re: panic after r244584
On 18.01.2013 15:49, Vitalij Satanivskij wrote:
> May be just do sanitizing for elmpriv->descr?
>
> something like change whitespace to _ or just delete it?

Yes, that is not difficult. The only question is how to stay consistent, compatible, and user-readable.

--
Alexander Motin
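The whitespace-to-underscore idea proposed above could look roughly like the following userland sketch. It is not the change that went into the tree: `sketch_sanitize_descr` is a hypothetical name, and also mapping '/' to '_' is an extra assumption prompted by the "../" concern raised earlier in the thread.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sanitizer: copies src into dst (at most dstlen - 1
 * characters plus a terminating NUL), replacing whitespace and '/'
 * with '_'.  Returns the number of characters written, excluding
 * the NUL.
 */
size_t
sketch_sanitize_descr(char *dst, size_t dstlen, const char *src)
{
	size_t n = 0;

	assert(dst != NULL && src != NULL && dstlen > 0);
	while (src[n] != '\0' && n + 1 < dstlen) {
		unsigned char c = (unsigned char)src[n];

		dst[n] = (isspace(c) || c == '/') ? '_' : (char)c;
		n++;
	}
	dst[n] = '\0';
	return (n);
}
```

Replacing rather than deleting keeps "Slot 01" and "Slot01" distinct after sanitizing, which touches on the consistency question raised in the reply; two descriptors that differ only in the kind of whitespace would still collapse to the same name.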
Re: panic after r244584
Alexander Motin wrote:
AM> On 18.01.2013 15:49, Vitalij Satanivskij wrote:
AM> > May be just do sanitizing for elmpriv->descr?
AM> >
AM> > something like change whitespace to _ or just delete it?
AM>
AM> Yes, that is not difficult. The only question is how to stay consistent,
AM> compatible, user-readable.
AM>

OK, now I have a kernel dump:

kgdb /boot/kernel/kernel vmcore.0
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB. Type show warranty for details.
This GDB was configured as amd64-marcel-freebsd...

Unread portion of the kernel message buffer:
da0 at mps0 bus 0 scbus7 target 8 lun 0
da0: ATA ST3500630NS G Fixed Direct Access SCSI-6 device
da0: 300.000MB/s transfers
da0: Command Queueing enabled
da0: 476940MB (976773168 512 byte sectors: 255H 63S/T 60801C)
panic: make_dev_alias_v: bad si_name (error=22, si_name=enc@n5003048000baa87d/type@0/slot@a/elmdesc@Slot 10/pass7)
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xff9b9ec84760
kdb_backtrace() at kdb_backtrace+0x39/frame 0xff9b9ec84810
vpanic() at vpanic+0x127/frame 0xff9b9ec84850
panic() at panic+0x43/frame 0xff9b9ec848b0
make_dev_alias_v() at make_dev_alias_v+0x1d0/frame 0xff9b9ec84900
make_dev_alias_p() at make_dev_alias_p+0x37/frame 0xff9b9ec84960
make_dev_physpath_alias() at make_dev_physpath_alias+0x14a/frame 0xff9b9ec849c0
pass_add_physpath() at pass_add_physpath+0xbd/frame 0xff9b9ec849f0
taskqueue_run_locked() at taskqueue_run_locked+0xf0/frame 0xff9b9ec84a40
taskqueue_thread_loop() at taskqueue_thread_loop+0x6c/frame 0xff9b9ec84a70
fork_exit() at fork_exit+0x84/frame 0xff9b9ec84ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xff9b9ec84ab0
--- trap 0, rip = 0, rsp = 0xff9b9ec84b70, rbp = 0 ---
KDB: enter: panic
Re: panic after r244584
Vitalij Satanivskij wrote: VS Alexander Motin wrote: VS AM On 18.01.2013 15:49, Vitalij Satanivskij wrote: VS AM May be just do sanitizing for elmpriv-descr? VS AM VS AM something like change whitespace to _ or just delete it? VS AM VS AM Yes, that is not difficult. The only question is how to stay consistent, VS AM compatible, user-readable. VS AM VS VS Ok, now I have kernel dump VS VS kgdb /boot/kernel/kernel vmcore.0 VS GNU gdb 6.1.1 [FreeBSD] VS Copyright 2004 Free Software Foundation, Inc. VS GDB is free software, covered by the GNU General Public License, and you are VS welcome to change it and/or distribute copies of it under certain conditions. VS Type show copying to see the conditions. VS There is absolutely no warranty for GDB. Type show warranty for details. VS This GDB was configured as amd64-marcel-freebsd... VS VS Unread portion of the kernel message buffer: VS da0 at mps0 bus 0 scbus7 target 8 lun 0 VS da0: ATA ST3500630NS G Fixed Direct Access SCSI-6 device VS da0: 300.000MB/s transfers VS da0: Command Queueing enabled VS da0: 476940MB (976773168 512 byte sectors: 255H 63S/T 60801C) VS panic: make_dev_alias_v: bad si_name (error=22, si_name=enc@n5003048000baa87d/type@0/slot@a/elmdesc@Slot 10/pass7) VS cpuid = 0 VS KDB: stack backtrace: VS db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xff9b9ec84760 VS kdb_backtrace() at kdb_backtrace+0x39/frame 0xff9b9ec84810 VS vpanic() at vpanic+0x127/frame 0xff9b9ec84850 VS panic() at panic+0x43/frame 0xff9b9ec848b0 VS make_dev_alias_v() at make_dev_alias_v+0x1d0/frame 0xff9b9ec84900 VS make_dev_alias_p() at make_dev_alias_p+0x37/frame 0xff9b9ec84960 VS make_dev_physpath_alias() at make_dev_physpath_alias+0x14a/frame 0xff9b9ec849c0 VS pass_add_physpath() at pass_add_physpath+0xbd/frame 0xff9b9ec849f0 VS taskqueue_run_locked() at taskqueue_run_locked+0xf0/frame 0xff9b9ec84a40 VS taskqueue_thread_loop() at taskqueue_thread_loop+0x6c/frame 0xff9b9ec84a70 VS fork_exit() at fork_exit+0x84/frame 
0xff9b9ec84ab0 VS fork_trampoline() at fork_trampoline+0xe/frame 0xff9b9ec84ab0 VS --- trap 0, rip = 0, rsp = 0xff9b9ec84b70, rbp = 0 --- VS KDB: enter: panic VS VS And of couse (kgdb) bt #0 doadump (textdump=0) at pcpu.h:229 #1 0x8034002e in db_dump (dummy=value optimized out, dummy2=0, dummy3=0, dummy4=0x0) at /usr/src/sys/ddb/db_command.c:543 #2 0x8033fada in db_command (last_cmdp=value optimized out, cmd_table=value optimized out, dopager=1) at /usr/src/sys/ddb/db_command.c:449 #3 0x8033f892 in db_command_loop () at /usr/src/sys/ddb/db_command.c:502 #4 0x80342240 in db_trap (type=value optimized out, code=0) at /usr/src/sys/ddb/db_main.c:231 #5 0x808b9753 in kdb_trap (type=3, code=0, tf=value optimized out) at /usr/src/sys/kern/subr_kdb.c:654 #6 0x80c0d3b8 in trap (frame=0xff9b9ec84740) at /usr/src/sys/amd64/amd64/trap.c:579 #7 0x80bf6512 in calltrap () at exception.S:228 #8 0x808b8f3e in kdb_enter (why=0x80e7adb1 panic, msg=value optimized out) at cpufunc.h:63 #9 0x80885a47 in vpanic (fmt=value optimized out, ap=value optimized out) at /usr/src/sys/kern/kern_shutdown.c:746 #10 0x80885ab3 in panic (fmt=value optimized out) at /usr/src/sys/kern/kern_shutdown.c:682 #11 0x8083add0 in make_dev_alias_v (flags=value optimized out, cdev=0xfe0031b78cd0, pdev=value optimized out, fmt=value optimized out, ap=0xff9b9ec84940) at /usr/src/sys/kern/kern_conf.c:925 #12 0x8083ae27 in make_dev_alias_p (flags=-1631041792, cdev=0x80, pdev=0x80e72a0a, fmt=0x80 Address 0x80 out of bounds) at /usr/src/sys/kern/kern_conf.c:968 #13 0x8083af7a in make_dev_physpath_alias (flags=8, cdev=0xfe0031b78cd0, pdev=0xfe042bb8f000, old_alias=0x0, physpath=value optimized out) at /usr/src/sys/kern/kern_conf.c:1025 #14 0x80308b7d in pass_add_physpath (context=0xfe04fe563a00, pending=value optimized out) at /usr/src/sys/cam/scsi/scsi_pass.c:258 #15 0x808c8050 in taskqueue_run_locked (queue=0xfe002fddf800) at /usr/src/sys/kern/subr_taskqueue.c:312 #16 0x808c87ec in taskqueue_thread_loop (arg=value 
optimized out) at /usr/src/sys/kern/subr_taskqueue.c:501
#17 0x80855444 in fork_exit (callout=0x808c8780 taskqueue_thread_loop, arg=0x81502690, frame=0xff9b9ec84ac0) at /usr/src/sys/kern/kern_fork.c:991
#18 0x80bf6a4e in fork_trampoline () at exception.S:602
#19 0x in ?? ()
Current language: auto; currently minimal
(kgdb)

What can I do next to investigate the problem?
Re: My panic in amd64/pmap
On 2013-01-18 09:09, Larry Rosenman wrote:
> On 2013-01-18 08:17, Andriy Gapon wrote:
>> on 17/01/2013 21:50 Larry Rosenman said the following:
>>> I've now seen this panic: pmap_insert_pt_page: pindex already inserted
>>> on 9.1-RELEASE, 9.1-STABLE, and 10.0-CURRENT
>>>
>>> I've got vmcore's from the 9.1-STABLE and 10.0-CURRENT VM's available as well as sources.
>>>
>>> I have the core.txt.* files available at:
>>> http://www.lerctr.org/~ler/core.txt.0 (10.0)
>>> http://www.lerctr.org/~ler/core.txt.2 (9.1-S)
>>>
>>> I'm not sure what other debug info you need. I can provide SSH access to both VM's as well as the host.
>>>
>>> These are all in VirtualBox 4.2.6 VM's
>>>
>>> Any help would be appreciated.
>>
>> Hmm, I wonder if VirtualBox is hitting the same popcnt bug that was fixed in qemu...
>> Could you please try a patch from here
>> http://thread.gmane.org/gmane.comp.emulators.qemu/174532/focus=174567 ?
>> It should be applied to src/recompiler/target-i386/translate.c, make sure that it goes to a section marked as 'case 0x1b8: /* SSE4.2 popcnt */'.
>
> Should this be on the host or the guest?

Never mind, it's in VirtualBox itself. The section is at about line 8020 in the same file.

I've patched it and am recompiling VirtualBox. If I don't see the panic for a few days, I'll submit a PR.

-- 
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 214-642-9640 (c) E-Mail: l...@lerctr.org
US Mail: 430 Valona Loop, Round Rock, TX 78681-3893
kmem_map auto-sizing and size dependencies
The autotuning work is reaching into many places of the kernel, and while trying to tie up all loose ends I got stuck on the kmem_map and how it works or what its limitations are.

During startup the VM is initialized and an initial kernel virtual memory map is set up in kmem_init() covering the entire KVM address range. Only the kernel itself is actually allocated within that map. A bit later on a number of other submaps are allocated (clean_map, buffer_map, pager_map, exec_map). Also in kmeminit() (in kern_malloc.c, different from kmem_init) the kmem_map is allocated.

The (initial?) size of the kmem_map is determined by some voodoo magic, a sprinkle of nmbclusters * PAGE_SIZE incrementor and lots of tunables. However it seems to work out to an effective kmem_map_size of about 58MB on my 16GB AMD64 dev machine:

    vm.kvm_size: 549755809792
    vm.kvm_free: 530233421824
    vm.kmem_size: 16,594,300,928
    vm.kmem_size_min: 0
    vm.kmem_size_max: 329,853,485,875
    vm.kmem_size_scale: 1
    vm.kmem_map_size: 59,518,976
    vm.kmem_map_free: 16,534,777,856

The kmem_map serves kernel malloc (via UMA), contigmalloc and everything else that uses UMA for memory allocation.

Mbuf memory too is managed by UMA, which obtains the backing kernel memory from the kmem_map. The limits of the various mbuf memory types have been considerably raised recently and may make use of 50-75% of all physically present memory, or available KVM space, whichever is smaller.

Now my questions/comments are:

Does the kmem_map automatically extend itself if more memory is requested?

Should it be set to a larger initial value based on min(physical,KVM) space available?

The use of nmbclusters for the initial kmem_map size calculation isn't appropriate anymore due to it being set up later, and nmbclusters isn't the only relevant mbuf type. We make significant use of page sized mbuf clusters too.

The naming and output of the various vm.kmem_* and vm.kvm_* sysctls is confusing and not easy to reconcile. Either we need more of them, detailing more aspects, or fewer. Plus perhaps sysctl subtrees to better describe the hierarchy of the maps.

Why are separate kmem submaps being used? Is it to limit memory usage of certain subsystems? Are those limits actually enforced?

-- 
Andre
Re: kmem_map auto-sizing and size dependencies
I'll follow up with detailed answers to your questions over the weekend. For now, I will, however, point out that you've misinterpreted the tunables. In fact, they say that your kmem map can hold up to 16GB and the current used space is about 58MB. Like other things, the kmem map is auto-sized based on the available physical memory and capped so as not to consume too much of the overall kernel address space.

Regards,
Alan

On Fri, Jan 18, 2013 at 9:29 AM, Andre Oppermann an...@freebsd.org wrote:
> The autotuning work is reaching into many places of the kernel, and while trying to tie up all loose ends I got stuck on the kmem_map and how it works or what its limitations are.
>
> During startup the VM is initialized and an initial kernel virtual memory map is set up in kmem_init() covering the entire KVM address range. Only the kernel itself is actually allocated within that map. A bit later on a number of other submaps are allocated (clean_map, buffer_map, pager_map, exec_map). Also in kmeminit() (in kern_malloc.c, different from kmem_init) the kmem_map is allocated.
>
> The (initial?) size of the kmem_map is determined by some voodoo magic, a sprinkle of nmbclusters * PAGE_SIZE incrementor and lots of tunables. However it seems to work out to an effective kmem_map_size of about 58MB on my 16GB AMD64 dev machine:
>
>     vm.kvm_size: 549755809792
>     vm.kvm_free: 530233421824
>     vm.kmem_size: 16,594,300,928
>     vm.kmem_size_min: 0
>     vm.kmem_size_max: 329,853,485,875
>     vm.kmem_size_scale: 1
>     vm.kmem_map_size: 59,518,976
>     vm.kmem_map_free: 16,534,777,856
>
> The kmem_map serves kernel malloc (via UMA), contigmalloc and everything else that uses UMA for memory allocation.
>
> Mbuf memory too is managed by UMA, which obtains the backing kernel memory from the kmem_map. The limits of the various mbuf memory types have been considerably raised recently and may make use of 50-75% of all physically present memory, or available KVM space, whichever is smaller.
>
> Now my questions/comments are:
>
> Does the kmem_map automatically extend itself if more memory is requested?
>
> Should it be set to a larger initial value based on min(physical,KVM) space available?
>
> The use of nmbclusters for the initial kmem_map size calculation isn't appropriate anymore due to it being set up later, and nmbclusters isn't the only relevant mbuf type. We make significant use of page sized mbuf clusters too.
>
> The naming and output of the various vm.kmem_* and vm.kvm_* sysctls is confusing and not easy to reconcile. Either we need more of them, detailing more aspects, or fewer. Plus perhaps sysctl subtrees to better describe the hierarchy of the maps.
>
> Why are separate kmem submaps being used? Is it to limit memory usage of certain subsystems? Are those limits actually enforced?
>
> -- 
> Andre
Re: kmem_map auto-sizing and size dependencies
On Fri, Jan 18, 2013 at 7:29 AM, Andre Oppermann an...@freebsd.org wrote:
> The (initial?) size of the kmem_map is determined by some voodoo magic, a sprinkle of nmbclusters * PAGE_SIZE incrementor and lots of tunables. However it seems to work out to an effective kmem_map_size of about 58MB on my 16GB AMD64 dev machine:
>
>     vm.kvm_size: 549755809792
>     vm.kvm_free: 530233421824
>     vm.kmem_size: 16,594,300,928
>     vm.kmem_size_min: 0
>     vm.kmem_size_max: 329,853,485,875
>     vm.kmem_size_scale: 1
>     vm.kmem_map_size: 59,518,976
>     vm.kmem_map_free: 16,534,777,856
>
> The kmem_map serves kernel malloc (via UMA), contigmalloc and everything else that uses UMA for memory allocation.
>
> Mbuf memory too is managed by UMA, which obtains the backing kernel memory from the kmem_map. The limits of the various mbuf memory types have been considerably raised recently and may make use of 50-75% of all physically present memory, or available KVM space, whichever is smaller.
>
> Now my questions/comments are:
>
> Does the kmem_map automatically extend itself if more memory is requested?

Not that I recall.

> Should it be set to a larger initial value based on min(physical,KVM) space available?

It needs to be smaller than the physical space, because the only limit on the kernel's use of (pinned) memory is the size of the map. So if it is too large there is nothing to stop the kernel from consuming all available memory. The lowmem handler is called when running out of virtual space only (i.e. a failure to allocate a range in the map).

> The naming and output of the various vm.kmem_* and vm.kvm_* sysctls is confusing and not easy to reconcile. Either we need more of them, detailing more aspects, or fewer. Plus perhaps sysctl subtrees to better describe the hierarchy of the maps.
>
> Why are separate kmem submaps being used? Is it to limit memory usage of certain subsystems? Are those limits actually enforced?

I mostly know about memguard, since I added memguard_fudge(). IIRC some of the submaps are used. The memguard_map specifically is used to know whether an allocation is guarded or not, so at free(9) it can be handled as normal malloc() or as memguard.

Cheers,
matthew
ULE can leak TDQ_LOCK() if statclock() called outside of critical_enter()
I have been experiencing occasional deadlocks on FreeBSD 8.2 systems using the ULE scheduler. The root cause in every case has been that ULE's TDQ_LOCK for CPU 0 is owned by a thread that is not running. I have been investigating, and I believe I have found the cause.

The problem occurs if the interrupt that drives statclock does not call critical_enter() before calling into statclock(). The lapic timer does use critical_enter(), so default configurations would not see this. I have local patches to use the RTC to drive statclock, and from a quick reading of the eventtimer code in -CURRENT the same thing is possible there. The RTC code does not call statclock within a critical section.

So here's the bug:

1) A thread with interrupts enabled, running on CPU 0, with td_owepreempt=1 and td_critnest=1 calls critical_exit():

    void
    critical_exit(void)
    {
        // ...
        if (td->td_critnest == 1) {
            td->td_critnest = 0;
            if (td->td_owepreempt && !kdb_active) {
                // Irrelevant bits snipped

2) td_critnest is set to 0, and then the RTC interrupt fires.

3) rtcintr calls into statclock (8.2) or statclock_cnt (head) with td_critnest still 0 (on head it goes through the eventtimer code, but it ends up in statclock eventually).

4) statclock takes the thread_lock on curthread, then calls sched_clock(). sched_clock calls sched_balance():

    static void
    sched_balance(void)
    {
        // snip...
        tdq = TDQ_SELF();
        TDQ_UNLOCK(tdq);
        sched_balance_group(cpu_top);
        TDQ_LOCK(tdq);
    }

TDQ_UNLOCK does a spinlock_exit, which does a critical_exit. td_critnest will be decremented back to 0 and td_owepreempt is still 1, so this triggers a preemption. Note that this TDQ_UNLOCK is (intentionally) unlocking the thread_lock done by statclock.

5) The thread migrates to any other CPU, call it CPU n. tdq is now stale. TDQ_LOCK takes the lock for CPU 0 (but really it's intending to re-take the thread_lock, although a thread_lock() here would be equally incorrect: sched_balance's caller is going to be mucking around with the TDQ when sched_balance returns).

6) The thread returns to statclock. statclock does a thread_unlock(). The td_lock is TDQ_LOCK(n), which we don't hold. We mangle the state of TDQ_LOCK(n) and leave TDQ_LOCK(0) held.

The simplest solution would be to do a critical_enter() in sched_balance, although that would be superfluous in the normal case where the lapic timer is driving statclock. I'm not sure if there's other code in the eventtimers infrastructure that's assuming that a preemption or migration is impossible while handling an event. A quick look at kern_clocksource.c turns up worrying comments like "Handle all events for specified time on this CPU" and uses of curcpu, so there may well be other issues lurking here. It looks to me that the safest thing to do would be to push the critical_enter() into the eventtimer code or even all the way back to the interrupt handlers (mirroring what the lapic code already does).
-current broken in acpi/iasl
===> usr.sbin/acpi/iasl (all)
cc -O2 -pipe -march=core2 -DACPI_ASL_COMPILER -I. -I/usr/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99 -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c
cc1: warnings being treated as errors
/usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c: In function 'AcpiDmIsResourceTemplate':
/usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:419: warning: dereferencing type-punned pointer will break strict-aliasing rules
*** [dmresrc.o] Error code 1

Stop in /usr/src/usr.sbin/acpi/iasl.
*** [all] Error code 1

-- 
http://ache.vniz.net/
Re: -current broken in acpi/iasl
On Fri, 18 Jan 2013 22:17:27 +0400 Andrey Chernov a...@freebsd.org wrote:
> ===> usr.sbin/acpi/iasl (all)
> cc -O2 -pipe -march=core2 -DACPI_ASL_COMPILER -I. -I/usr/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99 -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c
> cc1: warnings being treated as errors
> /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c: In function 'AcpiDmIsResourceTemplate':
> /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:419: warning: dereferencing type-punned pointer will break strict-aliasing rules
> *** [dmresrc.o] Error code 1
>
> Stop in /usr/src/usr.sbin/acpi/iasl.
> *** [all] Error code 1

According to rumors, buildworld completes successfully with clang, but I didn't test it. Look at the "buildworld is broken ?" thread.

-- 
wbr, tiger
Re: -current broken in acpi/iasl
On Fri, Jan 18, 2013 at 09:34:26PM +0300, Sergey V. Dyatko wrote:
> On Fri, 18 Jan 2013 22:17:27 +0400 Andrey Chernov a...@freebsd.org wrote:
>> ===> usr.sbin/acpi/iasl (all)
>> cc -O2 -pipe -march=core2 -DACPI_ASL_COMPILER -I. -I/usr/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99 -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c
>> cc1: warnings being treated as errors
>> /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c: In function 'AcpiDmIsResourceTemplate':
>> /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:419: warning: dereferencing type-punned pointer will break strict-aliasing rules
>> *** [dmresrc.o] Error code 1
>>
>> Stop in /usr/src/usr.sbin/acpi/iasl.
>> *** [all] Error code 1
>
> according to rumors buildworld done successfully with the clang. But I didn't test it. Look at buildworld is broken ? thread
> ...

My head (10.0-CURRENT) builds (on my laptop build machine) were uneventful; e.g.:

FreeBSD g1-227.catwhisker.org 10.0-CURRENT FreeBSD 10.0-CURRENT #796 r245584M/245600: Fri Jan 18 06:07:23 PST 2013 r...@g1-227.catwhisker.org:/usr/obj/usr/src/sys/CANARY i386

And yes, I use clang.

Peace,
david
-- 
David H. Wolfskill da...@catwhisker.org
Taliban: Evil men with guns afraid of truth from a 14-year old girl.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.
Re: -current broken in acpi/iasl
Sergey V. Dyatko wrote this message on Fri, Jan 18, 2013 at 21:34 +0300:
> On Fri, 18 Jan 2013 22:17:27 +0400 Andrey Chernov a...@freebsd.org wrote:
>> ===> usr.sbin/acpi/iasl (all)
>> cc -O2 -pipe -march=core2 -DACPI_ASL_COMPILER -I. -I/usr/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99 -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c
>> cc1: warnings being treated as errors
>> /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c: In function 'AcpiDmIsResourceTemplate':
>> /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:419: warning: dereferencing type-punned pointer will break strict-aliasing rules
>> *** [dmresrc.o] Error code 1
>>
>> Stop in /usr/src/usr.sbin/acpi/iasl.
>> *** [all] Error code 1
>
> according to rumors buildworld done successfully with the clang. But I didn't test it. Look at buildworld is broken ? thread

Looks like this broke when jkim imported the latest ACPICA code base:
https://svnweb.freebsd.org/base?view=revision&revision=245582

I've forwarded the tinderbox failure to him, so hopefully he'll fix it shortly...

-- 
John-Mark Gurney Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."
Re: -current broken in acpi/iasl
On 18.01.2013 22:37, David Wolfskill wrote:
> On Fri, Jan 18, 2013 at 09:34:26PM +0300, Sergey V. Dyatko wrote:
>> On Fri, 18 Jan 2013 22:17:27 +0400 Andrey Chernov a...@freebsd.org wrote:
>>> ===> usr.sbin/acpi/iasl (all)
>>> cc -O2 -pipe -march=core2 -DACPI_ASL_COMPILER -I. -I/usr/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99 -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c
>>> cc1: warnings being treated as errors
>>> /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c: In function 'AcpiDmIsResourceTemplate':
>>> /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:419: warning: dereferencing type-punned pointer will break strict-aliasing rules
>>> *** [dmresrc.o] Error code 1
>>>
>>> Stop in /usr/src/usr.sbin/acpi/iasl.
>>> *** [all] Error code 1
>>
>> according to rumors buildworld done successfully with the clang. But I didn't test it. Look at buildworld is broken ? thread
>> ...
>
> My head (10.0-CURRENT) builds (on my laptop build machine) were uneventful; e.g.:
>
> FreeBSD g1-227.catwhisker.org 10.0-CURRENT FreeBSD 10.0-CURRENT #796 r245584M/245600: Fri Jan 18 06:07:23 PST 2013 r...@g1-227.catwhisker.org:/usr/obj/usr/src/sys/CANARY i386
>
> And yes, I use clang.
>
> Peace,
> david

I have no clang bloat, it happens with good old gcc.
make buildworld failures with NO_KERBEROS=
Recently make buildworld started failing for me:

===> include/xlocale (installincludes)
sh /usr/src/tools/install.sh -C -o root -g wheel -m 444 _ctype.h _inttypes.h _langinfo.h _locale.h _monetary.h _stdio.h _stdlib.h _string.h _time.h _wchar.h /usr/obj/usr/src/tmp/usr/include/xlocale
===> kerberos5 (includes)
set -e; cd /usr/src/kerberos5; /usr/obj/usr/src/make.amd64/make buildincludes; /usr/obj/usr/src/make.amd64/make installincludes
===> kerberos5/doc (buildincludes)
===> kerberos5/lib (buildincludes)
===> kerberos5/lib/libasn1 (buildincludes)
compile_et /usr/src/kerberos5/lib/libasn1/../../../crypto/heimdal/lib/asn1/asn1_err.et
compile_et: No such file or directory
*** [asn1_err.h] Error code 1

Stop in /usr/src/kerberos5/lib/libasn1.
*** [buildincludes] Error code 1

Stop in /usr/src/kerberos5/lib.
*** [buildincludes] Error code 1

Stop in /usr/src/kerberos5.
*** [includes] Error code 1

Stop in /usr/src/kerberos5.
*** [kerberos5.includes__D] Error code 1

Stop in /usr/src.
*** [_includes] Error code 1

Stop in /usr/src.
*** [buildworld] Error code 1

Stop in /usr/src.

I was still using the recently de-supported NO_KERBEROS=, and changing it to WITHOUT_KERBEROS= got it working again.

I'm still wondering if this is the expected behaviour, though. Shouldn't buildworld create a usable compile_et instead of relying on compile_et's existence in /usr/bin?

Fabian
Re: make buildworld failures with NO_KERBEROS=
On Fri, Jan 18, 2013 at 08:54:03PM +0100, Fabian Keil wrote:
> Recently make buildworld started failing for me:
>
> ===> include/xlocale (installincludes)
> sh /usr/src/tools/install.sh -C -o root -g wheel -m 444 _ctype.h _inttypes.h _langinfo.h _locale.h _monetary.h _stdio.h _stdlib.h _string.h _time.h _wchar.h /usr/obj/usr/src/tmp/usr/include/xlocale
> ===> kerberos5 (includes)
> set -e; cd /usr/src/kerberos5; /usr/obj/usr/src/make.amd64/make buildincludes; /usr/obj/usr/src/make.amd64/make installincludes
> ===> kerberos5/doc (buildincludes)
> ===> kerberos5/lib (buildincludes)
> ===> kerberos5/lib/libasn1 (buildincludes)
> compile_et /usr/src/kerberos5/lib/libasn1/../../../crypto/heimdal/lib/asn1/asn1_err.et
> compile_et: No such file or directory
> *** [asn1_err.h] Error code 1

See the thread started here:
http://lists.freebsd.org/pipermail/freebsd-current/2013-January/039083.html

AFAICT, it is an ordering issue. usr.bin/compile_et needs to be a bootstrap tool, but it depends on some bits from kerberos5, and building kerberos5 needs compile_et.

-- 
Steve
Re: sysctl -a causes kernel trap 12
On Thu, Jan 10, 2013 at 4:25 PM, Xin Li delp...@delphij.net wrote:
> To all: this became more and more hard to replicate lately. I've tried these options and the most important progress is that it's possible to get a crashdump when debug.debugger_on_panic=0 and I managed to get a backtrace which indicates the panic occur when trying to do mtx_lock(&Giant) -> __mtx_lock_sleep -> turnstile_wait -> propagate_priority, but after I've added some instruments to the surrounding code and enabled INVARIANT and/or WITNESS, it mysteriously went away. Reverting my instruments code and update to latest svn makes the issue disappear for one day. I've hit it again today but unfortunately didn't get a successful dump and after reboot I can't reproduce it again :(
>
> Still trying...

Any updates, Xin?

I was actually hitting what I believe to be exactly the same issue as you on one of my systems, and, as you've seen, adding any extra debugging or diagnostics seemed to eliminate the issue.

I was able to generate quite a few vmcores and still have these sitting around in my filesystem (along with the kernels that helped produce them).

I can recreate this crash on my system by compiling the NVIDIA driver with clang at -O1 and above.

Although it's been noted that this issue has been seen in scenarios without an NVIDIA driver in the mix, whatever is happening in the kernel to cause the panic is somehow triggered by this, at least on my system.

-Brandon
Re: sysctl -a causes kernel trap 12
On 01/18/13 12:50, Brandon Gooch wrote:
> On Thu, Jan 10, 2013 at 4:25 PM, Xin Li delp...@delphij.net wrote:
>> To all: this became more and more hard to replicate lately. I've tried these options and the most important progress is that it's possible to get a crashdump when debug.debugger_on_panic=0 and I managed to get a backtrace which indicates the panic occur when trying to do mtx_lock(&Giant) -> __mtx_lock_sleep -> turnstile_wait -> propagate_priority, but after I've added some instruments to the surrounding code and enabled INVARIANT and/or WITNESS, it mysteriously went away. Reverting my instruments code and update to latest svn makes the issue disappear for one day. I've hit it again today but unfortunately didn't get a successful dump and after reboot I can't reproduce it again :(
>>
>> Still trying...
>
> Any updates Xin?

No, it has mysteriously disappeared for now. As far as I can tell from recent svn commits, nobody has committed anything that fixes it, but I can no longer panic my system, with or without debugging code :(

> I was actually hitting what I believe to be exactly the same issue as you on one of my systems, and, as you've seen, adding any extra debugging or diagnostics seemed to eliminate the issue.
>
> I was able to generate quite a few vmcores and still have these sitting around in my filesystem (along with the kernels that helped produce them).
>
> I can recreate this crash on my system by compiling the NVIDIA driver with clang at -O1 and above.
>
> Although it's been noted that this issue has been seen in scenarios without an NVIDIA driver in the mix, whatever is happening in the kernel to cause the panic is somehow triggered by this, at least on my system.

I'm not sure if this is the same problem. Could you please try using gcc to compile the NVIDIA driver and see if that fixes the problem?

Cheers,
-- 
Xin LI delp...@delphij.net https://www.delphij.net/
FreeBSD - The Power to Serve! Live free or die
[head tinderbox] failure on ia64/ia64
TB --- 2013-01-18 20:22:39 - tinderbox 2.10 running on freebsd-current.sentex.ca
TB --- 2013-01-18 20:22:39 - FreeBSD freebsd-current.sentex.ca 8.3-PRERELEASE FreeBSD 8.3-PRERELEASE #0: Mon Mar 26 13:54:12 EDT 2012 d...@freebsd-current.sentex.ca:/usr/obj/usr/src/sys/GENERIC amd64
TB --- 2013-01-18 20:22:39 - starting HEAD tinderbox run for ia64/ia64
TB --- 2013-01-18 20:22:39 - cleaning the object tree
TB --- 2013-01-18 20:23:50 - /usr/local/bin/svn stat /src
TB --- 2013-01-18 20:23:54 - At svn revision 245609
TB --- 2013-01-18 20:23:55 - building world
TB --- 2013-01-18 20:23:55 - CROSS_BUILD_TESTING=YES
TB --- 2013-01-18 20:23:55 - MAKEOBJDIRPREFIX=/obj
TB --- 2013-01-18 20:23:55 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
TB --- 2013-01-18 20:23:55 - SRCCONF=/dev/null
TB --- 2013-01-18 20:23:55 - TARGET=ia64
TB --- 2013-01-18 20:23:55 - TARGET_ARCH=ia64
TB --- 2013-01-18 20:23:55 - TZ=UTC
TB --- 2013-01-18 20:23:55 - __MAKE_CONF=/dev/null
TB --- 2013-01-18 20:23:55 - cd /src
TB --- 2013-01-18 20:23:55 - /usr/bin/make -B buildworld
Building an up-to-date make(1)
World build started on Fri Jan 18 20:23:59 UTC 2013
Rebuilding the temporary build tree
stage 1.1: legacy release compatibility shims
stage 1.2: bootstrap tools
stage 2.1: cleaning up the object tree
stage 2.2: rebuilding the object tree
stage 2.3: build tools
stage 3: cross tools
stage 4.1: building includes
stage 4.2: building libraries
stage 4.3: make dependencies
stage 4.4: building everything
[...]
cc -O2 -pipe -DACPI_ASL_COMPILER -I. -I/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmbuffer.c
cc -O2 -pipe -DACPI_ASL_COMPILER -I. -I/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmdeferred.c
cc -O2 -pipe -DACPI_ASL_COMPILER -I. -I/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmnames.c
cc -O2 -pipe -DACPI_ASL_COMPILER -I. -I/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmopcode.c
cc -O2 -pipe -DACPI_ASL_COMPILER -I. -I/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99 -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c
cc1: warnings being treated as errors
/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c: In function 'AcpiDmIsResourceTemplate':
/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:419: warning: dereferencing type-punned pointer will break strict-aliasing rules
*** [dmresrc.o] Error code 1

Stop in /src/usr.sbin/acpi/iasl.
*** [all] Error code 1

Stop in /src/usr.sbin/acpi.
*** [all] Error code 1

Stop in /src/usr.sbin.
*** [usr.sbin.all__D] Error code 1

Stop in /src.
*** [everything] Error code 1

Stop in /src.
*** Error code 1

Stop in /src.
TB --- 2013-01-18 21:51:26 - WARNING: /usr/bin/make returned exit code 1
TB --- 2013-01-18 21:51:26 - ERROR: failed to build world
TB --- 2013-01-18 21:51:26 - 4028.99 user 952.67 system 5326.66 real

http://tinderbox.freebsd.org/tinderbox-head-ss-build-HEAD-ia64-ia64.full
Re: -current broken in acpi/iasl
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 2013-01-18 13:39:01 -0500, John-Mark Gurney wrote: Sergey V. Dyatko wrote this message on Fri, Jan 18, 2013 at 21:34 +0300: On Fri, 18 Jan 2013 22:17:27 +0400 Andrey Chernov a...@freebsd.org wrote: === usr.sbin/acpi/iasl (all) cc -O2 -pipe -march=core2 -DACPI_ASL_COMPILER -I. -I/usr/src/usr.sbin/acpi/iasl/../../../sys -std=gnu99 -fstack-protector -Wsystem-headers -Werror -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c cc1: warnings being treated as errors /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c: In function 'AcpiDmIsResourceTemplate': /usr/src/usr.sbin/acpi/iasl/../../../sys/contrib/dev/acpica/components/disassembler/dmresrc.c:419: warning: dereferencing type-punned pointer will break strict-aliasing rules *** [dmresrc.o] Error code 1 Stop in /usr/src/usr.sbin/acpi/iasl. *** [all] Error code 1 according to rumors buildworld done successfully with the clang. But I didn't test it. Look at buildworld is broken ? thread Looks like this broken when jkim imported the latest ACPICA code base: https://svnweb.freebsd.org/base?view=revisionrevision=245582 I've forward the tinderbox failure to him, so hopefully he'll fix it shortly... It should be fixed now (r245636). Sorry for the breakage. 
Jung-uk Kim
Re: sysctl -a causes kernel trap 12
On Fri, Jan 18, 2013 at 2:56 PM, Xin Li delp...@delphij.net wrote:
> On 01/18/13 12:50, Brandon Gooch wrote:
>> On Thu, Jan 10, 2013 at 4:25 PM, Xin Li delp...@delphij.net wrote:
>>> To all: this has become harder and harder to replicate lately. I've tried these options, and the most important progress is that it's possible to get a crashdump when debug.debugger_on_panic=0. I managed to get a backtrace which indicates that the panic occurs when trying to do mtx_lock(&Giant) -> __mtx_lock_sleep -> turnstile_wait -> propagate_priority, but after I added some instrumentation to the surrounding code and enabled INVARIANTS and/or WITNESS, it mysteriously went away.
>>>
>>> Reverting my instrumentation and updating to the latest svn made the issue disappear for one day. I hit it again today, but unfortunately didn't get a successful dump, and after a reboot I can't reproduce it again :(
>>>
>>> Still trying...
>>
>> Any updates Xin?
>
> No, it mysteriously disappeared for now. As far as I can tell from recent svn commits, nobody has committed anything that fixes it, but I can no longer panic my system, with or without debugging code :(
>
>> I was actually hitting what I believe to be exactly the same issue as you on one of my systems, and, as you've seen, adding any extra debugging or diagnostics seemed to eliminate the issue. I was able to generate quite a few vmcores and still have these sitting around in my filesystem (along with the kernels that helped produce them).
>>
>> I can recreate this crash on my system by compiling the NVIDIA driver with clang at -O1 and above. Although it's been noted that this issue has been seen in scenarios without an NVIDIA driver in the mix, whatever is happening in the kernel to cause the panic is somehow triggered by this, at least on my system.
>
> I'm not sure if this is the same problem. Could you please try using gcc to compile the NVIDIA driver and see if that fixes the problem?
>
> Cheers,
> --
> Xin LI delp...@delphij.net https://www.delphij.net/
> FreeBSD - The Power to Serve! Live free or die

Indeed, a gcc-compiled NVIDIA module eliminates the issue; sorry if I hadn't mentioned this earlier.

What was happening to me at first was that my system would just hang while booting. I was able to figure out that it was during /etc/rc.d/initrandom. I actually got to a point where I removed the call to sysctl -a from 'better_than_nothing()' in /etc/rc.d/initrandom to have a booting system. I finally had a situation where I could get a panic by adding SW_WATCHDOG to my kernel and running watchdogd(8).

For me, this panic would come and go seemingly at random as well, and I couldn't fumble my way around in the debugger to learn much of anything when I first started seeing it. I just started a process of modularizing everything I could in my kernel config, then loading modules one by one and booting over and over until I finally found what appeared to be the problem, which was the NVIDIA module compiled with clang.

Oh, another thing: at times it seemed as though the number of loaded modules mattered, as I could get the hang with 41 modules loaded, but not with 40 or 42?! I admit that when I was seeing that behavior, I hadn't eliminated the NVIDIA driver from my loaded modules. I need to revisit the panic situation to confirm this particular strangeness.

Here's the last panic I had:

Unread portion of the kernel message buffer:
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 1175 (sysctl)

(kgdb) bt
#0 doadump (textdump=1694704112) at pcpu.h:229
#1 0x802fab82 in db_fncall (dummy1=<value optimized out>, dummy2=<value optimized out>, dummy3=<value optimized out>, dummy4=<value optimized out>) at /usr/src/sys/ddb/db_command.c:578
#2 0x802fa85a in db_command (last_cmdp=<value optimized out>, cmd_table=<value optimized out>, dopager=1) at /usr/src/sys/ddb/db_command.c:449
#3 0x802fa612 in db_command_loop () at /usr/src/sys/ddb/db_command.c:502
#4 0x802fcf60 in db_trap (type=<value optimized out>, code=0) at /usr/src/sys/ddb/db_main.c:231
#5 0x804a7b93 in kdb_trap (type=12, code=0, tf=<value optimized out>) at /usr/src/sys/kern/subr_kdb.c:654
#6 0x807157c5 in trap_fatal (frame=0xff8865032670, eva=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:867
#7 0x80715adb in trap_pfault (frame=0x0, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:698
#8 0x8071529b in trap (frame=0xff8865032670) at /usr/src/sys/amd64/amd64/trap.c:463
#9 0x806ff382 in calltrap () at exception.S:228
#10 0x8047bd50 in sysctl_sysctl_next_ls (lsp=<value optimized out>, name=0xff8865032a80, namelen=<value optimized out>, next=0xff8865032898, len=0xff8865032904,