Re: svn commit: r313268 - head/sys/kern [through -r313271 for atomic_fcmpset use and later: fails on PowerMac G5 "Quad Core"; -r313266 works]
On Sat, Feb 25, 2017 at 09:58:39AM -0800, Mark Millard wrote:
> Thus the PowerMac G5 so-called "Quad Core" is back to
> -r313254 without your patches. (The "Quad Core" really has
> two processors, each with 2 cores.)
>

Thanks a lot for testing. I'll have to think about what to do with it;
worst case, I'll #ifdef the changes for powerpc.

--
Mateusz Guzik
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
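For readers following the thread: the bug class under discussion is a caller assuming that a failed fcmpset always hands back a changed value. Below is a minimal userland sketch of that idea, assuming C11 <stdatomic.h> rather than the kernel's own atomic primitives. All sketch_* names are hypothetical and do not come from the FreeBSD tree. The every-other-call failure stands in for the "cpu_tick() % 2" injection mentioned in the thread, and building fcmpset on top of cmpset only imitates the general shape of the described workaround, not the contents of ppc.diff.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_UNLOCKED ((uintptr_t)0)

/* Hypothetical fault-injection counters: calls seen, failures injected. */
static atomic_uint sketch_calls;
static atomic_uint sketch_injected;

/* cmpset: reports only success or failure, never the value it observed. */
static int
sketch_cmpset(_Atomic uintptr_t *p, uintptr_t expect, uintptr_t newval)
{
	return (atomic_compare_exchange_strong(p, &expect, newval));
}

/*
 * fcmpset built on top of cmpset. On a genuine failure *expect is
 * refreshed with a re-read of *p. Every second call fails on purpose
 * and leaves *expect untouched, imitating a spurious failure.
 */
static int
sketch_fcmpset(_Atomic uintptr_t *p, uintptr_t *expect, uintptr_t newval)
{
	if (atomic_fetch_add(&sketch_calls, 1) % 2 == 0) {
		atomic_fetch_add(&sketch_injected, 1);
		return (0);
	}
	if (sketch_cmpset(p, *expect, newval))
		return (1);
	*expect = atomic_load(p);
	return (0);
}

/* Spin until the lock word changes from SKETCH_UNLOCKED to tid. */
static void
sketch_lock(_Atomic uintptr_t *lock, uintptr_t tid)
{
	uintptr_t v = SKETCH_UNLOCKED;

	for (;;) {
		if (v == SKETCH_UNLOCKED) {
			if (sketch_fcmpset(lock, &v, tid))
				return;
			/* v may still read SKETCH_UNLOCKED here: just retry. */
			continue;
		}
		/* Held by someone else: re-read and keep spinning. */
		v = atomic_load(lock);
	}
}

/* Release a lock owned by tid, retrying across injected failures. */
static void
sketch_unlock(_Atomic uintptr_t *lock, uintptr_t tid)
{
	uintptr_t v = tid;

	while (!sketch_fcmpset(lock, &v, SKETCH_UNLOCKED))
		assert(v == tid);
}
```

A caller that instead assumed v had changed after a failed attempt (and, say, started waiting for an owner to release the lock) could stall when the failure was spurious. The retry loop in sketch_lock re-checks v rather than relying on that assumption.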
Re: svn commit: r313268 - head/sys/kern [through -r313271 for atomic_fcmpset use and later: fails on PowerMac G5 "Quad Core"; -r313266 works]
On 2017-Feb-25, at 5:49 AM, Mark Millard wrote:

> On 2017-Feb-25, at 1:05 AM, Mark Millard wrote:
>
>> On 2017-Feb-24, at 11:46 PM, Mark Millard wrote:
>>
>>> On 2017-Feb-24, at 8:25 PM, Mark Millard wrote:
>>>
>>>> On 2017-Feb-24, at 4:23 PM, Mateusz Guzik wrote:
>>>>
>>>>> On Tue, Feb 21, 2017 at 01:37:25AM -0800, Mark Millard wrote:
>>>>>> [Back to the powerpc64 context.]
>>>>>>
>>>>>> On 2017-Feb-20, at 11:10 AM, Mateusz Guzik wrote:
>>>>>>
>>>>>>> On Sat, Feb 18, 2017 at 04:18:05AM -0800, Mark Millard wrote:
>>>>>>>> [Note: I experiment with clang based powerpc64 builds,
>>>>>>>> reporting problems that I find. Justin is familiar
>>>>>>>> with this, as is Nathan.]
>>>>>>>>
>>>>>>>> I tried to update the PowerMac G5 (a so-called "Quad Core")
>>>>>>>> that I have access to from head -r312761 to -r313864 and
>>>>>>>> ended up with random panics and hang-ups in fairly short
>>>>>>>> order after booting.
>>>>>>>>
>>>>>>>> Some approximate bisecting for the kernel led to:
>>>>>>>> (sometimes getting part way into a buildkernel attempt
>>>>>>>> for a different version before a failure happens)
>>>>>>>>
>>>>>>>> -r313266: works (just before use of atomic_fcmpset)
>>>>>>>> vs.
>>>>>>>> -r313271: fails (last of the "use atomic_fcmpset" check-ins)
>>>>>>>>
>>>>>>>> (I did not try -r313268 through -r313270 as the use was
>>>>>>>> gradually added.)
>>>>>>>>
>>>>>>>> So I'm currently running a -r313864 world with a -r313266
>>>>>>>> kernel.
>>>>>>>>
>>>>>>>> No kernel that I tried that was from before -r313266 had the
>>>>>>>> problems.
>>>>>>>>
>>>>>>>> Any kernel that I tried that was from after -r313271 had the
>>>>>>>> problems.
>>>>>>>>
>>>>>>>> Of course I did not try them all in either direction. :)
>>>>>>>>
>>>>>>>
>>>>>>> I found that spin mutexes were not properly handling this, fixed in
>>>>>>> r313996.
>>>>>>>
>>>>>>> Locally I added an if (cpu_tick() % 2) return (0); snippet to amd64
>>>>>>> fcmpset to simulate failures. Everything works, while it would easily
>>>>>>> fail without the patch.
>>>>>>>
>>>>>>> That said, I hope this concludes the 'missing check for not-reread value
>>>>>>> of failed fcmpset' saga.
>>>>>>>
>>>>>>> --
>>>>>>> Mateusz Guzik
>>>>>>
>>>>>> -r313999 is an improvement for powerpc64: it boots and I can
>>>>>> log in on the old PowerMac G5 so-called "Quad Core".
>>>>>>
>>>>>> But, e.g., buildworld buildkernel eventually hangs and later
>>>>>> the powerpc64 panics for "spin lock held too long".
>>>>>>
>>>>>
>>>>> All right, play time is over.
>>>>>
>>>>> Can you please:
>>>>> 1. verify r313254 is stable for you
>>>>> 2. apply https://people.freebsd.org/~mjg/patches/complete-locks.diff and
>>>>> https://people.freebsd.org/~mjg/.junk/ppc.diff on top of it and retry
>>>>> the test?
>>>>>
>>>>> This is a workaround which effectively disables the powerpc-specific
>>>>> primitive and makes it use a cmpset wrapper instead. I don't have the
>>>>> hardware to test right now and my attempts to boot in qemu also failed.
>>>>>
>>>>> That said, it does not look like there are general fcmpset bugs left and
>>>>> the remaining issue seems powerpc-specific.
>>>>>
>>>>> If this works, I'll commit the workaround for the time being as in a few
>>>>> weeks I'd like to start merging the work back to stable/11.
>>>>>
>>>>> --
>>>>> Mateusz Guzik
>>>>
>>>> I've started a self-hosted powerpc64 -r313254 build
>>>> based on running the -r313266 kernel. (The context
>>>> I sometimes do cross builds in is tied up with other
>>>> things. -r313266 is what my prior bisection came up
>>>> with as the last apparently-working kernel at the
>>>> time.)
>>>>
>>>> So it will be a while before I have a -r313254 in
>>>> place to try: the self-hosted build takes longer
>>>> and so will not be installed for a while.
>>>>
>>>> To judge stability I'll probably have -r313254 build
>>>> the patched update that you want me to test, initially
>>>> doing a cleanworld. So that too will take a while.
>>>>
>>>> (The above wording presumes all goes well.)
>>>>
>>>> I'll let you know as I go along if I run into anything
>>>> interesting.
>>>>
>>>> My builds are rebuilding both world and kernel since
>>>> what turns into /usr/include/sys/* has changes in your
>>>> patch.
>>>>
>>>> The builds are without MALLOC_PRODUCTION but are
>>>> otherwise not debug builds.
>>>>
>>>> I've not seen anything indicating that anyone has
>>>> been trying TARGET_ARCH=powerpc. I've been trying
>>>> TARGET_ARCH=powerpc64.
>>>>
>>>> While I do not have access to a true
>>>> TARGET_ARCH=powerpc machine currently, such a build
>>>> can be used on a PowerMac G5 so-called "Quad Core".
>>>> So I could eventually build and try such on the one
>>>> powerpc family machine that I currently have access
>>>> to.
>>>>
>>>> clang 3.9.1 has a significant code generation problem
>>>> for TARGET_ARCH=powerpc and so I'd have to use
>>>> a gcc 4.2.1 based build for that sort
[no subject]
Kind regards

David Becker
Vossemer Straße 17
41812 Erkelenz
Tel.: 01520 3916568
Re: Lock order reversal [ newbie ] report [2nd one> more of ]
On Wed, 22 Feb 2017 21:06:32 -0600, Benjamin Kaduk wrote:

> Hi Jeffrey,
>
> Thank you for your enthusiasm in reporting these.
> Unfortunately, it is very likely that these two are "well-known" and
> believed to be harmless, so you have not discovered something
> terribly exciting.
>
> An old and no-longer particularly maintained listing of these and
> other LORs is at: http://sources.zabbadoz.net/freebsd/lor.html
>
> -Ben
>
> On Wed, Feb 22, 2017 at 06:20:08PM -0800, Jeffrey Bouquet wrote:
> > This one at boot:
> > #0 to #10
> > bufwait
> > /usr/src/sys/kern/vfs_bio.c:3500
> > dirhash
> > /usr/src/sys/ufs/ufs/ufs_dirhash.c:201
> >
> > r313487 12.0-CURRENT Feb 13 2017
> > 1200020 FWIW
> > both the above and the below reports...
> >
> > On Wed, 22 Feb 2017 15:37:21 -0800 (PST), "Jeffrey Bouquet"
> > wrote:
> >
> > > #0 #16 follow:
> > > jotted down :
> > >
> > > 1. ufs /usr/src/sys/kern/vfs_syscalls.c:3364
> > > 2. bufwait /usr/src/sys/ufs/ffs/ffs_vnops.c:280
> > > 3. ufs /usr/src/sys/kern/vfs_subr.c:2600
> > >
> > > [ took roxterm out of the xinitrc, system seems more stable than
> > > yesterday... too
> > > early to tell, which is/was a 2nd issue... put in urxvt and st... based
> > > on TOP memory... ]

Found a few more in a custom debug kernel, booted using nextboot, that
has 9h 54m uptime as of now, so they are maybe not important... Maybe
just a toy to examine if anyone has free time and/or has issues with
less than 9h uptime, to remove a few lock order reversals in, say,
amd64 (i386 here), if they are common to both but would crash amd64
more reliably than the good uptime I have as of this morning...

r313487 Feb 13 2017 12.0-CURRENT 1200020

Thought to email them only because different 'subsystem' messages
occur during the verbose boot process... like 2016, 2017, 2016,
2015... Maybe just newbie stuff.

Thanks... ignore the 'speaker' stuff in it...

Jeff

kernel log messages:
+subsystem 100
+ vm_mem_init(0)... done.
+ vm_page_init(0)... done.
+subsystem 180
+ sysctl_register_all(0)... done.
+ mallocinit(0)... done.
+ malloc_init(_ACPICA)... done.
+ malloc_init(_KBDMUX)... done.
+ malloc_init(_LED)... done.
+ malloc_init(_MALODEV)... done.
+ malloc_init(_MD)... done.
+ malloc_init(_MDSECT)... done.
+ malloc_init(_MFIBUF)... done.
+ malloc_init(_MPR)... done.
+ malloc_init(_MPRSAS)... done.
+ malloc_init(_MPRUSER)... done.
+ malloc_init(_MPT2)... done.
+ malloc_init(_MPSSAS)... done.
+ malloc_init(_MPSUSER)... done.
+ malloc_init(_ACPITASK)... done.
+ malloc_init(_MPTUSER)... done.
+ malloc_init(_MRSAS)... done.
+ malloc_init(_ACPISEM)... done.
+ malloc_init(_CAMCCBQ)... done.
+ malloc_init(_ACPIDEV)... done.
+ malloc_init(_MVS)... done.
+ malloc_init(_MWLDEV)... done.
+ malloc_init(_NETMAP)... done.
+ malloc_init(_PPBUSDEV)... done.
+ malloc_init(_PSTIOP)... done.
+ malloc_init(_PSTRAID)... done.
+ malloc_init(_PUC)... done.
+ malloc_init(_CAMSIM)... done.
+ malloc_init(_ENTROPY)... done.
+ malloc_init(_CAMXPT)... done.
+ malloc_init(_CAMDEV)... done.
+ malloc_init(_CAMCCB)... done.
+ malloc_init(_CAMPATH)... done.
+ malloc_init(_SIIS)... done.
+ malloc_init(_CAMPERIPH)... done.
+ malloc_init(_SNP)... done.
+ malloc_init(_ACPICMBAT)... done.
+ malloc_init(_AC97)... done.
+ malloc_init(_FEEDER)... done.
+ malloc_init(_MIXER)... done.
+ malloc_init(_MIDI)... done.
+ malloc_init(_TWA)... done.
+ malloc_init(_TWE)... done.
+ malloc_init(_TWS)... done.
+ malloc_init(_ACPIPERF)... done.
+ malloc_init(_ACPIPWR)... done.
+ malloc_init(_CAMSCHED)... done.
+ malloc_init(_CAMQ)... done.
+ malloc_init(_SCSICD)... done.
+ malloc_init(_UART)... done.
+ malloc_init(_AGP)... done.
+ malloc_init(_AHCI)... done.
+ malloc_init(_USB)... done.
+ malloc_init(_USBDEV)... done.
+ malloc_init(_SCSICH)... done.
+ malloc_init(_ATADA)... done.
+ malloc_init(_CAMDEVQ)... done.
+ malloc_init(_SCSIDA)... done.
+ malloc_init(_SCSILOW)... done.
+ malloc_init(_AMR)... done.
+ malloc_init(_ATA)... done.
+ malloc_init(_ATADMA)... done.
+ malloc_init(_ATAPCI)... done.
+ malloc_init(_ATHDEV)... done.
+
Re: svn commit: r313268 - head/sys/kern [through -r313271 for atomic_fcmpset use and later: fails on PowerMac G5 "Quad Core"; -r313266 works]
On 2017-Feb-25, at 1:05 AM, Mark Millard wrote:

> On 2017-Feb-24, at 11:46 PM, Mark Millard wrote:
>
>> On 2017-Feb-24, at 8:25 PM, Mark Millard wrote:
>>
>>> On 2017-Feb-24, at 4:23 PM, Mateusz Guzik wrote:
>>>
>>>> On Tue, Feb 21, 2017 at 01:37:25AM -0800, Mark Millard wrote:
>>>>> [Back to the powerpc64 context.]
>>>>>
>>>>> On 2017-Feb-20, at 11:10 AM, Mateusz Guzik wrote:
>>>>>
>>>>>> On Sat, Feb 18, 2017 at 04:18:05AM -0800, Mark Millard wrote:
>>>>>>> [Note: I experiment with clang based powerpc64 builds,
>>>>>>> reporting problems that I find. Justin is familiar
>>>>>>> with this, as is Nathan.]
>>>>>>>
>>>>>>> I tried to update the PowerMac G5 (a so-called "Quad Core")
>>>>>>> that I have access to from head -r312761 to -r313864 and
>>>>>>> ended up with random panics and hang-ups in fairly short
>>>>>>> order after booting.
>>>>>>>
>>>>>>> Some approximate bisecting for the kernel led to:
>>>>>>> (sometimes getting part way into a buildkernel attempt
>>>>>>> for a different version before a failure happens)
>>>>>>>
>>>>>>> -r313266: works (just before use of atomic_fcmpset)
>>>>>>> vs.
>>>>>>> -r313271: fails (last of the "use atomic_fcmpset" check-ins)
>>>>>>>
>>>>>>> (I did not try -r313268 through -r313270 as the use was
>>>>>>> gradually added.)
>>>>>>>
>>>>>>> So I'm currently running a -r313864 world with a -r313266
>>>>>>> kernel.
>>>>>>>
>>>>>>> No kernel that I tried that was from before -r313266 had the
>>>>>>> problems.
>>>>>>>
>>>>>>> Any kernel that I tried that was from after -r313271 had the
>>>>>>> problems.
>>>>>>>
>>>>>>> Of course I did not try them all in either direction. :)
>>>>>>>
>>>>>>
>>>>>> I found that spin mutexes were not properly handling this, fixed in
>>>>>> r313996.
>>>>>>
>>>>>> Locally I added an if (cpu_tick() % 2) return (0); snippet to amd64
>>>>>> fcmpset to simulate failures. Everything works, while it would easily
>>>>>> fail without the patch.
>>>>>>
>>>>>> That said, I hope this concludes the 'missing check for not-reread value
>>>>>> of failed fcmpset' saga.
>>>>>>
>>>>>> --
>>>>>> Mateusz Guzik
>>>>>
>>>>> -r313999 is an improvement for powerpc64: it boots and I can
>>>>> log in on the old PowerMac G5 so-called "Quad Core".
>>>>>
>>>>> But, e.g., buildworld buildkernel eventually hangs and later
>>>>> the powerpc64 panics for "spin lock held too long".
>>>>>
>>>>
>>>> All right, play time is over.
>>>>
>>>> Can you please:
>>>> 1. verify r313254 is stable for you
>>>> 2. apply https://people.freebsd.org/~mjg/patches/complete-locks.diff and
>>>> https://people.freebsd.org/~mjg/.junk/ppc.diff on top of it and retry
>>>> the test?
>>>>
>>>> This is a workaround which effectively disables the powerpc-specific
>>>> primitive and makes it use a cmpset wrapper instead. I don't have the
>>>> hardware to test right now and my attempts to boot in qemu also failed.
>>>>
>>>> That said, it does not look like there are general fcmpset bugs left and
>>>> the remaining issue seems powerpc-specific.
>>>>
>>>> If this works, I'll commit the workaround for the time being as in a few
>>>> weeks I'd like to start merging the work back to stable/11.
>>>>
>>>> --
>>>> Mateusz Guzik
>>>
>>> I've started a self-hosted powerpc64 -r313254 build
>>> based on running the -r313266 kernel. (The context
>>> I sometimes do cross builds in is tied up with other
>>> things. -r313266 is what my prior bisection came up
>>> with as the last apparently-working kernel at the
>>> time.)
>>>
>>> So it will be a while before I have a -r313254 in
>>> place to try: the self-hosted build takes longer
>>> and so will not be installed for a while.
>>>
>>> To judge stability I'll probably have -r313254 build
>>> the patched update that you want me to test, initially
>>> doing a cleanworld. So that too will take a while.
>>>
>>> (The above wording presumes all goes well.)
>>>
>>> I'll let you know as I go along if I run into anything
>>> interesting.
>>>
>>> My builds are rebuilding both world and kernel since
>>> what turns into /usr/include/sys/* has changes in your
>>> patch.
>>>
>>> The builds are without MALLOC_PRODUCTION but are
>>> otherwise not debug builds.
>>>
>>> I've not seen anything indicating that anyone has
>>> been trying TARGET_ARCH=powerpc. I've been trying
>>> TARGET_ARCH=powerpc64.
>>>
>>> While I do not have access to a true
>>> TARGET_ARCH=powerpc machine currently, such a build
>>> can be used on a PowerMac G5 so-called "Quad Core".
>>> So I could eventually build and try such on the one
>>> powerpc family machine that I currently have access
>>> to.
>>>
>>> clang 3.9.1 has a significant code generation problem
>>> for TARGET_ARCH=powerpc and so I'd have to use
>>> a gcc 4.2.1 based build for that sort of experiment.
>>> (There is no xtoolchain for 32-bit powerpc.)
>>>
>>> I use clang 3.9.1 or xtoolchain for
>>> TARGET_ARCH=powerpc64 and have been using clang 3.9.1
>>> in recent times. My primary
Re: svn commit: r313268 - head/sys/kern [through -r313271 for atomic_fcmpset use and later: fails on PowerMac G5 "Quad Core"; -r313266 works]
On 2017-Feb-24, at 11:46 PM, Mark Millard wrote:

> On 2017-Feb-24, at 8:25 PM, Mark Millard wrote:
>
>> On 2017-Feb-24, at 4:23 PM, Mateusz Guzik wrote:
>>
>>> On Tue, Feb 21, 2017 at 01:37:25AM -0800, Mark Millard wrote:
>>>> [Back to the powerpc64 context.]
>>>>
>>>> On 2017-Feb-20, at 11:10 AM, Mateusz Guzik wrote:
>>>>
>>>>> On Sat, Feb 18, 2017 at 04:18:05AM -0800, Mark Millard wrote:
>>>>>> [Note: I experiment with clang based powerpc64 builds,
>>>>>> reporting problems that I find. Justin is familiar
>>>>>> with this, as is Nathan.]
>>>>>>
>>>>>> I tried to update the PowerMac G5 (a so-called "Quad Core")
>>>>>> that I have access to from head -r312761 to -r313864 and
>>>>>> ended up with random panics and hang-ups in fairly short
>>>>>> order after booting.
>>>>>>
>>>>>> Some approximate bisecting for the kernel led to:
>>>>>> (sometimes getting part way into a buildkernel attempt
>>>>>> for a different version before a failure happens)
>>>>>>
>>>>>> -r313266: works (just before use of atomic_fcmpset)
>>>>>> vs.
>>>>>> -r313271: fails (last of the "use atomic_fcmpset" check-ins)
>>>>>>
>>>>>> (I did not try -r313268 through -r313270 as the use was
>>>>>> gradually added.)
>>>>>>
>>>>>> So I'm currently running a -r313864 world with a -r313266
>>>>>> kernel.
>>>>>>
>>>>>> No kernel that I tried that was from before -r313266 had the
>>>>>> problems.
>>>>>>
>>>>>> Any kernel that I tried that was from after -r313271 had the
>>>>>> problems.
>>>>>>
>>>>>> Of course I did not try them all in either direction. :)
>>>>>>
>>>>>
>>>>> I found that spin mutexes were not properly handling this, fixed in
>>>>> r313996.
>>>>>
>>>>> Locally I added an if (cpu_tick() % 2) return (0); snippet to amd64
>>>>> fcmpset to simulate failures. Everything works, while it would easily
>>>>> fail without the patch.
>>>>>
>>>>> That said, I hope this concludes the 'missing check for not-reread value
>>>>> of failed fcmpset' saga.
>>>>>
>>>>> --
>>>>> Mateusz Guzik
>>>>
>>>> -r313999 is an improvement for powerpc64: it boots and I can
>>>> log in on the old PowerMac G5 so-called "Quad Core".
>>>>
>>>> But, e.g., buildworld buildkernel eventually hangs and later
>>>> the powerpc64 panics for "spin lock held too long".
>>>>
>>>
>>> All right, play time is over.
>>>
>>> Can you please:
>>> 1. verify r313254 is stable for you
>>> 2. apply https://people.freebsd.org/~mjg/patches/complete-locks.diff and
>>> https://people.freebsd.org/~mjg/.junk/ppc.diff on top of it and retry
>>> the test?
>>>
>>> This is a workaround which effectively disables the powerpc-specific
>>> primitive and makes it use a cmpset wrapper instead. I don't have the
>>> hardware to test right now and my attempts to boot in qemu also failed.
>>>
>>> That said, it does not look like there are general fcmpset bugs left and
>>> the remaining issue seems powerpc-specific.
>>>
>>> If this works, I'll commit the workaround for the time being as in a few
>>> weeks I'd like to start merging the work back to stable/11.
>>>
>>> --
>>> Mateusz Guzik
>>
>> I've started a self-hosted powerpc64 -r313254 build
>> based on running the -r313266 kernel. (The context
>> I sometimes do cross builds in is tied up with other
>> things. -r313266 is what my prior bisection came up
>> with as the last apparently-working kernel at the
>> time.)
>>
>> So it will be a while before I have a -r313254 in
>> place to try: the self-hosted build takes longer
>> and so will not be installed for a while.
>>
>> To judge stability I'll probably have -r313254 build
>> the patched update that you want me to test, initially
>> doing a cleanworld. So that too will take a while.
>>
>> (The above wording presumes all goes well.)
>>
>> I'll let you know as I go along if I run into anything
>> interesting.
>>
>> My builds are rebuilding both world and kernel since
>> what turns into /usr/include/sys/* has changes in your
>> patch.
>>
>> The builds are without MALLOC_PRODUCTION but are
>> otherwise not debug builds.
>>
>> I've not seen anything indicating that anyone has
>> been trying TARGET_ARCH=powerpc. I've been trying
>> TARGET_ARCH=powerpc64.
>>
>> While I do not have access to a true
>> TARGET_ARCH=powerpc machine currently, such a build
>> can be used on a PowerMac G5 so-called "Quad Core".
>> So I could eventually build and try such on the one
>> powerpc family machine that I currently have access
>> to.
>>
>> clang 3.9.1 has a significant code generation problem
>> for TARGET_ARCH=powerpc and so I'd have to use
>> a gcc 4.2.1 based build for that sort of experiment.
>> (There is no xtoolchain for 32-bit powerpc.)
>>
>> I use clang 3.9.1 or xtoolchain for
>> TARGET_ARCH=powerpc64 and have been using clang 3.9.1
>> in recent times. My primary powerpc family use has
>> been to experiment with building based on the
>> modern libc++ and reporting issues discovered in the
>> attempts. This explains the clang/xtoolchain context.
>>
>> clang 3.9.1 has