Re: svn commit: r313268 - head/sys/kern [through -r313271 for atomic_fcmpset use and later: fails on PowerMac G5 "Quad Core"; -r313266 works]

2017-02-25 Thread Mateusz Guzik
On Sat, Feb 25, 2017 at 09:58:39AM -0800, Mark Millard wrote:
> Thus the PowerMac G5 so-called "Quad Core" is back to
> -r313254 without your patches. (The "Quad Core" really has
> two processors, each with 2 cores.)
> 


Thanks a lot for testing. I'll have to think about what to do with it;
worst case I'll #ifdef the changes on powerpc.
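
For reference, a minimal sketch of the kind of fallback discussed in this
thread: fcmpset emulated with a plain compare-and-set plus a re-read on
failure, selected with an #ifdef on powerpc. This is not the contents of
ppc.diff. It assumes C11 <stdatomic.h> in userland rather than the kernel's
own atomic primitives, and every sketch_* name is hypothetical. The property
callers rely on is that a failed fcmpset leaves the current value in *expect.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a cmpset-style primitive: returns nonzero on
 * success and reports nothing about the value it found on failure.
 */
static int
sketch_cmpset(_Atomic uintptr_t *p, uintptr_t expect, uintptr_t newval)
{
	return (atomic_compare_exchange_strong(p, &expect, newval));
}

/* fcmpset emulated on top of cmpset: on failure, re-read *p into *expect. */
static int
sketch_fcmpset_via_cmpset(_Atomic uintptr_t *p, uintptr_t *expect,
    uintptr_t newval)
{
	if (sketch_cmpset(p, *expect, newval))
		return (1);
	*expect = atomic_load(p);
	return (0);
}

/* "Native" variant: the C11 primitive already updates *expect on failure. */
static int
sketch_fcmpset_native(_Atomic uintptr_t *p, uintptr_t *expect,
    uintptr_t newval)
{
	return (atomic_compare_exchange_strong(p, expect, newval));
}

/* Assumption: __powerpc__ is the compiler-predefined macro to key on. */
#if defined(__powerpc__)
#define	sketch_fcmpset	sketch_fcmpset_via_cmpset
#else
#define	sketch_fcmpset	sketch_fcmpset_native
#endif

/* Typical caller: OR a flag into *p and return the previous value. */
static uintptr_t
sketch_set_flag(_Atomic uintptr_t *p, uintptr_t flag)
{
	uintptr_t v = atomic_load(p);

	/* A failed attempt has refreshed v, so the retry uses a current value. */
	while (!sketch_fcmpset(p, &v, v | flag))
		continue;
	return (v);
}
```

A loop like sketch_set_flag() is only correct if every failure path refreshes
v, which is the sort of check the earlier fixes in this thread were about.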

-- 
Mateusz Guzik 
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: svn commit: r313268 - head/sys/kern [through -r313271 for atomic_fcmpset use and later: fails on PowerMac G5 "Quad Core"; -r313266 works]

2017-02-25 Thread Mark Millard
On 2017-Feb-25, at 5:49 AM, Mark Millard  wrote:

> On 2017-Feb-25, at 1:05 AM, Mark Millard  wrote:
> 
>> On 2017-Feb-24, at 11:46 PM, Mark Millard  wrote:
>> 
>>> On 2017-Feb-24, at 8:25 PM, Mark Millard  wrote:
>>> 
>>>> On 2017-Feb-24, at 4:23 PM, Mateusz Guzik  wrote:
>>>>>
>>>>> On Tue, Feb 21, 2017 at 01:37:25AM -0800, Mark Millard wrote:
>>>>>> [Back to the powerpc64 context.]
>>>>>>
>>>>>> On 2017-Feb-20, at 11:10 AM, Mateusz Guzik  wrote:
>>>>>>
>>>>>>> On Sat, Feb 18, 2017 at 04:18:05AM -0800, Mark Millard wrote:
>>>>>>>> [Note: I experiment with clang based powerpc64 builds,
>>>>>>>> reporting problems that I find. Justin is familiar
>>>>>>>> with this, as is Nathan.]
>>>>>>>>
>>>>>>>> I tried to update the PowerMac G5 (a so-called "Quad Core")
>>>>>>>> that I have access to from head -r312761 to -r313864 and
>>>>>>>> ended up with random panics and hang ups in fairly short
>>>>>>>> order after booting.
>>>>>>>>
>>>>>>>> Some approximate bisecting for the kernel led to:
>>>>>>>> (sometimes getting part way into a buildkernel attempt
>>>>>>>> for a different version before a failure happens)
>>>>>>>>
>>>>>>>> -r313266: works (just before use of atomic_fcmpset)
>>>>>>>> vs.
>>>>>>>> -r313271: fails (last of the "use atomic_fcmpset" check-ins)
>>>>>>>>
>>>>>>>> (I did not try -r313268 through -r313270 as the use was
>>>>>>>> gradually added.)
>>>>>>>>
>>>>>>>> So I'm currently running a -r313864 world with a -r313266
>>>>>>>> kernel.
>>>>>>>>
>>>>>>>> No kernel that I tried that was from before -r313266 had the
>>>>>>>> problems.
>>>>>>>>
>>>>>>>> Any kernel that I tried that was from after -r313271 had the
>>>>>>>> problems.
>>>>>>>>
>>>>>>>> Of course I did not try them all in either direction. :)
>>>>>>>>
>>>>>>>
>>>>>>> I found that spin mutexes were not properly handling this, fixed in
>>>>>>> r313996.
>>>>>>>
>>>>>>> Locally I added an if (cpu_tick() % 2) return (0); snippet to amd64
>>>>>>> fcmpset to simulate failures. Everything works, while it would easily
>>>>>>> fail without the patch.
>>>>>>>
>>>>>>> That said, I hope this concludes the 'missing check for not-reread value
>>>>>>> of failed fcmpset' saga.
>>>>>>>
>>>>>>> --
>>>>>>> Mateusz Guzik
>>>>>>
>>>>>> -r313999 is an improvement for powerpc64: it boots and I can
>>>>>> log in on the old PowerMac G5 so-called "Quad Core".
>>>>>>
>>>>>> But, e.g., buildworld buildkernel eventually hangs and later
>>>>>> the powerpc64 panics for "spin lock held too long".
>>>>>>
>>>>>
>>>>> All right, play time is over.
>>>>>
>>>>> Can you please:
>>>>> 1. verify r313254 is stable for you
>>>>> 2. apply https://people.freebsd.org/~mjg/patches/complete-locks.diff and
>>>>> https://people.freebsd.org/~mjg/.junk/ppc.diff on top of it and retry
>>>>> the test?
>>>>>
>>>>> This is a workaround which effectively disables the powerpc-specific
>>>>> primitive and makes it use a cmpset wrapper instead. I don't have the
>>>>> hardware to test right now and my attempts to boot in qemu also failed.
>>>>>
>>>>> That said, it does not look like there are general fcmpset bugs left and
>>>>> the remaining issue seems powerpc-specific.
>>>>>
>>>>> If this works, I'll commit the workaround for the time being as in a few
>>>>> weeks I'd like to start merging the work back to stable/11.
>>>>>
>>>>> --
>>>>> Mateusz Guzik
>>>>
>>>> I've started a self-hosted powerpc64 -r313254 build
>>>> based on running the -r313266 kernel. (The context I
>>>> sometimes do cross builds in is tied up with other
>>>> things. -r313266 is what my prior bisection came up
>>>> with as the last apparently-working kernel at the
>>>> time.)
>>>>
>>>> So it will be a while before I have a -r313254 in
>>>> place to try: the self-hosted build takes longer
>>>> and so will not be installed for a while.
>>>>
>>>> To judge stability I'll probably have -r313254 build
>>>> the patched update that you want me to test, initially
>>>> doing a cleanworld. So that too will take a while.
>>>>
>>>> (The above wording presumes all goes well.)
>>>>
>>>> I'll let you know as I go along if I run into anything
>>>> interesting.
>>>>
>>>>
>>>> My builds are rebuilding both world and kernel since
>>>> what turns into /usr/include/sys/* has changes in your
>>>> patch.
>>>>
>>>> The builds are without MALLOC_PRODUCTION but are
>>>> otherwise not debug builds.
>>>>
>>>>
>>>> I've not seen anything indicating that anyone has
>>>> been trying TARGET_ARCH=powerpc. I've been trying
>>>> TARGET_ARCH=powerpc64 .
>>>>
>>>> While I do not have access to a true
>>>> TARGET_ARCH=powerpc machine currently, such a build
>>>> can be used on a PowerMac G5 so-called "Quad Core".
>>>> So I could eventually build and try such on the one
>>>> powerpc family machine that I currently have access
>>>> to.
>>>>
>>>> clang 3.9.1 has a significant code generation problem
>>>> for TARGET_ARCH=powerpc and so I'd have to use
>>>> a gcc 4.2.1 based build for that sort 

[no subject]

2017-02-25 Thread david becker
Kind regards
David Becker
Vossemer Straße 17
41812 Erkelenz
Tel.: 01520 3916568

Re: Lock order reversal [ newbie ] report [2nd one> more of ]

2017-02-25 Thread Jeffrey Bouquet


On Wed, 22 Feb 2017 21:06:32 -0600, Benjamin Kaduk  wrote:

> Hi Jeffrey,
> 
> Thank you for your enthusiasm in reporting these.
> Unfortunately, it is very likely that these two are "well-known" and
> believed to be harmless, so you have not discovered something
> terribly exciting.
> 
> An old and no-longer particularly maintained listing of these and
> other LORs is at: http://sources.zabbadoz.net/freebsd/lor.html
> 
> -Ben
> 
> On Wed, Feb 22, 2017 at 06:20:08PM -0800, Jeffrey Bouquet wrote:
> > This one at boot:
> > #0 to #10
> > bufwait
> > /usr/src/sys/kern/vfs_bio.c:3500
> > dirhash
> > /usr/src/sys/ufs/ufs/ufs_dirhash.c:201
> > 
> > r313487   12.0-CURRENT Feb 13 2017 
> > 1200020  FWIW 
> > both the above and the below reports...
> > 
> > 
> > 
> > 
> > On Wed, 22 Feb 2017 15:37:21 -0800 (PST), "Jeffrey Bouquet" 
> >  wrote:
> > 
> > > #0 #16 follow:
> > > jotted down :
> > > 
> > > 1. ufs /usr/src/sys/kern/vfs_syscalls.c:3364
> > > 2. bufwait /usr/src/sys/ufs/ffs/ffs_vnops.c:280
> > > 3. ufs /usr/src/sys/kern/vfs_subr.c:2600
> > > 
> > > [ took roxterm out of the xinitrc, system stable seems more than 
> > > yesterday...  too
> > > early to tell, which is/was a 2nd issue...  put in urxvt and st... based 
> > > on TOP memory... ] 


Found a few more in a custom debug kernel (booted using nextboot) that has
9h 54m of uptime as of now, so they are maybe not important... just a toy,
maybe, to examine if anyone has free time and/or has issues at less than 9h
of uptime, to remove a few lock order reversals in, say, amd64 (i386 here),
if they are common to both but would crash amd64 more reliably than the good
uptime I have as of this morning...  r313487  Feb 13 2017 12.0-CURRENT 1200020

Thought to email them only because different 'subsystem' messages occur
during the boot verbose process... like 2016, 2017, 2016, 2015... 

Maybe just newbie stuff. Thanks...  ignore the 'speaker' stuff in it... 

Jeff

kernel log messages:
+subsystem 100
+   vm_mem_init(0)... done.
+   vm_page_init(0)... done.
+subsystem 180
+   sysctl_register_all(0)... done.
+   mallocinit(0)... done.
+   malloc_init(_ACPICA)... done.
+   malloc_init(_KBDMUX)... done.
+   malloc_init(_LED)... done.
+   malloc_init(_MALODEV)... done.
+   malloc_init(_MD)... done.
+   malloc_init(_MDSECT)... done.
+   malloc_init(_MFIBUF)... done.
+   malloc_init(_MPR)... done.
+   malloc_init(_MPRSAS)... done.
+   malloc_init(_MPRUSER)... done.
+   malloc_init(_MPT2)... done.
+   malloc_init(_MPSSAS)... done.
+   malloc_init(_MPSUSER)... done.
+   malloc_init(_ACPITASK)... done.
+   malloc_init(_MPTUSER)... done.
+   malloc_init(_MRSAS)... done.
+   malloc_init(_ACPISEM)... done.
+   malloc_init(_CAMCCBQ)... done.
+   malloc_init(_ACPIDEV)... done.
+   malloc_init(_MVS)... done.
+   malloc_init(_MWLDEV)... done.
+   malloc_init(_NETMAP)... done.
+   malloc_init(_PPBUSDEV)... done.
+   malloc_init(_PSTIOP)... done.
+   malloc_init(_PSTRAID)... done.
+   malloc_init(_PUC)... done.
+   malloc_init(_CAMSIM)... done.
+   malloc_init(_ENTROPY)... done.
+   malloc_init(_CAMXPT)... done.
+   malloc_init(_CAMDEV)... done.
+   malloc_init(_CAMCCB)... done.
+   malloc_init(_CAMPATH)... done.
+   malloc_init(_SIIS)... done.
+   malloc_init(_CAMPERIPH)... done.
+   malloc_init(_SNP)... done.
+   malloc_init(_ACPICMBAT)... done.
+   malloc_init(_AC97)... done.
+   malloc_init(_FEEDER)... done.
+   malloc_init(_MIXER)... done.
+   malloc_init(_MIDI)... done.
+   malloc_init(_TWA)... done.
+   malloc_init(_TWE)... done.
+   malloc_init(_TWS)... done.
+   malloc_init(_ACPIPERF)... done.
+   malloc_init(_ACPIPWR)... done.
+   malloc_init(_CAMSCHED)... done.
+   malloc_init(_CAMQ)... done.
+   malloc_init(_SCSICD)... done.
+   malloc_init(_UART)... done.
+   malloc_init(_AGP)... done.
+   malloc_init(_AHCI)... done.
+   malloc_init(_USB)... done.
+   malloc_init(_USBDEV)... done.
+   malloc_init(_SCSICH)... done.
+   malloc_init(_ATADA)... done.
+   malloc_init(_CAMDEVQ)... done.
+   malloc_init(_SCSIDA)... done.
+   malloc_init(_SCSILOW)... done.
+   malloc_init(_AMR)... done.
+   malloc_init(_ATA)... done.
+   malloc_init(_ATADMA)... done.
+   malloc_init(_ATAPCI)... done.
+   malloc_init(_ATHDEV)... done.
+   

Re: svn commit: r313268 - head/sys/kern [through -r313271 for atomic_fcmpset use and later: fails on PowerMac G5 "Quad Core"; -r313266 works]

2017-02-25 Thread Mark Millard
On 2017-Feb-25, at 1:05 AM, Mark Millard  wrote:

> On 2017-Feb-24, at 11:46 PM, Mark Millard  wrote:
> 
>> On 2017-Feb-24, at 8:25 PM, Mark Millard  wrote:
>> 
>>> On 2017-Feb-24, at 4:23 PM, Mateusz Guzik  wrote:
 
>>>> On Tue, Feb 21, 2017 at 01:37:25AM -0800, Mark Millard wrote:
>>>>> [Back to the powerpc64 context.]
>>>>>
>>>>> On 2017-Feb-20, at 11:10 AM, Mateusz Guzik  wrote:
>>>>>
>>>>>> On Sat, Feb 18, 2017 at 04:18:05AM -0800, Mark Millard wrote:
>>>>>>> [Note: I experiment with clang based powerpc64 builds,
>>>>>>> reporting problems that I find. Justin is familiar
>>>>>>> with this, as is Nathan.]
>>>>>>>
>>>>>>> I tried to update the PowerMac G5 (a so-called "Quad Core")
>>>>>>> that I have access to from head -r312761 to -r313864 and
>>>>>>> ended up with random panics and hang ups in fairly short
>>>>>>> order after booting.
>>>>>>>
>>>>>>> Some approximate bisecting for the kernel led to:
>>>>>>> (sometimes getting part way into a buildkernel attempt
>>>>>>> for a different version before a failure happens)
>>>>>>>
>>>>>>> -r313266: works (just before use of atomic_fcmpset)
>>>>>>> vs.
>>>>>>> -r313271: fails (last of the "use atomic_fcmpset" check-ins)
>>>>>>>
>>>>>>> (I did not try -r313268 through -r313270 as the use was
>>>>>>> gradually added.)
>>>>>>>
>>>>>>> So I'm currently running a -r313864 world with a -r313266
>>>>>>> kernel.
>>>>>>>
>>>>>>> No kernel that I tried that was from before -r313266 had the
>>>>>>> problems.
>>>>>>>
>>>>>>> Any kernel that I tried that was from after -r313271 had the
>>>>>>> problems.
>>>>>>>
>>>>>>> Of course I did not try them all in either direction. :)
>>>>>>>
>>>>>>
>>>>>> I found that spin mutexes were not properly handling this, fixed in
>>>>>> r313996.
>>>>>>
>>>>>> Locally I added an if (cpu_tick() % 2) return (0); snippet to amd64
>>>>>> fcmpset to simulate failures. Everything works, while it would easily
>>>>>> fail without the patch.
>>>>>>
>>>>>> That said, I hope this concludes the 'missing check for not-reread value
>>>>>> of failed fcmpset' saga.
>>>>>>
>>>>>> --
>>>>>> Mateusz Guzik
>>>>>
>>>>> -r313999 is an improvement for powerpc64: it boots and I can
>>>>> log in on the old PowerMac G5 so-called "Quad Core".
>>>>>
>>>>> But, e.g., buildworld buildkernel eventually hangs and later
>>>>> the powerpc64 panics for "spin lock held too long".
>>>>>
>>>>
>>>> All right, play time is over.
>>>>
>>>> Can you please:
>>>> 1. verify r313254 is stable for you
>>>> 2. apply https://people.freebsd.org/~mjg/patches/complete-locks.diff and
>>>> https://people.freebsd.org/~mjg/.junk/ppc.diff on top of it and retry
>>>> the test?
>>>>
>>>> This is a workaround which effectively disables the powerpc-specific
>>>> primitive and makes it use a cmpset wrapper instead. I don't have the
>>>> hardware to test right now and my attempts to boot in qemu also failed.
>>>>
>>>> That said, it does not look like there are general fcmpset bugs left and
>>>> the remaining issue seems powerpc-specific.
>>>>
>>>> If this works, I'll commit the workaround for the time being as in a few
>>>> weeks I'd like to start merging the work back to stable/11.
>>>>
>>>> --
>>>> Mateusz Guzik
>>> 
>>> I've started a self-hosted powerpc64 -r313254 build
>>> based on running the -r313266 kernel. (The context I
>>> sometimes do cross builds in is tied up with other
>>> things. -r313266 is what my prior bisection came up
>>> with as the last apparently-working kernel at the
>>> time.)
>>> 
>>> So it will be a while before I have a -r313254 in
>>> place to try: the self-hosted build takes longer
>>> and so will not be installed for a while.
>>> 
>>> To judge stability I'll probably have -r313254 build
>>> the patched update that you want me to test, initially
>>> doing a cleanworld. So that too will take a while.
>>> 
>>> (The above wording presumes all goes well.)
>>> 
>>> I'll let you know as I go along if I run into anything
>>> interesting.
>>> 
>>> 
>>> My builds are rebuilding both world and kernel since
>>> what turns into /usr/include/sys/* has changes in your
>>> patch.
>>> 
>>> The builds are without MALLOC_PRODUCTION but are
>>> otherwise not debug builds.
>>> 
>>> 
>>> I've not seen anything indicating that anyone has
>>> been trying TARGET_ARCH=powerpc. I've been trying
>>> TARGET_ARCH=powerpc64 .
>>> 
>>> While I do not have access to a true
>>> TARGET_ARCH=powerpc machine currently, such a build
>>> can be used on a PowerMac G5 so-called "Quad Core".
>>> So I could eventually build and try such on the one
>>> powerpc family machine that I currently have access
>>> to.
>>> 
>>> clang 3.9.1 has a significant code generation problem
>>> for TARGET_ARCH=powerpc and so I'd have to use
>>> a gcc 4.2.1 based build for that sort of experiment.
>>> (There is no xtoolchain for 32-bit powerpc.)
>>> 
>>> I use clang 3.9.1 or xtoolchain for
>>> TARGET_ARCH=powerpc64 and have been using clang 3.9.1
>>> in recent times. My primary 

Re: svn commit: r313268 - head/sys/kern [through -r313271 for atomic_fcmpset use and later: fails on PowerMac G5 "Quad Core"; -r313266 works]

2017-02-25 Thread Mark Millard
On 2017-Feb-24, at 11:46 PM, Mark Millard  wrote:

> On 2017-Feb-24, at 8:25 PM, Mark Millard  wrote:
> 
>> On 2017-Feb-24, at 4:23 PM, Mateusz Guzik  wrote:
>>> 
>>> On Tue, Feb 21, 2017 at 01:37:25AM -0800, Mark Millard wrote:
>>>> [Back to the powerpc64 context.]
>>>>
>>>> On 2017-Feb-20, at 11:10 AM, Mateusz Guzik  wrote:
>>>>
>>>>> On Sat, Feb 18, 2017 at 04:18:05AM -0800, Mark Millard wrote:
>>>>>> [Note: I experiment with clang based powerpc64 builds,
>>>>>> reporting problems that I find. Justin is familiar
>>>>>> with this, as is Nathan.]
>>>>>>
>>>>>> I tried to update the PowerMac G5 (a so-called "Quad Core")
>>>>>> that I have access to from head -r312761 to -r313864 and
>>>>>> ended up with random panics and hang ups in fairly short
>>>>>> order after booting.
>>>>>>
>>>>>> Some approximate bisecting for the kernel led to:
>>>>>> (sometimes getting part way into a buildkernel attempt
>>>>>> for a different version before a failure happens)
>>>>>>
>>>>>> -r313266: works (just before use of atomic_fcmpset)
>>>>>> vs.
>>>>>> -r313271: fails (last of the "use atomic_fcmpset" check-ins)
>>>>>>
>>>>>> (I did not try -r313268 through -r313270 as the use was
>>>>>> gradually added.)
>>>>>>
>>>>>> So I'm currently running a -r313864 world with a -r313266
>>>>>> kernel.
>>>>>>
>>>>>> No kernel that I tried that was from before -r313266 had the
>>>>>> problems.
>>>>>>
>>>>>> Any kernel that I tried that was from after -r313271 had the
>>>>>> problems.
>>>>>>
>>>>>> Of course I did not try them all in either direction. :)
>>>>>>
>>>>>
>>>>> I found that spin mutexes were not properly handling this, fixed in
>>>>> r313996.
>>>>>
>>>>> Locally I added an if (cpu_tick() % 2) return (0); snippet to amd64
>>>>> fcmpset to simulate failures. Everything works, while it would easily
>>>>> fail without the patch.
>>>>>
>>>>> That said, I hope this concludes the 'missing check for not-reread value
>>>>> of failed fcmpset' saga.
>>>>>
>>>>> --
>>>>> Mateusz Guzik
>>>>
>>>> -r313999 is an improvement for powerpc64: it boots and I can
>>>> log in on the old PowerMac G5 so-called "Quad Core".
>>>>
>>>> But, e.g., buildworld buildkernel eventually hangs and later
>>>> the powerpc64 panics for "spin lock held too long".
>>>>
>>> 
>>> All right, play time is over.
>>> 
>>> Can you please:
>>> 1. verify r313254 is stable for you
>>> 2. apply https://people.freebsd.org/~mjg/patches/complete-locks.diff and
>>> https://people.freebsd.org/~mjg/.junk/ppc.diff on top of it and retry
>>> the test?
>>> 
>>> This is a workaround which effectively disables the powerpc-specific
>>> primitive and makes it use a cmpset wrapper instead. I don't have the
>>> hardware to test right now and my attempts to boot in qemu also failed.
>>> 
>>> That said, it does not look like there are general fcmpset bugs left and
>>> the remaining issue seems powerpc-specific.
>>> 
>>> If this works, I'll commit the workaround for the time being as in a few
>>> weeks I'd like to start merging the work back to stable/11.
>>> 
>>> -- 
>>> Mateusz Guzik 
>> 
>> I've started a self-hosted powerpc64 -r313254 build
>> based on running the -r313266 kernel. (The context I
>> sometimes do cross builds in is tied up with other
>> things. -r313266 is what my prior bisection came up
>> with as the last apparently-working kernel at the
>> time.)
>> 
>> So it will be a while before I have a -r313254 in
>> place to try: the self-hosted build takes longer
>> and so will not be installed for a while.
>> 
>> To judge stability I'll probably have -r313254 build
>> the patched update that you want me to test, initially
>> doing a cleanworld. So that too will take a while.
>> 
>> (The above wording presumes all goes well.)
>> 
>> I'll let you know as I go along if I run into anything
>> interesting.
>> 
>> 
>> My builds are rebuilding both world and kernel since
>> what turns into /usr/include/sys/* has changes in your
>> patch.
>> 
>> The builds are without MALLOC_PRODUCTION but are
>> otherwise not debug builds.
>> 
>> 
>> I've not seen anything indicating that anyone has
>> been trying TARGET_ARCH=powerpc. I've been trying
>> TARGET_ARCH=powerpc64 .
>> 
>> While I do not have access to a true
>> TARGET_ARCH=powerpc machine currently, such a build
>> can be used on a PowerMac G5 so-called "Quad Core".
>> So I could eventually build and try such on the one
>> powerpc family machine that I currently have access
>> to.
>> 
>> clang 3.9.1 has a significant code generation problem
>> for TARGET_ARCH=powerpc and so I'd have to use
>> a gcc 4.2.1 based build for that sort of experiment.
>> (There is no xtoolchain for 32-bit powerpc.)
>> 
>> I use clang 3.9.1 or xtoolchain for
>> TARGET_ARCH=powerpc64 and have been using clang 3.9.1
>> in recent times. My primary powerpc family use has
>> been to experiment with building based on the
>> modern libc++ and reporting issues discovered in the
>> attempts. This explains the clang/xtoolchain context.
>> 
>> clang 3.9.1 has