On 07/06/18 11:14, Johannes Lundberg wrote:
On Fri, Jul 6, 2018 at 9:49 AM Konstantin Belousov <kostik...@gmail.com>
wrote:

On Fri, Jul 06, 2018 at 09:52:24AM +0200, Niclas Zeising wrote:
On 07/06/18 00:02, Warner Losh wrote:


On Thu, Jul 5, 2018 at 1:44 PM, John Baldwin <j...@freebsd.org> wrote:

     On 7/5/18 12:36 PM, Konstantin Belousov wrote:
      > On Thu, Jul 05, 2018 at 09:12:24PM +0200, Hans Petter Selasky wrote:
      >> On 07/05/18 20:59, Hans Petter Selasky wrote:
      >>> On 07/05/18 19:48, Pete Wright wrote:
      >>>>
      >>>>
      >>>> On 07/05/2018 10:10, John Baldwin wrote:
      >>>>> On 7/3/18 5:10 PM, Pete Wright wrote:
      >>>>>>
      >>>>>> On 07/03/2018 15:56, John Baldwin wrote:
      >>>>>>> On 7/3/18 3:34 PM, Pete Wright wrote:
      >>>>>>>> On 07/03/2018 15:29, John Baldwin wrote:
      >>>>>>>>> That seems like kgdb is looking at the wrong CPU.  Can you use
      >>>>>>>>> 'info threads' and look for threads not stopped in 'sched_switch'
      >>>>>>>>> and get their backtraces?  You could also just do 'thread apply
      >>>>>>>>> all bt' and put that file at a URL if that is easiest.
      >>>>>>>>>
      >>>>>>>> sure thing John - here's a gist of "thread apply all bt"
      >>>>>>>>
      >>>>>>>>
      >>>>>>>> https://gist.github.com/gem-pete/d8d7ab220dc8781f0827f965f09d43ed
      >>>>>>>>
      >>>>>>> That doesn't look right at all.  Are you sure the kernel matches the
      >>>>>>> vmcore?  Also, which kgdb version are you using?
      >>>>>>>
      >>>>>> yea i agree that doesn't look right at all.  here is my setup:
      >>>>>>
      >>>>>> $ which kgdb
      >>>>>> /usr/bin/kgdb
      >>>>>> $ kgdb
      >>>>>> GNU gdb 6.1.1 [FreeBSD]
      >>>>>> $ ls -lh /var/crash/vmcore.1
      >>>>>> -rw-------  1 root  wheel   1.6G Jul  3 15:03 /var/crash/vmcore.1
      >>>>>> $ ls -l /usr/lib/debug/boot/kernel/kernel.debug
      >>>>>> -r-xr-xr-x  1 root  wheel  87840496 Jul  3 13:54 /usr/lib/debug/boot/kernel/kernel.debug
      >>>>>>
      >>>>>> and i invoke kgdb like so:
      >>>>>> $ sudo kgdb /usr/lib/debug/boot/kernel/kernel.debug /var/crash/vmcore.1
      >>>>>>
      >>>>>> here's a gist of my full gdb session:
      >>>>>> http://termbin.com/krsn
      >>>>>>
      >>>>>> dunno - maybe i have a bad core dump?  regardless, more than happy to
      >>>>>> help so let me know if i should try anything else or patches etc..
      >>>>> Can you try installing gdb from ports and using /usr/local/bin/kgdb?
      >>>>>
      >>>>
      >>>> that seems to have done the trick, at least the output looks more
      >>>> encouraging.
      >>>>
      >>>>   --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
      >>>> KDB: enter: panic
      >>>>
      >>>> __curthread () at ./machine/pcpu.h:231
      >>>> 231        __asm("movq %%gs:%1,%0" : "=r" (td)
      >>>>
      >>>>
      >>>> here's my full kgdb session:
      >>>> http://termbin.com/qa4f
      >>>>
      >>>> i don't see any threads not in "sched_switch" though :(
      >>>
      >>> Hi,
      >>>
      >>> The problem may be that the patch to enable atomic inlining of all
      >>> macros forgot to set the SMP keyword which means SMP is not defined at
      >>> all for KLD's so all non-kernel atomic usage is with MPLOCKED empty!
      > Problem is that out-of-tree modules build does not have opt*.h files
      > from the kernel.  UP config is a valid one, flipping some option's
      > default value does not solve the problem.

     Yes, but using the lock prefix in a generic module is ok (it will still
     work, just not quite as fast) whereas the lack of lock is fatal on
     SMP.  I would amend Hans' patch slightly to honor the opt_* setting
     for KLD_TIED (but that is only true if KLD_TIED means "built as part of
     a kernel build, so has valid opt_foo.h headers" and not
     'a standalone module where someone put MODULES_TIED=1 on the command
     line to make').
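
To make the proposal above concrete, here is a minimal sketch of the
selection logic being discussed. It is not the actual atomic.h code and
not the committed patch. The SKETCH_* macros and both helper functions
are hypothetical stand-ins, used so the example compiles in userland. It
assumes KLD_TIED means "built as part of a kernel build", which is the
open question raised above.

```c
#include <assert.h>
#include <string.h>

/*
 * Simulated build environment (assumption): a standalone module built
 * without the kernel's opt_*.h headers, so the SMP option is never seen.
 * All SKETCH_* names are hypothetical stand-ins for the real macros.
 */
#define SKETCH_KERNEL     1   /* stands in for _KERNEL */
#define SKETCH_KLD_MODULE 1   /* stands in for KLD_MODULE */
/* #define SKETCH_KLD_TIED 1 */  /* would mean: built with the kernel */
/* #define SKETCH_SMP 1 */       /* would come from the opt_*.h headers */

#if !defined(SKETCH_KERNEL)
/* Userland: always emit the lock prefix. */
#define SKETCH_MPLOCKED "lock ; "
#elif defined(SKETCH_KLD_MODULE) && !defined(SKETCH_KLD_TIED)
/* Generic module: no trustworthy opt_*.h, so pick the always-safe choice. */
#define SKETCH_MPLOCKED "lock ; "
#elif defined(SKETCH_SMP)
/* Kernel or tied module built with the SMP option. */
#define SKETCH_MPLOCKED "lock ; "
#else
/* Kernel or tied module built for UP: the prefix can be omitted. */
#define SKETCH_MPLOCKED ""
#endif

/* Return the instruction prefix this build configuration would use. */
const char *mplocked_prefix(void)
{
    return SKETCH_MPLOCKED;
}

/* Return nonzero if atomics in this build are safe on an SMP machine. */
int module_is_smp_safe(void)
{
    return strcmp(mplocked_prefix(), "lock ; ") == 0;
}

/* Abort if this configuration would produce unlocked atomics. */
void check_module_is_smp_safe(void)
{
    assert(module_is_smp_safe());
}
```

With the defines as written (module, not tied, SMP unknown), the locked
variant is selected. Uncommenting SKETCH_KLD_TIED without SKETCH_SMP
selects the empty prefix, which is only correct if the opt_*.h headers
really were available to that build.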


I agree with this default. It's sensible to default to (a) the most
popular thing and (b) thing that always works, especially when (a) and
(b) are identical.

Don't make me start the "Do we really need an SMP option, why not make
it always on" thread :) The number of relevant uniprocessor x86 boxes
that benefit from omitting SMP is so small as to be irrelevant, IMHO. A
MP kernel runs just fine on them...

Warner

Where are we on this?
It is important to get it fixed, it's already been 4 days, which means 4
days of all modern FreeBSD desktop systems being broken, and possibly
other systems with kernel modules from ports as well.


Another question, how hard would it be to expose how the kernel was
built to modules built from ports, so that they can figure out stuff
like SMP and others, that might affect the module build?
Point the KERNBUILDDIR variable to the directory of the kernel build.
This is the directory where *.o and opt*.h are located.  Then everything
would just work.


Is the solution that we require everyone to build a kernel before they can
build the standalone modules or am I missing something here?


Hi,

Here is a temporary fix:
https://svnweb.freebsd.org/changeset/base/336025

Like Konstantin says this issue needs to be revisited.

--HPS
_______________________________________________
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
