Re: RACK/TCPHPTC: spell error in sources: fatal error missing option TCPHSTS in the build;

2018-06-11 Thread Jonathan T. Looney
On Mon, Jun 11, 2018 at 4:28 AM O. Hartmann  wrote:
>
> In the sources of CURRENT, there is a spell bug:
>
>
> [...]
> src/sys/modules/tcp/rack/../../../netinet/tcp_stacks/rack.c:131:1: error:
> unknown type name 'fatal' fatal error missing option TCPHSTS in the build;
> ^
>
> /pool/sources/CURRENT/src/sys/modules/tcp/rack/../../../netinet/tcp_stacks/rack.c:131:12:
> error: expected ';' after top level declarator fatal error missing option
> TCPHSTS in the build; ^
>
> [...]
>
> I face this nasty error when I try to compile GENERIC.

Thanks for this (and your many other) timely and high-quality problem
reports.

I believe r334949 should address this.

Jonathan
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Panic from ipfw_alloc_rule() after r334769 -> r334832

2018-06-08 Thread Jonathan T. Looney
On Fri, Jun 8, 2018 at 10:52 AM, Jonathan T. Looney  wrote:
>
> On Fri, Jun 8, 2018 at 9:38 AM, David Wolfskill wrote:
> >
> > Sorry for lack of much analysis; am at BSDCan.  jtl@ suggested that a
> > sequence of changes involving memory allocation and ipfw counters is
> > likely to be at issue.
>
> Just to be clear, I speculated that this seemed like it could be caused
> by r334824.
>
> And, screen_1.jpg does indeed seem to point at that commit.

Yes, it is clear that this is hitting r334824.

V_ipfw_cntr_zone is defined with UMA_ZONE_PCPU.

ipfw_alloc_rule() allocates from V_ipfw_cntr_zone with M_ZERO.

That clearly violates the assertion added in r334824, as well as the
assumption behind the commit: "Nothing in the tree uses it..."
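
To make the mismatch concrete, here is a small userland toy model of the
conflict described above. It is only a sketch: the flag values, the struct,
and the function names are hypothetical stand-ins, not the real UMA
definitions, and the actual assertion added in r334824 may be worded
differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; NOT the values from the FreeBSD headers. */
#define TOY_ZONE_PCPU 0x1u /* models a zone created with UMA_ZONE_PCPU */
#define TOY_M_ZERO    0x2u /* models an allocation request with M_ZERO */

struct toy_zone {
	unsigned zflags;    /* creation-time flags of the zone */
	size_t   item_size; /* size of one item */
};

/* True when a request combines a per-CPU zone with zeroing. */
bool
toy_request_conflicts(const struct toy_zone *z, unsigned aflags)
{
	return ((z->zflags & TOY_ZONE_PCPU) != 0 &&
	    (aflags & TOY_M_ZERO) != 0);
}

/* Toy allocator that rejects the combination, as the assertion does. */
void *
toy_zalloc(const struct toy_zone *z, unsigned aflags)
{
	assert(!toy_request_conflicts(z, aflags));
	if ((aflags & TOY_M_ZERO) != 0)
		return (calloc(1, z->item_size));
	return (malloc(z->item_size));
}
```

In this model, a zone created with TOY_ZONE_PCPU and an allocation that
passes TOY_M_ZERO is exactly the combination that trips the check, which
mirrors what ipfw_alloc_rule() does with V_ipfw_cntr_zone.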

It seems like something will need to be changed here to resolve the
mismatch in assumptions/expectations... :-/

If anyone is hitting this bug and needs to get a working system in the
meantime, you'll need to revert the following commits which implemented (or
updated) this change:

r334830
r334829
r334824

Jonathan


Re: Panic from ipfw_alloc_rule() after r334769 -> r334832

2018-06-08 Thread Jonathan T. Looney
On Fri, Jun 8, 2018 at 9:38 AM, David Wolfskill wrote:
>
> Sorry for lack of much analysis; am at BSDCan.  jtl@ suggested that a
> sequence of changes involving memory allocation and ipfw counters is
> likely to be at issue.

Just to be clear, I speculated that this seemed like it could be caused by
r334824.

And, screen_1.jpg does indeed seem to point at that commit.

Jonathan


Re: panic: vm_phys_free_pages: page 0x... has unexpected order 0

2018-05-30 Thread Jonathan T. Looney
On Tue, May 29, 2018 at 10:18 AM, Andriy Gapon  wrote:

>
> [ping]
>
> On 21/05/2018 11:44, Andriy Gapon wrote:
> >
> > FreeBSD 12.0-CURRENT amd64 r332472
> > Does this panic ring a bell to anyone?
> > Has it already been fixed?
> > Thank you!
>

Is there any chance that r333703 fixes this problem for you? It fixes a
race that can manifest itself in other ways. I'm honestly not sure whether
it could also cause this problem.

Jonathan


Re: Occasional crashes, Bug or dying disk?

2016-02-02 Thread Jonathan T. Looney
On 2/2/16, 5:53 PM, "owner-freebsd-curr...@freebsd.org on behalf of
Christian Walther"  wrote:

>Hello list,
>
>since updating the current installation on my trusty Thinkpad T43 I
>keep getting occasional crashes like the following:
>
>#1  0xc0c23a03 in kern_reboot (howto=260)
>at /usr/src/sys/kern/kern_shutdown.c:364
>#2  0xc0c23f3d in vpanic (fmt=, ap=out>)
>at /usr/src/sys/kern/kern_shutdown.c:757
>#3  0xc0c23f5b in panic (fmt=0xc1590fad "ffs_blkfree_cg: freeing free
>frag")
>at /usr/src/sys/kern/kern_shutdown.c:688
>[...]
>I wonder if this might be a bug in FFS or a related subsystem, or my
>hard disk dying. smartctl lists several READ DMA and WRITE DMA related
>errors.

I (and several others) spent time trying to track down "the bug" causing
panics like this on a particular set of hardware. I finally wrote a
program that wrote and read byte patterns from the disk (using O_DIRECT).
I found that what the program read back didn't match what it wrote. In
circumstances like that, you can't expect any filesystem to work reliably.
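
A minimal sketch of that kind of write-then-read-back check follows. This
is not the original program: the function name verify_patterns(), the
4096-byte block size, and the byte pattern are all assumptions made for
illustration. The caller supplies the extra open(2) flags, so O_DIRECT can
be passed where the platform and filesystem support it.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLOCK_SIZE 4096 /* assumed to be a multiple of the sector size */

/* Fill one block with a pattern that depends on the block number. */
static void
fill_pattern(unsigned char *buf, uint32_t block)
{
	for (size_t i = 0; i < BLOCK_SIZE; i++)
		buf[i] = (unsigned char)((block * 31u + i) & 0xff);
}

/*
 * Write nblocks patterned blocks to path, read them back, and compare.
 * Returns the number of mismatched blocks, or -1 on a setup or I/O error.
 */
long
verify_patterns(const char *path, int extra_flags, uint32_t nblocks)
{
	void *wmem = NULL, *rmem = NULL;
	unsigned char *wbuf, *rbuf;
	long bad = -1;
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDWR | O_CREAT | extra_flags, 0600);
	if (fd < 0)
		return (-1);
	/* Aligned buffers, since O_DIRECT typically requires them. */
	if (posix_memalign(&wmem, BLOCK_SIZE, BLOCK_SIZE) != 0 ||
	    posix_memalign(&rmem, BLOCK_SIZE, BLOCK_SIZE) != 0)
		goto out;
	wbuf = wmem;
	rbuf = rmem;

	for (uint32_t b = 0; b < nblocks; b++) {
		fill_pattern(wbuf, b);
		if (pwrite(fd, wbuf, BLOCK_SIZE, (off_t)b * BLOCK_SIZE) !=
		    (ssize_t)BLOCK_SIZE)
			goto out;
	}
	if (fsync(fd) != 0)
		goto out;

	bad = 0;
	for (uint32_t b = 0; b < nblocks; b++) {
		fill_pattern(wbuf, b); /* regenerate the expected data */
		if (pread(fd, rbuf, BLOCK_SIZE, (off_t)b * BLOCK_SIZE) !=
		    (ssize_t)BLOCK_SIZE) {
			bad = -1;
			goto out;
		}
		if (memcmp(wbuf, rbuf, BLOCK_SIZE) != 0)
			bad++;
	}
out:
	free(wmem);
	free(rmem);
	close(fd);
	return (bad);
}
```

A call such as verify_patterns("/some/file", O_DIRECT, 1024) would exercise
the disk rather than the buffer cache; on Linux, O_DIRECT is only visible
if _GNU_SOURCE is defined before the includes. Without O_DIRECT the
read-back may be served from cache and would not say much about the
hardware.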

Given the long pedigree of the UFS/FFS code, I think it is much more
likely to be a hardware problem than a problem with the UFS/FFS code.

Jonathan




Re: panic: sbappendstream 1 [head/amd64 @r293419]

2016-01-08 Thread Jonathan T. Looney
On 1/8/16, 9:05 AM, "David Wolfskill"  wrote:

>After the first panic, I rebuilt the kernel without -DNO_CLEAN; the
>crash dump & other diagnostic info is from the clean build.
>
>January  8, 2016 at 05:57:27 AM PST
>
>FreeBSD freebeast.catwhisker.org 11.0-CURRENT FreeBSD 11.0-CURRENT #1954
>r293419M/293420:1100093: Fri Jan  8 05:09:57 PST 2016
>r...@freebeast.catwhisker.org:/common/S4/obj/usr/src/sys/GENERIC  amd64
>
>panic: sbappendstream 1
>
>...
>Unread portion of the kernel message buffer:
>panic: sbappendstream 1
>cpuid = 7
>KDB: stack backtrace:
>db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
>0xfe085e0595b0
>vpanic() at vpanic+0x182/frame 0xfe085e059630
>kassert_panic() at kassert_panic+0x126/frame 0xfe085e0596a0
>sbappendstream_locked() at sbappendstream_locked+0xa5/frame
>0xfe085e0596d0
>uipc_send() at uipc_send+0x942/frame 0xfe085e059780
>sosend_generic() at sosend_generic+0x42f/frame 0xfe085e059840
>kern_sendit() at kern_sendit+0x21b/frame 0xfe085e0598f0
>sendit() at sendit+0x126/frame 0xfe085e059940
>sys_sendmsg() at sys_sendmsg+0x61/frame 0xfe085e0599a0
>amd64_syscall() at amd64_syscall+0x2db/frame 0xfe085e059ab0
>Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe085e059ab0


The likely suspect here looks like r293405, which changed uipc_send() to
use sbappendstream_locked() instead of sbappend_locked().

However, I can't explain *why* that change is causing this problem without
further investigation.

Can you try reverting the change to see if that solves the problem you are
seeing?

Thanks!

Jonathan


>--- syscall (28, FreeBSD ELF64, sys_sendmsg), rip = 0x801270dfa, rsp =
>0x7fffa098, rbp = 0x7fffa0d0 ---
>KDB: enter: panic
>...
>Loaded symbols for /boot/kernel/autofs.ko
>#0  doadump (textdump=0) at pcpu.h:221
>221 pcpu.h: No such file or directory.
>in pcpu.h
>(kgdb) #0  doadump (textdump=0) at pcpu.h:221
>#1  0x8038205b in db_dump (dummy=,
>dummy2=false, 
>dummy3=0, dummy4=0x0) at /usr/src/sys/ddb/db_command.c:533
>#2  0x80381e4e in db_command (cmd_table=0x0)
>at /usr/src/sys/ddb/db_command.c:440
>#3  0x80381be4 in db_command_loop ()
>at /usr/src/sys/ddb/db_command.c:493
>#4  0x8038467b in db_trap (type=, code=0)
>at /usr/src/sys/ddb/db_main.c:251
>#5  0x80a5cfe3 in kdb_trap (type=3, code=0, tf=out>)
>at /usr/src/sys/kern/subr_kdb.c:654
>#6  0x80e6a2a8 in trap (frame=0xfe085e0594e0)
>at /usr/src/sys/amd64/amd64/trap.c:549
>#7  0x80e4a317 in calltrap ()
>at /usr/src/sys/amd64/amd64/exception.S:234
>#8  0x80a5c6cb in kdb_enter (why=0x8137af3c "panic",
>msg=0x80 ) at cpufunc.h:63
>#9  0x80a1fb8f in vpanic (fmt=,
>ap=) at /usr/src/sys/kern/kern_shutdown.c:750
>#10 0x80a1f9e6 in kassert_panic (fmt=)
>at /usr/src/sys/kern/kern_shutdown.c:647
>#11 0x80aa3375 in sbappendstream_locked (sb=0xf80044212378,
>m=0xf800108c7200, flags=0) at /usr/src/sys/kern/uipc_sockbuf.c:642
>#12 0x80ab1a42 in uipc_send (so=0xf80044212000, flags=0,
>m=, nam=0x0, control=,
>td=0xf8001078e9a0) at /usr/src/sys/kern/uipc_usrreq.c:984
>#13 0x80aa5f5f in sosend_generic (so=0xf80044212000,
>addr=0x0, 
>uio=0xfe085e059890, top=,
>control=, flags=,
>td=0xfe085e059880) at /usr/src/sys/kern/uipc_socket.c:1349
>#14 0x80aac36b in kern_sendit (td=0xf8001078e9a0, s=6,
>mp=, flags=0, control=0x0, segflg=UIO_USERSPACE)
>at /usr/src/sys/kern/uipc_syscalls.c:906
>#15 0x80aac666 in sendit (td=0xf8001078e9a0,
>s=, mp=0xfe085e059958, flags=0)
>at /usr/src/sys/kern/uipc_syscalls.c:833
>#16 0x80aac6f1 in sys_sendmsg (td=0xf8001078e9a0,
>uap=0xfe085e059a40) at /usr/src/sys/kern/uipc_syscalls.c:1035
>#17 0x80e6b13b in amd64_syscall (td=0xf8001078e9a0, traced=0)
>at subr_syscall.c:135
>#18 0x80e4a5fb in Xfast_syscall ()
>at /usr/src/sys/amd64/amd64/exception.S:394
>#19 0x000801270dfa in ?? ()
>Previous frame inner to this frame (corrupt stack?)
>Current language:  auto; currently minimal
>(kgdb) 
>.
>
>As indicated above, this is with a GENERIC kernel.  My laptop (running
>a kernel built with the same sources, but a slightly customized kernel
>config) gets to the point of allowing me to login (via xdm), but when I
>fire off a command that creates xterms & tries to run tmux(1) in them,
>locks up (as far as I can tell), and a power-cycle is needed to recover.
>
>I can poke at the crash dump (given hints), make the dump and core.txt
>file
>available.
>
>Peace,
>david
>-- 
>David H. Wolfskill da...@catwhisker.org
>Those who would murder in the name of God or prophet are blasphemous
>cowards.
>
>See http://www.catwhisker.org/~david/publickey.gpg for my public key.