Re: Instant panic while trying run ports-mgmt/poudriere

2015-08-22 Thread Andriy Gapon
On 12/08/2015 17:11, Lawrence Stewart wrote:
> On 08/07/15 07:33, Pawel Pekala wrote:
>> Hi K.,
>>
>> On 2015-08-06 12:33 -0700, "K. Macy"  wrote:
>>> Is this still happening?
>>
>> Still crashes:
> 
> +1 for me running r286617

Here is another +1 with r286922.
I can add a couple of bits of debugging data:

(kgdb) fr 8
#8  0x80639d60 in knote (list=0xf8019a733ea0, hint=2147483648, lockflags=)
    at /usr/src/sys/kern/kern_event.c:1964
1964            } else if ((lockflags & KNF_NOKQLOCK) != 0) {
(kgdb) p *list
$2 = {kl_list = {slh_first = 0x0}, kl_lock = 0x8063a1e0,
  kl_unlock = 0x8063a200, kl_assert_locked = 0x8063a220,
  kl_assert_unlocked = 0x8063a240, kl_lockarg = 0xf8019a733bb0}
(kgdb) disassemble
Dump of assembler code for function knote:
0x80639d00:  push   %rbp
0x80639d01:  mov    %rsp,%rbp
0x80639d04:  push   %r15
0x80639d06:  push   %r14
0x80639d08:  push   %r13
0x80639d0a:  push   %r12
0x80639d0c:  push   %rbx
0x80639d0d:  sub    $0x18,%rsp
0x80639d11:  mov    %edx,%r12d
0x80639d14:  mov    %rsi,-0x30(%rbp)
0x80639d18:  mov    %rdi,%rbx
0x80639d1b:  test   %rbx,%rbx
0x80639d1e:  je     0x80639ef6
0x80639d24:  mov    %r12d,%eax
0x80639d27:  and    $0x1,%eax
0x80639d2a:  mov    %eax,-0x3c(%rbp)
0x80639d2d:  mov    0x28(%rbx),%rdi
0x80639d31:  je     0x80639d38
0x80639d33:  callq  *0x18(%rbx)
0x80639d36:  jmp    0x80639d42
0x80639d38:  callq  *0x20(%rbx)
0x80639d3b:  mov    0x28(%rbx),%rdi
0x80639d3f:  callq  *0x8(%rbx)
0x80639d42:  mov    %rbx,-0x38(%rbp)
0x80639d46:  mov    (%rbx),%rbx
0x80639d49:  test   %rbx,%rbx
0x80639d4c:  je     0x80639ee5
0x80639d52:  and    $0x2,%r12d
0x80639d56:  nopw   %cs:0x0(%rax,%rax,1)
0x80639d60:  mov    0x28(%rbx),%r14

Panic is in the last quoted instruction.
And:
(kgdb) i reg
rax            0x246               582
rbx            0xdeadc0dedeadc0de  -2401050962867404578
rcx            0x0                 0
rdx            0x12e               302
rsi            0x80a26a5a          -2136839590
rdi            0x80e81b80          -2132272256
rbp            0xfe02b7efea20      0xfe02b7efea20
rsp            0xfe02b7efe9e0      0xfe02b7efe9e0
r8             0x80a269ce          -2136839730
r9             0x80e82838          -2132269000
r10            0x10000             65536
r11            0x80fabd10          -2131051248
r12            0x0                 0
r13            0xf801ff84a818      -8787511171048
r14            0xf801ff84a800      -8787511171072
r15            0xf8019a6974f0      -8789207452432
rip            0x80639d60          0x80639d60
eflags         0x10286             66182

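I think that $rbx stands out here.  This is a kernel with INVARIANTS, and
0xdeadc0de is the pattern it writes over freed memory, so the loop apparently
picked up its next pointer from a knote that had already been freed.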

Looking at the code, is it possible that one of the calls from within
the loop's body modifies the list?  If that is so and provided that is a
valid behavior, then maybe using SLIST_FOREACH_SAFE would help.
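
To illustrate what I mean, here is a small userland sketch, not the actual
kern_event.c code.  struct demo_note, demo_consume() and the other demo_*
names are made up for the example; only the <sys/queue.h> macros are real.
The callback frees the element it is called on, which is the situation I
suspect in knote(); the _SAFE variant fetches the next pointer before the
body runs, so the walk never reads a freed element.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/*
 * Some libcs ship a <sys/queue.h> without the _SAFE variant; this fallback
 * mirrors the FreeBSD definition.
 */
#ifndef SLIST_FOREACH_SAFE
#define SLIST_FOREACH_SAFE(var, head, field, tvar)              \
    for ((var) = SLIST_FIRST((head));                           \
        (var) && ((tvar) = SLIST_NEXT((var), field), 1);        \
        (var) = (tvar))
#endif

/* Hypothetical stand-in for struct knote; only the list linkage matters. */
struct demo_note {
    int id;
    SLIST_ENTRY(demo_note) link;
};
SLIST_HEAD(demo_list, demo_note);

/*
 * Hypothetical callback: unlinks and frees the element it was called on
 * when its id is even.  Returns 1 if the element was freed.
 */
static int
demo_consume(struct demo_list *list, struct demo_note *n)
{
    if (n->id % 2 != 0)
        return (0);
    SLIST_REMOVE(list, n, demo_note, link);
    free(n);
    return (1);
}

/* Walk the list while the callback may free the current element. */
static int
demo_walk(struct demo_list *list)
{
    struct demo_note *n, *tmp;
    int freed = 0;

    /* tmp is loaded before the body runs, so n may be freed in it. */
    SLIST_FOREACH_SAFE(n, list, link, tmp)
        freed += demo_consume(list, n);
    return (freed);
}

/* Build a list with ids 0..count-1, walk it, and count the survivors. */
static int
demo_run(int count)
{
    struct demo_list list;
    struct demo_note *n;
    int i, survivors = 0;

    SLIST_INIT(&list);
    for (i = 0; i < count; i++) {
        n = malloc(sizeof(*n));
        assert(n != NULL);
        n->id = i;
        SLIST_INSERT_HEAD(&list, n, link);
    }
    demo_walk(&list);
    while (!SLIST_EMPTY(&list)) {
        n = SLIST_FIRST(&list);
        SLIST_REMOVE_HEAD(&list, link);
        free(n);
        survivors++;
    }
    return (survivors);
}
```

Note that the _SAFE form only protects against removal of the current
element.  If a callback can remove the element after it, the saved pointer
goes stale and a different fix is needed.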

-- 
Andriy Gapon
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Kernel panic with fresh current, probably nfs related

2015-08-22 Thread Sean Bruno



> I'm going to guess that you're using an "em" net driver, since that is the
> only one that sets if_hw_tsomax > IP_MAXPACKET (65535) from what I can see.
> 
> Sean, EM_TSO_SIZE is defined as (65535 + sizeof(struct ether_vlan_header)),
> which makes it > IP_MAXPACKET. The value of if_hw_tsomax must be <=
> IP_MAXPACKET and I'm guessing this is what caused the above panic. (Someday
> it would be nice if TSO segments > IP_MAXPACKET could be handled, but that
> will take changes in the ip layer and router software so that a bogus ip_len
> field doesn't cause problems.)
> 
> if_hw_tsomax needs to be the maximum segment size that the driver can accept
> from IP. Since the driver adds any MAC header after accepting the TSO segment
> from the IP layer, it shouldn't include MAC header(s) in the value for
> if_hw_tsomax. (If its limit includes MAC header(s), it needs to subtract
> those out when setting if_hw_tsomax, not add them.)
> 
> Since I am working up a patch for the value of if_hw_tsomaxsegcount, I think
> I'll add a check for > IP_MAXPACKET for if_hw_tsomax as well.
> 
> rick

Huh, ok.  You want to try something like this then?

sean


Index: if_em.h
===
--- if_em.h (revision 286991)
+++ if_em.h (working copy)
@@ -268,7 +268,7 @@

 #define EM_MAX_SCATTER 64
 #define EM_VFTA_SIZE   128
-#define EM_TSO_SIZE    (65535 + sizeof(struct ether_vlan_header))
+#define EM_TSO_SIZE    (65535 - sizeof(struct ether_vlan_header))
 #define EM_TSO_SEG_SIZE 4096   /* Max dma segment size */
 #define EM_MSIX_MASK   0x01F0 /* For 82574 use */
 #define EM_MSIX_LINK   0x0100 /* For 82574 use */
Index: if_lem.h
===
--- if_lem.h(revision 286991)
+++ if_lem.h(working copy)
@@ -238,7 +238,7 @@

 #define EM_MAX_SCATTER 64
 #define EM_VFTA_SIZE   128
-#define EM_TSO_SIZE    (65535 + sizeof(struct ether_vlan_header))
+#define EM_TSO_SIZE    (65535 - sizeof(struct ether_vlan_header))
 #define EM_TSO_SEG_SIZE 4096   /* Max dma segment size */
 #define EM_MSIX_MASK   0x01F0 /* For 82574 use */
 #define ETH_ZLEN   60



Re: Device random seems broken.

2015-08-22 Thread Mark R V Murray

> On 22 Aug 2015, at 13:49, Mark R V Murray wrote:
> 
>> On 22 Aug 2015, at 06:03, Steve Kargl wrote:
>> 
>> Please fix.
> 
> On its way.

Fixed.

A git commit was in the wrong review (D3197 instead of D3354). This is now 
committed as svn commit r287023.

M
-- 
Mark R V Murray



Re: Device random seems broken.

2015-08-22 Thread Steve Kargl
On Sat, Aug 22, 2015 at 02:13:18PM +0100, Mark R V Murray wrote:
> 
> > On 22 Aug 2015, at 13:49, Mark R V Murray wrote:
> > 
> >> On 22 Aug 2015, at 06:03, Steve Kargl wrote:
> >> 
> >> Please fix.
> > 
> > On its way.
> 
> Fixed.
> 
> A git commit was in the wrong review (D3197 instead of D3354). This is now 
> committed as svn commit r287023.
> 

Thanks.

-- 
Steve


Re: Device random seems broken.

2015-08-22 Thread Mark R V Murray

> On 22 Aug 2015, at 06:03, Steve Kargl wrote:
> 
> Please fix.

On its way.

M
-- 
Mark R V Murray



Re: Kernel panic with fresh current, probably nfs related

2015-08-22 Thread Rick Macklem
Joel Dahl wrote:
> Hi,
> 
> I hit a kernel panic running a fresh -CURRENT today. This machine is my home
> NFS server and it exports src and obj to a bunch of other machines. During an
> installkernel on one of the other machines (using the src and obj exports
> from the NFS server) the NFS server kernel panicked.
> 
> I took a quick photo of the stack backtrace, since I didn't have time to
> investigate further (but I haven't rebooted the machine yet, it's still
> sitting at the db> prompt):
> 
>   http://mirror.vnode.se/upload/panic001-20150822.JPG
> 
> Any ideas?
> 
I'm going to guess that you're using an "em" net driver, since that is the
only one that sets if_hw_tsomax > IP_MAXPACKET (65535) from what I can see.

Sean, EM_TSO_SIZE is defined as (65535 + sizeof(struct ether_vlan_header)),
which makes it > IP_MAXPACKET. The value of if_hw_tsomax must be <= IP_MAXPACKET
and I'm guessing this is what caused the above panic. (Someday it would be
nice if TSO segments > IP_MAXPACKET could be handled, but that will take changes
in the ip layer and router software so that a bogus ip_len field doesn't cause
problems.)

if_hw_tsomax needs to be the maximum segment size that the driver can accept
from IP. Since the driver adds any MAC header after accepting the TSO segment
from the IP layer, it shouldn't include MAC header(s) in the value for
if_hw_tsomax. (If its limit includes MAC header(s), it needs to subtract those
out when setting if_hw_tsomax, not add them.)

Since I am working up a patch for the value of if_hw_tsomaxsegcount, I think
I'll add a check for > IP_MAXPACKET for if_hw_tsomax as well.
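
To make the arithmetic concrete, here is a rough standalone sketch, not the
driver or the patch.  The demo_* names are invented for the example, and
struct demo_ether_vlan_header is a local stand-in that I am assuming matches
the 18-byte layout of struct ether_vlan_header.  It subtracts the MAC header
from a hardware frame limit and then clamps the result to IP_MAXPACKET.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_IP_MAXPACKET 65535u    /* same value as IP_MAXPACKET */

/* Local stand-in for struct ether_vlan_header (assumed 18 bytes). */
struct demo_ether_vlan_header {
    uint8_t  dhost[6];
    uint8_t  shost[6];
    uint16_t encap_proto;
    uint16_t tag;
    uint16_t proto;
};
static_assert(sizeof(struct demo_ether_vlan_header) == 18,
    "stand-in header must be 18 bytes");

/* A value is usable as if_hw_tsomax only if it fits in ip_len. */
static int
demo_tsomax_is_valid(uint32_t tsomax)
{
    return (tsomax <= DEMO_IP_MAXPACKET);
}

/*
 * hw_frame_limit is the largest frame, MAC header included, that the
 * (hypothetical) hardware accepts.  The header is subtracted, not added,
 * and the result is clamped to what IP can describe.
 */
static uint32_t
demo_tsomax(uint32_t hw_frame_limit)
{
    uint32_t hdr = (uint32_t)sizeof(struct demo_ether_vlan_header);
    uint32_t max = hw_frame_limit > hdr ? hw_frame_limit - hdr : 0;

    if (max > DEMO_IP_MAXPACKET)
        max = DEMO_IP_MAXPACKET;
    return (max);
}
```

With these definitions, 65535 + 18 = 65553 fails the validity check, while
demo_tsomax(65535) gives 65517, which passes.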

rick

> --
> Joel


Re: Kernel panic with fresh current, probably nfs related

2015-08-22 Thread Rick Macklem
Joel Dahl wrote:
> Hi,
> 
> I hit a kernel panic running a fresh -CURRENT today. This machine is my home
> NFS server and it exports src and obj to a bunch of other machines. During an
> installkernel on one of the other machines (using the src and obj exports
> from the NFS server) the NFS server kernel panicked.
> 
> I took a quick photo of the stack backtrace, since I didn't have time to
> investigate further (but I haven't rebooted the machine yet, it's still
> sitting at the db> prompt):
> 
>   http://mirror.vnode.se/upload/panic001-20150822.JPG
> 
> Any ideas?
> 
The panic is "tcp_output: len > IP_MAXPACKET". This would be a TCP/TSO problem
and not NFS. NFS just puts stuff on the TCP socket for transmission through the
kernel rpc layer. (It happens to do so in a way that the TSO code gets tested
in ways that a netperf test won't do.)

Take a look at the net device driver for your hardware and see if if_hw_tsomax
is set to > IP_MAXPACKET somehow. If it is, the value needs to be changed to
IP_MAXPACKET or less.

You can also try the attached patch for a related issue for net drivers that
can't handle 35 transmit segments for a TSO segment, although this problem
wouldn't cause the above panic unless if_hw_tsomax wasn't set correctly, from
what I can see looking at the code.

If you just want to make the panic go away, disable TSO, but it would be nice
to know what net driver you are using and how this was caused.

rick

> 
> --
> Joel
--- netinet/tcp_output.c.sav	2015-08-22 07:48:12.0 -0400
+++ netinet/tcp_output.c	2015-08-22 07:50:52.0 -0400
@@ -794,7 +794,13 @@ send:
 
 			/* extract TSO information */
 			if_hw_tsomax = tp->t_tsomax;
-			if_hw_tsomaxsegcount = tp->t_tsomaxsegcount;
+			/*
+			 * Subtract 1 for the tcp/ip header mbuf that
+			 * will be prepended to this mbuf chain after
+			 * the code in this section limits the number of
+			 * mbufs in the chain to if_hw_tsomaxsegcount.
+			 */
+			if_hw_tsomaxsegcount = tp->t_tsomaxsegcount - 1;
 			if_hw_tsomaxsegsize = tp->t_tsomaxsegsize;
 
 			/*

Kernel panic with fresh current, probably nfs related

2015-08-22 Thread Joel Dahl
Hi,

I hit a kernel panic running a fresh -CURRENT today. This machine is my home NFS
server and it exports src and obj to a bunch of other machines. During an
installkernel on one of the other machines (using the src and obj exports from
the NFS server) the NFS server kernel panicked.

I took a quick photo of the stack backtrace, since I didn't have time to
investigate further (but I haven't rebooted the machine yet, it's still
sitting at the db> prompt):

  http://mirror.vnode.se/upload/panic001-20150822.JPG

Any ideas?

-- 
Joel


Device random seems broken.

2015-08-22 Thread Steve Kargl
I have 

device  random  # Entropy device
options RANDOM_YARROW

in my kernel config file.  When rebuilding world, I'm
seeing 

In file included from 
/usr/src/sys/modules/random_fortuna/../../dev/random/randomdev.c:44:
/usr/src/sys/sys/random.h:39:2: error: "Cannot define both RANDOM_LOADABLE and
  RANDOM_YARROW"
#error "Cannot define both RANDOM_LOADABLE and RANDOM_YARROW"
 ^
1 error generated.
In file included from 
/usr/src/sys/modules/random_fortuna/../../dev/random/fortuna.c:45:
/usr/src/sys/sys/random.h:39:2: error: "Cannot define both RANDOM_LOADABLE and
  RANDOM_YARROW"
#error "Cannot define both RANDOM_LOADABLE and RANDOM_YARROW"
 ^
1 error generated.
mkdep: compile failed
*** Error code 1

Stop.
make[4]: stopped in /usr/src/sys/modules/random_fortuna
*** Error code 1

Please fix.

-- 
Steve