Re: iflib-if_em tests with HEAD and lagg panic [Was: Re: svn commit: r333338 - in stable/11/sys: dev/bnxt kern net sys]

2018-05-20 Thread Harry Schmalzbauer

On 11.05.2018 at 19:24, Harry Schmalzbauer wrote:

> Regarding Stephen Hurd's message of 10.05.2018 21:55 (localtime):
>
>> Ok, the review is updated with the EBR.  If you can update your tree to
>> r333466 or newer, apply the patch and retest, that would be great.  It
>> seems to be working here.
>
> I took out the sys/net/if.c, sys/net/if_var.h and sys/conf/kmod.mk
> hunks, since these were committed in r333469.
>
> Happy to confirm that there are no more LORs occurring when
> creating/using if_lagg(4), neither with the hartwell/clarkville sisters,
> nor with the kawela twins.
> Tested with r333486 and the latest https://reviews.freebsd.org/D15355 as of
> this writing.
> Brief balancing/failover tests also indicate excellent condition for
> those 3 iflib-NICs.
> Only did very simple workloads (with IPv6 NFSv4), but so far nothing
> unusual in any area.
> Also, I cannot reproduce the link status failure when removing the TP
> connection of the i217 NIC (will update the separate thread).


I'd like to report additional nits with i217:

While trying to track down strange symptoms (IPsec transport mode seems
broken, and the VirtualBox bridge might also be broken), I saw that
disabling some offload features doesn't work.
ifconfig(8) doesn't report a failure, but I can't disable VLAN_HWCSUM,
and disabling rxcsum6 enables rxcsum4, while disabling rxcsum
re-enables rxcsum6.
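
One way to pin down which flags actually flip is to compare the `options=` line that ifconfig(8) prints before and after each change. The following is only a minimal sketch: the helper names `parse_options` and `diff_caps` are hypothetical, and the two sample lines (including their hex values) are invented for illustration, not captured from the i217.

```python
import re

def parse_options(line):
    """Return the set of capability names from an ifconfig 'options=' line."""
    match = re.search(r"options=[0-9a-fA-F]+<([^>]*)>", line)
    if match is None:
        raise ValueError("no options=<...> field in line: %r" % line)
    return {name for name in match.group(1).split(",") if name}

def diff_caps(before, after):
    """Return (enabled, disabled): flags that appeared and flags that vanished."""
    old, new = parse_options(before), parse_options(after)
    return sorted(new - old), sorted(old - new)

# Hypothetical sample lines, invented for illustration (not real i217 output).
before = "\toptions=200083<RXCSUM,TXCSUM,VLAN_HWCSUM,RXCSUM_IPV6>"
after = "\toptions=83<RXCSUM,TXCSUM,VLAN_HWCSUM>"

enabled, disabled = diff_caps(before, after)
print("enabled:", enabled)
print("disabled:", disabled)
```

If a single `ifconfig em0 -rxcsum6` were behaving as expected, only RXCSUM_IPV6 should show up under "disabled" and nothing under "enabled"; any other difference would be the kind of side effect described above.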


Is it possible at all that iflib affects IPsec transport mode?

Thanks,

-harry

___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


Re: iflib-if_em tests with HEAD and lagg panic [Was: Re: svn commit: r333338 - in stable/11/sys: dev/bnxt kern net sys]

2018-05-11 Thread Harry Schmalzbauer
Regarding Stephen Hurd's message of 10.05.2018 21:55 (localtime):
> Ok, the review is updated with the EBR.  If you can update your tree to
> r333466 or newer, apply the patch and retest, that would be great.  It
> seems to be working here.

I took out the sys/net/if.c, sys/net/if_var.h and sys/conf/kmod.mk
hunks, since these were committed in r333469.

Happy to confirm that there are no more LORs occurring when
creating/using if_lagg(4), neither with the hartwell/clarkville sisters,
nor with the kawela twins.
Tested with r333486 and the latest https://reviews.freebsd.org/D15355 as of
this writing.
Brief balancing/failover tests also indicate excellent condition for
those 3 iflib-NICs.
Only did very simple workloads (with IPv6 NFSv4), but so far nothing
unusual in any area.
Also, I cannot reproduce the link status failure when removing the TP
connection of the i217 NIC (will update the separate thread).


Thanks a lot for that quick fix,

-Harry



Re: iflib-if_em tests with HEAD and lagg panic [Was: Re: svn commit: r333338 - in stable/11/sys: dev/bnxt kern net sys]

2018-05-10 Thread Stephen Hurd
Ok, the review is updated with the EBR.  If you can update your tree to
r333466 or newer, apply the patch and retest, that would be great.  It
seems to be working here.

On Thu, May 10, 2018 at 2:29 PM, Harry Schmalzbauer wrote:

> Regarding Stephen Hurd's message of 10.05.2018 20:07 (localtime):
> > No need to test the latest revision unless/until you get a LOR or a
> > panic (both should be possible with the revision you currently have).
> > With the recent addition of a simple EBR API, mmacy@ is working on a
> > better solution.  If possible, it would be great to have you re-test it
> > once that is up.
>
>
> Since this literally brand-new _haswell_ (DH87MC) box has been waiting for
> 3 years now to replace my 10-year-old core2duo, I'll keep it on the tinker
> bench...
>
> I'd have some "smbios0: SMBIOS checksum failed" and
> "[drm2:pid:hangcheck_hung]...GPU hung" issues to track/report, but since
> finding Xorg's secret about "MatchIsKeyboard" took me too much time,
> this is postponed.
>
> -harry
>



-- 
Stephen Hurd, Principal Engineer
Limelight Networks
+1 616 848 0643
www.limelight.com


Re: iflib-if_em tests with HEAD and lagg panic [Was: Re: svn commit: r333338 - in stable/11/sys: dev/bnxt kern net sys]

2018-05-10 Thread Harry Schmalzbauer
Regarding Stephen Hurd's message of 10.05.2018 20:07 (localtime):
> No need to test the latest revision unless/until you get a LOR or a
> panic (both should be possible with the revision you currently have). 
> With the recent addition of a simple EBR API, mmacy@ is working on a
> better solution.  If possible, it would be great to have you re-test it
> once that is up.


Since this literally brand-new _haswell_ (DH87MC) box has been waiting for
3 years now to replace my 10-year-old core2duo, I'll keep it on the tinker
bench...

I'd have some "smbios0: SMBIOS checksum failed" and
"[drm2:pid:hangcheck_hung]...GPU hung" issues to track/report, but since
finding Xorg's secret about "MatchIsKeyboard" took me too much time,
this is postponed.

-harry


Re: iflib-if_em tests with HEAD and lagg panic [Was: Re: svn commit: r333338 - in stable/11/sys: dev/bnxt kern net sys]

2018-05-10 Thread Stephen Hurd
No need to test the latest revision unless/until you get a LOR or a panic
(both should be possible with the revision you currently have).  With the
recent addition of a simple EBR API, mmacy@ is working on a better
solution.  If possible, it would be great to have you re-test it once that
is up.

On Thu, May 10, 2018 at 2:02 PM, Harry Schmalzbauer wrote:

> Regarding Harry Schmalzbauer's message of 10.05.2018 19:54 (localtime):
> …
> > Please excuse that I'm not familiar with Phabricator and just did a
> > "raw diff download" after briefly skimming the comments.
> > According to st_mtime this was on May 9th, 08:14:02 UTC (10:14 local
> > (CEST) time).
> > No idea what timezone Phabricator reports to me, most likely respecting
> > local time.  Which means latest revision was part of my test – but I'm
>
> Oh, I missed "pm".
> Phabricator reports Wed, May 9, 9:49 PM as last revision, my
> download was from 10:14 _AM_...
> But if the web site was respecting the local time zone, I guess it would
> respect the local time format, so not PM, but 21:49...
> So I think it's most likely any UTC+>2 time, so the revision I tested
> was probably Diff 42294.
>
> Just tell me if it would be useful for me to test the latest revision again.
>
> -harry
>



-- 
Stephen Hurd, Principal Engineer
Limelight Networks
+1 616 848 0643
www.limelight.com


Re: iflib-if_em tests with HEAD and lagg panic [Was: Re: svn commit: r333338 - in stable/11/sys: dev/bnxt kern net sys]

2018-05-10 Thread Harry Schmalzbauer
Regarding Harry Schmalzbauer's message of 10.05.2018 19:54 (localtime):
…
> Please excuse that I'm not familiar with Phabricator and just did a
> "raw diff download" after briefly skimming the comments.
> According to st_mtime this was on May 9th, 08:14:02 UTC (10:14 local
> (CEST) time).
> No idea what timezone Phabricator reports to me, most likely respecting
> local time.  Which means latest revision was part of my test – but I'm

Oh, I missed "pm".
Phabricator reports Wed, May 9, 9:49 PM as last revision, my
download was from 10:14 _AM_...
But if the web site was respecting the local time zone, I guess it would
respect the local time format, so not PM, but 21:49...
So I think it's most likely any UTC+>2 time, so the revision I tested
was probably Diff 42294.
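
The timestamp comparison can be checked with a few lines of arithmetic. This is a rough sketch, assuming the displayed "9:49 PM" is wall-clock time in some whole-hour UTC offset; the function names are made up for the example, and only the two timestamps come from the message.

```python
from datetime import datetime, timedelta, timezone

# Download time from st_mtime, as stated in the message.
DOWNLOAD_UTC = datetime(2018, 5, 9, 8, 14, 2, tzinfo=timezone.utc)

def revision_time_utc(offset_hours):
    """Interpret 'Wed, May 9, 9:49 PM' as wall-clock time at the given UTC offset."""
    zone = timezone(timedelta(hours=offset_hours))
    return datetime(2018, 5, 9, 21, 49, tzinfo=zone).astimezone(timezone.utc)

def revision_after_download(offset_hours):
    """True if the last revision was uploaded after the raw diff was downloaded."""
    return revision_time_utc(offset_hours) > DOWNLOAD_UTC

# Try UTC, CEST (UTC+2) and an arbitrary western offset.
for offset in (0, 2, -7):
    print(offset, revision_time_utc(offset).isoformat(), revision_after_download(offset))
```

Under that assumption, every offset from UTC-12 through UTC+13 places the last revision after the 08:14:02 UTC download, so the tested diff would have been an earlier one whichever of those zones the site used.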

Just tell me if it would be useful for me to test the latest revision again.

-harry


Re: iflib-if_em tests with HEAD and lagg panic [Was: Re: svn commit: r333338 - in stable/11/sys: dev/bnxt kern net sys]

2018-05-10 Thread Harry Schmalzbauer
Regarding Stephen Hurd's message of 08.05.2018 20:58 (localtime):
> Can you test the review here: https://reviews.freebsd.org/D15355
> 
> It looks like there are two different locks protecting the same data
> everywhere but in lagg_ioctl().  This is a rough first-pass, and there may
> be some lingering recursion and performance regressions with it.
> 
…
>> Sleeping on "e1000_delay" with the following non-sleepable locks held:
>> exclusive rm if_lagg rmlock (if_lagg rmlock) r = 0 (0xf80014228c08)
>> locked @ /usr/src/sys/net/if_lagg.c:1433
>> stack backtrace:
>> #0 0x80701113 at witness_debugger+0x73
>> #1 0x807024f1 at witness_warn+0x461
>> #2 0x806a42cc at _sleep+0x6c
>> #3 0x806a4b34 at pause_sbt+0x144
>> #4 0x80440e21 at e1000_write_phy_reg_mdic+0xf1
>> #5 0x804446bf at e1000_enable_phy_wakeup_reg_access_bm+0x2f
>> #6 0x80432e0a at e1000_update_mc_addr_list_pch2lan+0x3a
>> #7 0x8041408f at em_if_multi_set+0x1bf
>> #8 0x807bc02e at iflib_if_ioctl+0xfe
>> #9 0x82111a15 at lagg_ioctl+0x115
>> #10 0x807dd348 at inm_release_task+0x218
>> #11 0x806dea29 at gtaskqueue_run_locked+0x139
>> #12 0x806de7a8 at gtaskqueue_thread_loop+0x88
>> #13 0x80659d84 at fork_exit+0x84
>> #14 0x809b767e at fork_trampoline+0xe
>> Sleeping thread (tid 100017, pid 0) owns a non-sleepable lock
>> KDB: stack backtrace of thread 100017:
>> sched_switch() at sched_switch+0x945/frame 0xfe00750dc5d0
>> mi_switch() at mi_switch+0x18c/frame 0xfe00750dc600
>> sleepq_switch() at sleepq_switch+0x10d/frame 0xfe00750dc640
>> sleepq_timedwait() at sleepq_timedwait+0x50/frame 0xfe00750dc680
>> _sleep() at _sleep+0x307/frame 0xfe00750dc730
>> pause_sbt() at pause_sbt+0x144/frame 0xfe00750dc780
>> e1000_write_phy_reg_mdic() at e1000_write_phy_reg_mdic+0xf1/frame
>> 0xfe00750dc7c0
>> e1000_enable_phy_wakeup_reg_access_bm() at
>> e1000_enable_phy_wakeup_reg_access_bm+0x2f/frame 0xfe00750dc7e0
>> e1000_update_mc_addr_list_pch2lan() at
>> e1000_update_mc_addr_list_pch2lan+0x3a/frame 0xfe00750dc820
>> em_if_multi_set() at em_if_multi_set+0x1bf/frame 0xfe00750dc870
>> iflib_if_ioctl() at iflib_if_ioctl+0xfe/frame 0xfe00750dc8e0
>> lagg_ioctl() at lagg_ioctl+0x115/frame 0xfe00750dc990
>> inm_release_task() at inm_release_task+0x218/frame 0xfe00750dc9f0
>> gtaskqueue_run_locked() at gtaskqueue_run_locked+0x139/frame
>> 0xfe00750dca40
>> gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0x88/frame
>> 0xfe00750dca70
>> fork_exit() at fork_exit+0x84/frame 0xfe00750dcab0
>> fork_trampoline() at fork_trampoline+0xe/frame 0xfe00750dcab0
>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>> panic: sleeping thread
>> cpuid = 3
>> time = 1525794682
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
>> 0xfe008fe180e0
>> vpanic() at vpanic+0x1a3/frame 0xfe008fe18140
>> panic() at panic+0x43/frame 0xfe008fe181a0
>> propagate_priority() at propagate_priority+0x335/frame 0xfe008fe181e0
>> turnstile_wait() at turnstile_wait+0x38d/frame 0xfe008fe18230
>> __mtx_lock_sleep() at __mtx_lock_sleep+0x1e1/frame 0xfe008fe182b0
>> __mtx_lock_flags() at __mtx_lock_flags+0xf9/frame 0xfe008fe18300
>> _rm_rlock() at _rm_rlock+0x280/frame 0xfe008fe18330
>> _rm_rlock_debug() at _rm_rlock_debug+0x14c/frame 0xfe008fe18380
>> lagg_transmit() at lagg_transmit+0x38/frame 0xfe008fe183f0
>> ether_output_frame() at ether_output_frame+0xaa/frame 0xfe008fe18420
>> ether_output() at ether_output+0x68b/frame 0xfe008fe184c0
>> arprequest() at arprequest+0x474/frame 0xfe008fe185c0
>> arp_ifinit() at arp_ifinit+0x58/frame 0xfe008fe18600
>> ether_ioctl() at ether_ioctl+0x1d1/frame 0xfe008fe18630
>> lagg_ioctl() at lagg_ioctl+0x602/frame 0xfe008fe186e0
>> in_control() at in_control+0x8f5/frame 0xfe008fe18780
>> ifioctl() at ifioctl+0x19c6/frame 0xfe008fe18850
>> kern_ioctl() at kern_ioctl+0x2b9/frame 0xfe008fe188b0
>> sys_ioctl() at sys_ioctl+0x168/frame 0xfe008fe18980
>> amd64_syscall() at amd64_syscall+0x2cc/frame 0xfe008fe18ab0
>> fast_syscall_common() at fast_syscall_common+0x101/frame
>> 0xfe008fe18ab0
>> --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x8004820ba, rsp =
>> 0x7fffe1c8, rbp = 0x7fffe210 ---
>> KDB: enter: panic

I can confirm that the D15355 version I tested eliminates that panic.
Also no LOR with em0+em1 as laggports.
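
For readers who want to see the path from the lagg ioctl down to the sleep, the numbered witness frames can be turned into a caller-first chain. This is just an illustrative sketch: the `call_chain` helper is hypothetical, and only a handful of frames are copied from the backtrace quoted above.

```python
import re

# A few frames copied from the witness backtrace quoted in this message.
TRACE = """\
#2 0x806a42cc at _sleep+0x6c
#3 0x806a4b34 at pause_sbt+0x144
#4 0x80440e21 at e1000_write_phy_reg_mdic+0xf1
#7 0x8041408f at em_if_multi_set+0x1bf
#8 0x807bc02e at iflib_if_ioctl+0xfe
#9 0x82111a15 at lagg_ioctl+0x115
"""

FRAME = re.compile(r"^#(\d+)\s+0x[0-9a-f]+\s+at\s+(\w+)\+0x[0-9a-f]+$")

def call_chain(trace):
    """Return function names ordered from outermost caller to innermost callee."""
    frames = []
    for line in trace.splitlines():
        # Drop mail quote markers ("> ", ">> ") before matching the frame.
        match = FRAME.match(line.lstrip("> ").strip())
        if match:
            frames.append((int(match.group(1)), match.group(2)))
    # Frame #0 is the innermost call, so sort descending for caller-first order.
    return [name for _, name in sorted(frames, reverse=True)]

chain = call_chain(TRACE)
print(" -> ".join(chain))
```

Read this way, the chain starts in lagg_ioctl(), which per the witness message holds the if_lagg rmlock, and ends in pause_sbt()/_sleep() inside the e1000 PHY code.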

From the kawela report:
> Regarding Kevin Bowling's message of 08.05.2018 11:52 (localtime):
>> On Tue, May 8, 2018 at 2:43 AM, Harry Schmalzbauer wrote:
> …
>>> But if the simple iflib/hw-support test with kawela+hartwell helps I'm
>>> happy to do.
>>
>> At this point it would be helpful, we think e1000 is nearing pretty
>> good shape and I need to become familiar with any outstanding bugs.
>
> Here's the results for 

Re: iflib-if_em tests with HEAD and lagg panic [Was: Re: svn commit: r333338 - in stable/11/sys: dev/bnxt kern net sys]

2018-05-08 Thread Stephen Hurd
Can you test the review here: https://reviews.freebsd.org/D15355

It looks like there are two different locks protecting the same data
everywhere but in lagg_ioctl().  This is a rough first-pass, and there may
be some lingering recursion and performance regressions with it.

On Tue, May 8, 2018 at 1:37 PM, Harry Schmalzbauer wrote:

> Regarding Sean Bruno's message of 08.05.2018 18:44 (localtime):
> >
> >
> > On 05/08/18 10:23, Harry Schmalzbauer wrote:
> >> Regarding Kevin Bowling's message of 08.05.2018 11:52 (localtime):
> >> …
> >>>> But if the simple iflib/hw-support test with kawela+hartwell helps I'm
> >>>> happy to do.
> >>>
> >>> At this point it would be helpful, we think e1000 is nearing pretty
> >>> good shape and I need to become familiar with any outstanding bugs.
> >>
> >> I started with hartwell:
> >> em1: attach_pre capping queues at 2
> >>
> >> Current cap: 0x460b
> >> em1: using 1024 tx descriptors and 1024 rx descriptors
> >> em1: msix_init qsets capped at 2
> >> em1: pxm cpus: 2 queue msgs: 4 admincnt: 1
> >> em1: using 2 rx queues 2 tx queues
> >> em1: Using MSIX interrupts with 3 vectors
> >> em1: allocated for 2 tx_queues
> >> em1: allocated for 2 rx_queues
> >> em1: Ethernet address: 00:1b:21:3e:90:52
> >> em1: netmap queues/slots: TX 2/1024, RX 2/1024
> >> dev.em.1.iflib.driver_version: 7.6.1-k
> >> dev.em.1.queue_rx_1.rx_irq: 0
> >> dev.em.1.queue_rx_1.rxd_tail: 607
> >> dev.em.1.queue_rx_1.rxd_head: 21
> >> dev.em.1.queue_rx_0.rx_irq: 0
> >> dev.em.1.queue_rx_0.rxd_tail: 410
> >> dev.em.1.queue_rx_0.rxd_head: 412
> >> dev.em.1.queue_tx_1.tx_irq: 0
> >> dev.em.1.queue_tx_1.txd_tail: 8
> >> dev.em.1.queue_tx_1.txd_head: 8
> >> dev.em.1.queue_tx_0.tx_irq: 0
> >> dev.em.1.queue_tx_0.txd_tail: 428
> >> dev.em.1.queue_tx_0.txd_head: 428
> >>
> >> Looks good so far, no problems with simple line speed (NFS4) copies.
> >>
> >> According to the i217 (Clarkville) Datasheet, it also supports 2 queues:
> >> Table 63. Intel® Ethernet Controller I217 Capability PHY Address 01,
> >>   Page 776,Register 19
> >> But it probably was never supported, at least I haven't ever checked
> >> pre-iflib.
> >> Here's the clarkville:
> >> em0: attach_pre capping queues at 1
> >> em0: using 1024 tx descriptors and 1024 rx descriptors
> >> em0: msix_init qsets capped at
> >> em0: PCIY_MSIX capability not found; or rid 0 == 0.
> >> em0: Using an MSI interrupt
> >> em0: allocated for 1 tx_queues
> >> em0: allocated for 1 rx_queues
> >> em0: Ethernet address: 54:be:f7:0b:d7:4e
> >> em0: netmap queues/slots: TX 1/1024, RX 1/1024
> >>
> >> Since it's not much effort here, I also tried LACP, which panicked.
> >> vmcore available, but what debugger to use these days? kgdb seems to be
> >> replaced...
> >>
> >> -harry
> >> _
> >
> > /usr/libexec/kgdb should be the old kgdb that you are used to.  Most of
> > us have switched to using devel/gdb from ports.
>
> Thanks, me stupid – it's in libexec, not in my path...
> Unfortunately I have no clue about those essential C tools, so it
> doesn't make much sense for me to waste energy installing devel/gdb ;-)
> While I'm wondering why/how LLVM/gdb can be mixed... pure lack of
> essentials :-(
>
> So back to the iflib-if_em panic after setting up an if_lagg(4) interface
> (which consists of an add-on 82574 and the on-board (PCH)+i217 NIC, which
> was assigned a locally administered Ethernet address and used as first
> laggport, so the private MAC was (successfully) set on both NICs)
> and firing up dhclient to get a lease:
>
>
> Sleeping on "e1000_delay" with the following non-sleepable locks held:
> exclusive rm if_lagg rmlock (if_lagg rmlock) r = 0 (0xf80014228c08)
> locked @ /usr/src/sys/net/if_lagg.c:1433
> stack backtrace:
> #0 0x80701113 at witness_debugger+0x73
> #1 0x807024f1 at witness_warn+0x461
> #2 0x806a42cc at _sleep+0x6c
> #3 0x806a4b34 at pause_sbt+0x144
> #4 0x80440e21 at e1000_write_phy_reg_mdic+0xf1
> #5 0x804446bf at e1000_enable_phy_wakeup_reg_access_bm+0x2f
> #6 0x80432e0a at e1000_update_mc_addr_list_pch2lan+0x3a
> #7 0x8041408f at em_if_multi_set+0x1bf
> #8 0x807bc02e at iflib_if_ioctl+0xfe
> #9 0x82111a15 at lagg_ioctl+0x115
> #10 0x807dd348 at inm_release_task+0x218
> #11 0x806dea29 at gtaskqueue_run_locked+0x139
> #12 0x806de7a8 at gtaskqueue_thread_loop+0x88
> #13 0x80659d84 at fork_exit+0x84
> #14 0x809b767e at fork_trampoline+0xe
> Sleeping thread (tid 100017, pid 0) owns a non-sleepable lock
> KDB: stack backtrace of thread 100017:
> sched_switch() at sched_switch+0x945/frame 0xfe00750dc5d0
> mi_switch() at mi_switch+0x18c/frame 0xfe00750dc600
> sleepq_switch() at sleepq_switch+0x10d/frame 0xfe00750dc640
> sleepq_timedwait() at sleepq_timedwait+0x50/frame 0xfe00750dc680
> _sleep() at _sleep+0x307/frame 0xfe00750dc730
> pause_sbt() at pause_sbt+0x144/frame 

Re: iflib-if_em tests with HEAD and lagg panic [Was: Re: svn commit: r333338 - in stable/11/sys: dev/bnxt kern net sys]

2018-05-08 Thread Harry Schmalzbauer
Regarding Sean Bruno's message of 08.05.2018 18:44 (localtime):
> 
> 
> On 05/08/18 10:23, Harry Schmalzbauer wrote:
>> Regarding Kevin Bowling's message of 08.05.2018 11:52 (localtime):
>> …
>>>> But if the simple iflib/hw-support test with kawela+hartwell helps I'm
>>>> happy to do.
>>>
>>> At this point it would be helpful, we think e1000 is nearing pretty
>>> good shape and I need to become familiar with any outstanding bugs.
>>
>> I started with hartwell:
>> em1: attach_pre capping queues at 2
>>
>> Current cap: 0x460b
>> em1: using 1024 tx descriptors and 1024 rx descriptors
>> em1: msix_init qsets capped at 2
>> em1: pxm cpus: 2 queue msgs: 4 admincnt: 1
>> em1: using 2 rx queues 2 tx queues
>> em1: Using MSIX interrupts with 3 vectors
>> em1: allocated for 2 tx_queues
>> em1: allocated for 2 rx_queues
>> em1: Ethernet address: 00:1b:21:3e:90:52
>> em1: netmap queues/slots: TX 2/1024, RX 2/1024
>> dev.em.1.iflib.driver_version: 7.6.1-k
>> dev.em.1.queue_rx_1.rx_irq: 0
>> dev.em.1.queue_rx_1.rxd_tail: 607
>> dev.em.1.queue_rx_1.rxd_head: 21
>> dev.em.1.queue_rx_0.rx_irq: 0
>> dev.em.1.queue_rx_0.rxd_tail: 410
>> dev.em.1.queue_rx_0.rxd_head: 412
>> dev.em.1.queue_tx_1.tx_irq: 0
>> dev.em.1.queue_tx_1.txd_tail: 8
>> dev.em.1.queue_tx_1.txd_head: 8
>> dev.em.1.queue_tx_0.tx_irq: 0
>> dev.em.1.queue_tx_0.txd_tail: 428
>> dev.em.1.queue_tx_0.txd_head: 428
>>
>> Looks good so far, no problems with simple line speed (NFS4) copies.
>>
>> According to the i217 (Clarkville) Datasheet, it also supports 2 queues:
>> Table 63. Intel® Ethernet Controller I217 Capability PHY Address 01,
>>   Page 776,Register 19
>> But it probably was never supported, at least I haven't ever checked
>> pre-iflib.
>> Here's the clarkville:
>> em0: attach_pre capping queues at 1
>> em0: using 1024 tx descriptors and 1024 rx descriptors
>> em0: msix_init qsets capped at
>> em0: PCIY_MSIX capability not found; or rid 0 == 0.
>> em0: Using an MSI interrupt
>> em0: allocated for 1 tx_queues
>> em0: allocated for 1 rx_queues
>> em0: Ethernet address: 54:be:f7:0b:d7:4e
>> em0: netmap queues/slots: TX 1/1024, RX 1/1024
>>
>> Since it's not much effort here, I also tried LACP, which panicked.
>> vmcore available, but what debugger to use these days? kgdb seems to be
>> replaced...
>>
>> -harry
>> _
> 
> /usr/libexec/kgdb should be the old kgdb that you are used to.  Most of
> us have switched to using devel/gdb from ports.

Thanks, me stupid – it's in libexec, not in my path...
Unfortunately I have no clue about those essential C tools, so it
doesn't make much sense for me to waste energy installing devel/gdb ;-)
While I'm wondering why/how LLVM/gdb can be mixed... pure lack of
essentials :-(

So back to the iflib-if_em panic after setting up an if_lagg(4) interface
(which consists of an add-on 82574 and the on-board (PCH)+i217 NIC, which
was assigned a locally administered Ethernet address and used as first
laggport, so the private MAC was (successfully) set on both NICs)
and firing up dhclient to get a lease:


Sleeping on "e1000_delay" with the following non-sleepable locks held:
exclusive rm if_lagg rmlock (if_lagg rmlock) r = 0 (0xf80014228c08)
locked @ /usr/src/sys/net/if_lagg.c:1433
stack backtrace:
#0 0x80701113 at witness_debugger+0x73
#1 0x807024f1 at witness_warn+0x461
#2 0x806a42cc at _sleep+0x6c
#3 0x806a4b34 at pause_sbt+0x144
#4 0x80440e21 at e1000_write_phy_reg_mdic+0xf1
#5 0x804446bf at e1000_enable_phy_wakeup_reg_access_bm+0x2f
#6 0x80432e0a at e1000_update_mc_addr_list_pch2lan+0x3a
#7 0x8041408f at em_if_multi_set+0x1bf
#8 0x807bc02e at iflib_if_ioctl+0xfe
#9 0x82111a15 at lagg_ioctl+0x115
#10 0x807dd348 at inm_release_task+0x218
#11 0x806dea29 at gtaskqueue_run_locked+0x139
#12 0x806de7a8 at gtaskqueue_thread_loop+0x88
#13 0x80659d84 at fork_exit+0x84
#14 0x809b767e at fork_trampoline+0xe
Sleeping thread (tid 100017, pid 0) owns a non-sleepable lock
KDB: stack backtrace of thread 100017:
sched_switch() at sched_switch+0x945/frame 0xfe00750dc5d0
mi_switch() at mi_switch+0x18c/frame 0xfe00750dc600
sleepq_switch() at sleepq_switch+0x10d/frame 0xfe00750dc640
sleepq_timedwait() at sleepq_timedwait+0x50/frame 0xfe00750dc680
_sleep() at _sleep+0x307/frame 0xfe00750dc730
pause_sbt() at pause_sbt+0x144/frame 0xfe00750dc780
e1000_write_phy_reg_mdic() at e1000_write_phy_reg_mdic+0xf1/frame
0xfe00750dc7c0
e1000_enable_phy_wakeup_reg_access_bm() at
e1000_enable_phy_wakeup_reg_access_bm+0x2f/frame 0xfe00750dc7e0
e1000_update_mc_addr_list_pch2lan() at
e1000_update_mc_addr_list_pch2lan+0x3a/frame 0xfe00750dc820
em_if_multi_set() at em_if_multi_set+0x1bf/frame 0xfe00750dc870
iflib_if_ioctl() at iflib_if_ioctl+0xfe/frame 0xfe00750dc8e0
lagg_ioctl() at lagg_ioctl+0x115/frame 0xfe00750dc990
inm_release_task() at inm_release_task+0x218/frame 

Re: iflib-if_em tests with HEAD and lagg panic [Was: Re: svn commit: r333338 - in stable/11/sys: dev/bnxt kern net sys]

2018-05-08 Thread Sean Bruno


On 05/08/18 10:23, Harry Schmalzbauer wrote:
> Regarding Kevin Bowling's message of 08.05.2018 11:52 (localtime):
> …
>>> But if the simple iflib/hw-support test with kawela+hartwell helps I'm
>>> happy to do.
>>
>> At this point it would be helpful, we think e1000 is nearing pretty
>> good shape and I need to become familiar with any outstanding bugs.
> 
> I started with hartwell:
> em1: attach_pre capping queues at 2
> 
> Current cap: 0x460b
> em1: using 1024 tx descriptors and 1024 rx descriptors
> em1: msix_init qsets capped at 2
> em1: pxm cpus: 2 queue msgs: 4 admincnt: 1
> em1: using 2 rx queues 2 tx queues
> em1: Using MSIX interrupts with 3 vectors
> em1: allocated for 2 tx_queues
> em1: allocated for 2 rx_queues
> em1: Ethernet address: 00:1b:21:3e:90:52
> em1: netmap queues/slots: TX 2/1024, RX 2/1024
> dev.em.1.iflib.driver_version: 7.6.1-k
> dev.em.1.queue_rx_1.rx_irq: 0
> dev.em.1.queue_rx_1.rxd_tail: 607
> dev.em.1.queue_rx_1.rxd_head: 21
> dev.em.1.queue_rx_0.rx_irq: 0
> dev.em.1.queue_rx_0.rxd_tail: 410
> dev.em.1.queue_rx_0.rxd_head: 412
> dev.em.1.queue_tx_1.tx_irq: 0
> dev.em.1.queue_tx_1.txd_tail: 8
> dev.em.1.queue_tx_1.txd_head: 8
> dev.em.1.queue_tx_0.tx_irq: 0
> dev.em.1.queue_tx_0.txd_tail: 428
> dev.em.1.queue_tx_0.txd_head: 428
> 
> Looks good so far, no problems with simple line speed (NFS4) copies.
> 
> According to the i217 (Clarkville) Datasheet, it also supports 2 queues:
> Table 63. Intel® Ethernet Controller I217 Capability PHY Address 01,
>   Page 776,Register 19
> But it probably was never supported, at least I haven't ever checked
> pre-iflib.
> Here's the clarkville:
> em0: attach_pre capping queues at 1
> em0: using 1024 tx descriptors and 1024 rx descriptors
> em0: msix_init qsets capped at
> em0: PCIY_MSIX capability not found; or rid 0 == 0.
> em0: Using an MSI interrupt
> em0: allocated for 1 tx_queues
> em0: allocated for 1 rx_queues
> em0: Ethernet address: 54:be:f7:0b:d7:4e
> em0: netmap queues/slots: TX 1/1024, RX 1/1024
> 
> Since it's not much effort here, I also tried LACP, which panicked.
> vmcore available, but what debugger to use these days? kgdb seems to be
> replaced...
> 
> -harry
> _

/usr/libexec/kgdb should be the old kgdb that you are used to.  Most of
us have switched to using devel/gdb from ports.

sean





iflib-if_em tests with HEAD and lagg panic [Was: Re: svn commit: r333338 - in stable/11/sys: dev/bnxt kern net sys]

2018-05-08 Thread Harry Schmalzbauer
Regarding Kevin Bowling's message of 08.05.2018 11:52 (localtime):
…
>> But if the simple iflib/hw-support test with kawela+hartwell helps I'm
>> happy to do.
> 
> At this point it would be helpful, we think e1000 is nearing pretty
> good shape and I need to become familiar with any outstanding bugs.

I started with hartwell:
em1: attach_pre capping queues at 2

Current cap: 0x460b
em1: using 1024 tx descriptors and 1024 rx descriptors
em1: msix_init qsets capped at 2
em1: pxm cpus: 2 queue msgs: 4 admincnt: 1
em1: using 2 rx queues 2 tx queues
em1: Using MSIX interrupts with 3 vectors
em1: allocated for 2 tx_queues
em1: allocated for 2 rx_queues
em1: Ethernet address: 00:1b:21:3e:90:52
em1: netmap queues/slots: TX 2/1024, RX 2/1024
dev.em.1.iflib.driver_version: 7.6.1-k
dev.em.1.queue_rx_1.rx_irq: 0
dev.em.1.queue_rx_1.rxd_tail: 607
dev.em.1.queue_rx_1.rxd_head: 21
dev.em.1.queue_rx_0.rx_irq: 0
dev.em.1.queue_rx_0.rxd_tail: 410
dev.em.1.queue_rx_0.rxd_head: 412
dev.em.1.queue_tx_1.tx_irq: 0
dev.em.1.queue_tx_1.txd_tail: 8
dev.em.1.queue_tx_1.txd_head: 8
dev.em.1.queue_tx_0.tx_irq: 0
dev.em.1.queue_tx_0.txd_tail: 428
dev.em.1.queue_tx_0.txd_head: 428

Looks good so far, no problems with simple line speed (NFS4) copies.
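
For comparing runs, the per-queue ring indices from the sysctl dump can be collected into one structure. This is only a sketch: the `ring_indices` helper is hypothetical, the values are copied from the output above, and it makes no claim about what a given head/tail gap means.

```python
import re
from collections import defaultdict

# Values copied from the dev.em.1 sysctl output above.
SYSCTL = """\
dev.em.1.queue_rx_1.rxd_tail: 607
dev.em.1.queue_rx_1.rxd_head: 21
dev.em.1.queue_rx_0.rxd_tail: 410
dev.em.1.queue_rx_0.rxd_head: 412
dev.em.1.queue_tx_1.txd_tail: 8
dev.em.1.queue_tx_1.txd_head: 8
dev.em.1.queue_tx_0.txd_tail: 428
dev.em.1.queue_tx_0.txd_head: 428
"""

LINE = re.compile(
    r"^dev\.em\.(\d+)\.queue_([rt]x)_(\d+)\.[rt]xd_(head|tail):\s*(\d+)$"
)

def ring_indices(text):
    """Map (direction, queue) -> {'head': n, 'tail': n} from sysctl text."""
    rings = defaultdict(dict)
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            _, direction, queue, which, value = match.groups()
            rings[(direction, int(queue))][which] = int(value)
    return dict(rings)

rings = ring_indices(SYSCTL)
for key in sorted(rings):
    print(key, rings[key])
```

In the values pasted here, head and tail are equal on both tx queues and differ on both rx queues.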

According to the i217 (Clarkville) Datasheet, it also supports 2 queues:
Table 63. Intel® Ethernet Controller I217 Capability PHY Address 01,
  Page 776,Register 19
But it probably was never supported, at least I haven't ever checked
pre-iflib.
Here's the clarkville:
em0: attach_pre capping queues at 1
em0: using 1024 tx descriptors and 1024 rx descriptors
em0: msix_init qsets capped at
em0: PCIY_MSIX capability not found; or rid 0 == 0.
em0: Using an MSI interrupt
em0: allocated for 1 tx_queues
em0: allocated for 1 rx_queues
em0: Ethernet address: 54:be:f7:0b:d7:4e
em0: netmap queues/slots: TX 1/1024, RX 1/1024

Since it's not much effort here, I also tried LACP, which panicked.
vmcore available, but what debugger to use these days? kgdb seems to be
replaced...

-harry