I don't think interrupts are causing it, because the driver instance
for igb3 (where I'm receiving all the multicasts) is in polling mode:
intrstat shows no interrupts on cpu0 or cpu31, which is where igb3 is
bound. Instead, the problem seems to be that only a single thread is
doing the polling. Using DTrace to see which CPUs run the
ip_fanout_udp_multi_v4 routine, I can see that it's almost entirely
on cpu1:
# dtrace -n 'fbt::ip_fanout_udp_multi_v4:entry{@[cpu]=count();}'
14 2
0 10544
1 1185177
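
To put a number on that skew, the counts above can be turned into
per-CPU shares. What follows is only an illustrative sketch in Python;
the helper name cpu_shares is made up for this example and is not part
of DTrace or illumos. It assumes the aggregation is printed as plain
"<cpu> <count>" lines, as in the output above.

```python
def cpu_shares(dtrace_output):
    """Return {cpu_id: fraction of all calls} from '<cpu> <count>' lines."""
    counts = {}
    for line in dtrace_output.splitlines():
        fields = line.split()
        # Skip blank lines and anything that is not a "<cpu> <count>" pair.
        if len(fields) != 2 or not all(f.isdigit() for f in fields):
            continue
        counts[int(fields[0])] = int(fields[1])
    total = sum(counts.values())
    # Avoid dividing by zero when no usable lines were found.
    return {cpu: n / total for cpu, n in counts.items()} if total else {}


# The aggregation output quoted above, copied verbatim.
sample = """\
14 2
0 10544
1 1185177
"""

shares = cpu_shares(sample)
# Print the busiest CPU first.
for cpu, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"cpu{cpu}: {share:.2%}")
```

With these counts, cpu1 accounts for roughly 99% of all calls to
ip_fanout_udp_multi_v4.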
The question is: why are all multicasts handled by a single soft ring
worker? Could it have something to do with igb creating only a single
rx ring group? As I said earlier, when I set rx_group_number>=2 in
igb.conf, igb panics at boot with a BAD TRAP, as shown in the attached
kernel stack trace.
Cheers,
--
Saso
On 06/27/2012 05:30 PM, Sašo Kiselkov wrote:
> On 06/27/2012 04:58 PM, Garrett D'Amore wrote:
>> Mostly receive, or mostly transmit, traffic? echo ::interrupts | mdb -k
>> (to see interrupt bindings)
>
> Right now it's purely Rx. Later on, in production, this is going to be
> roughly a 20:1 split between Tx:Rx. My interrupts are mapped out as follows:
>
>
>> ::interrupts
> IRQ Vect IPL Bus Trg Type CPU Share APIC/INT# ISR(s)
> 3 0xb1 12 ISA Edg Fixed 5 1 0x0/0x3 asyintr
> 4 0xb0 12 ISA Edg Fixed 4 1 0x0/0x4 asyintr
> 9 0x80 9 PCI Lvl Fixed 1 1 0x0/0x9 acpi_wrapper_isr
> 16 0x84 9 PCI Lvl Fixed 9 2 0x0/0x10 ohci_intr, ohci_intr
> 17 0x82 9 PCI Lvl Fixed 7 1 0x0/0x11 ehci_intr
> 18 0x85 9 PCI Lvl Fixed 10 2 0x0/0x12 ohci_intr, ohci_intr
> 19 0x83 9 PCI Lvl Fixed 8 1 0x0/0x13 ehci_intr
> 88 0x81 7 PCI Edg MSI 2 1 - pcieb_intr_handler
> 89 0x40 5 PCI Edg MSI-X 3 1 - mrsas_isr
> 90 0x30 4 PCI Edg MSI 6 1 - pcieb_intr_handler
> 91 0x86 7 PCI Edg MSI 11 1 - pcieb_intr_handler
> 92 0x31 4 PCI Edg MSI 12 1 - pcieb_intr_handler
> 93 0x60 6 PCI Edg MSI-X 13 1 - igb_intr_tx_other
> 94 0x61 6 PCI Edg MSI-X 14 1 - igb_intr_rx
> 95 0x62 6 PCI Edg MSI-X 15 1 - igb_intr_tx_other
> 96 0x63 6 PCI Edg MSI-X 16 1 - igb_intr_rx
> 97 0x64 6 PCI Edg MSI-X 31 1 - igb_intr_tx_other
> 98 0x65 6 PCI Edg MSI-X 0 1 - igb_intr_rx
> 99 0x41 5 PCI Edg MSI 19 1 - mptsas_intr
> 100 0x42 5 PCI Edg MSI 20 1 - mptsas_intr
> 101 0x66 6 PCI Edg MSI-X 23 1 - igb_intr_tx_other
> 102 0x67 6 PCI Edg MSI-X 24 1 - igb_intr_rx
> 160 0xa0 0 Edg IPI all 0 - poke_cpu
> 208 0xd0 14 Edg IPI all 1 - kcpc_hw_overflow_intr
> 209 0xd1 14 Edg IPI all 1 - cbe_fire
> 210 0xd3 14 Edg IPI all 1 - cbe_fire
> 240 0xe0 15 Edg IPI all 1 - xc_serv
> 241 0xe1 15 Edg IPI all 1 - apic_error_intr
>
> The CPU being thrashed currently is cpu1 (after I did a "dladm
> set-linkprop cpus=0-31 igb3").
>
> Cheers,
> --
> Saso
panic[cpu1]/thread=ffffff21ebefec60: BAD TRAP: type=e (#pf Page fault)
rp=ffffff00f499a210 addr=ffffff2366f750a0
dladm: #pf Page fault
Bad kernel fault at addr=0xffffff2366f750a0
pid=73, pc=0xfffffffff80b8244, sp=0xffffff00f499a300, eflags=0x10202
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 406f8<osxsav,xmme,fxsr,pge,mce,pae,pse,de>
cr2: ffffff2366f750a0cr3: 1f058c7000cr8: c
rdi: ffffff21ebd64800 rsi: ffffffff rdx: 0
rcx: 1 r8: 2f637036 r9: ffffff21ebdbcef0
rax: ffffffff rbx: 0 rbp: ffffff00f499a350
r10: fffffffffb85ab28 r11: ffffff0000003000 r12: ffffff00f499a360
r13: ffffff21ebd64800 r14: ffffff21ebf9be48 r15: ffffff21ebfa6b58
fsb: 0 gsb: ffffff21ea4c9540 ds: 4b
es: 4b fs: 0 gs: 1c3
trp: e err: 0 rip: fffffffff80b8244
cs: 30 rfl: 10202 rsp: ffffff00f499a300
ss: 0
Warning - stack not written to the dump buffer
ffffff00f499a0f0 unix:die+10f ()
ffffff00f499a200 unix:trap+1799 ()
ffffff00f499a210 unix:cmntrap+e6 ()
ffffff00f499a350 igb:igb_fill_ring+ac ()
ffffff00f499a410 mac:mac_init_ring+94 ()
ffffff00f499a460 mac:mac_init_group+9a ()
ffffff00f499a540 mac:mac_init_rings+3b0 ()
ffffff00f499a5c0 mac:mac_register+4cc ()
ffffff00f499a5f0 igb:igb_register_mac+8a ()
ffffff00f499a630 igb:igb_attach+1bf ()
ffffff00f499a690 genunix:devi_attach+80 ()
ffffff00f499a6c0 genunix:attach_node+95 ()
ffffff00f499a700 genunix:i_ndi_config_node+c4 ()
ffffff00f499a720 genunix:i_ddi_attachchild+40 ()
ffffff00f499a760 genunix:devi_attach_node+ac ()
ffffff00f499a820 genunix:devi_config_one+2f3 ()
ffffff00f499a8a0 genunix:ndi_devi_config_one+d5 ()
ffffff00f499a950 genunix:resolve_pathname+19c ()
ffffff00f499a980 genunix:e_ddi_hold_devi_by_path+23 ()
ffffff00f499a9d0 genunix:hold_devi+10f ()
ffffff00f499aa00 genunix:ddi_hold_devi_by_instance+1d ()
ffffff00f499ab60 softmac:softmac_hold_device+9f ()
ffffff00f499aba0 unix:stubs_common_code+51 ()
ffffff00f499ac10 dld:drv_ioc_phys_attr+79 ()
ffffff00f499acc0 dld:drv_ioctl+190 ()
ffffff00f499ad00 genunix:cdev_ioctl+45 ()
ffffff00f499ad40 specfs:spec_ioctl+5a ()
ffffff00f499adc0 genunix:fop_ioctl+7b ()
ffffff00f499aec0 genunix:ioctl+18e ()
ffffff00f499af10 unix:brand_sys_syscall32+17a ()
syncing file systems... done
skipping system dump - no dump device configured