On Mon, 30 Apr 2012 22:31:26 +0000
John Adams <[email protected]> wrote:

> Dear e1000-devel,
> 
> I'm wondering what kernel versions people are happily using in
> production with the ixgbe driver?
> 
> I'm having network stability and performance issues with a 2.6.32-131
> modified Red Hat el6 on a quad core Xeon Jasper Forest cpu.  My nic is
> X520/82599 dual port.
> 
> I wonder if this could be an ixgbe or ioatdma problem.
> Ixgbe is not mentioned in my stack traces.  Hoping for advice.
> 
> I could try a later kernel, especially one recommended by a
> happy ixgbe user.

If you're having issues, you could blacklist ioatdma.  The module is
really not necessary unless you were actually benefiting from DCA, which
is unlikely.
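
For reference, a minimal sketch of what the blacklist entry might look
like, assuming a stock modprobe setup. The file name is my own arbitrary
choice, and the "install" line is an optional extra rather than anything
required:

```conf
# /etc/modprobe.d/blacklist-ioatdma.conf  (hypothetical file name)

# Keep ioatdma from being auto-loaded through its device aliases.
blacklist ioatdma

# Optional: also make an explicit "modprobe ioatdma" fail.
install ioatdma /bin/false
```

If the module also ends up in your initramfs, the image would probably
need to be rebuilt before the change takes full effect. Afterwards,
"lsmod | grep ioatdma" should print nothing.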

Someone should check whether there are any Red Hat Bugzilla entries for
ioatdma.
 
> Any comment is much appreciated.
> 
> Here's what I see (just one CPU shown for brevity). This has been
> reported when using an old version of ixgbe as well as 3.9.15-NAPI.
> 
> ioatdma 0000:00:0a.1: Channel halted, chanerr = 2
> ioatdma 0000:00:0a.1: Channel halted, chanerr = 2
> ioatdma 0000:00:0a.1: Channel halted, chanerr = 2
> ioatdma 0000:00:0a.1: Channel halted, chanerr = 2
> ioatdma 0000:00:0a.1: Channel halted, chanerr = 2
> ioatdma 0000:00:0a.1: ioat2_timer_event: Channel halted (2)
> BUG: scheduling while atomic: process_name/6888/0x10000301
> Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler sunrpc tcp_htcp 
> sr_mod cdrom raid456 async_raid6_recov async_pq raid6_pq async_xor xor 
> async_memcpy async_tx dm_mod ses enclosure sg i2c_i801 i2c_core iTCO_wdt 
> iTCO_vendor_support e1000e ioatdma ixgbe(U) dca pm8001(U) libsas 
> scsi_transport_sas ext3 jbd mbcache sd_mod crc_t10dif usb_storage pata_acpi 
> ata_generic ata_piix [last unloaded: scsi_wait_scan]
> Pid: 6888, comm: process_name Not tainted 2.6.32-foo-0 #7
> Call Trace:
>  <IRQ>  [<ffffffff8104dab6>] ? __schedule_bug+0x66/0x70
>  [<ffffffff81477502>] ? thread_return+0x5db/0x779
>  [<ffffffff8104f05d>] ? scheduler_tick+0xdd/0x280
>  [<ffffffff810128e9>] ? read_tsc+0x9/0x20
>  [<ffffffff81090d03>] ? ktime_get+0x63/0xe0
>  [<ffffffff81029a2d>] ? lapic_next_event+0x1d/0x30
>  [<ffffffffa01c558c>] ? ioat2_timer_event+0x25c/0x270 [ioatdma]
>  [<ffffffff8105748a>] ? __cond_resched+0x2a/0x40
>  [<ffffffffa01c558c>] ? ioat2_timer_event+0x25c/0x270 [ioatdma]
>  [<ffffffff814777f0>] ? _cond_resched+0x30/0x40
>  [<ffffffff8100df96>] ? is_valid_bugaddr+0x16/0x40
>  [<ffffffff8124e4df>] ? report_bug+0x1f/0xc0
>  [<ffffffff8100f2af>] ? die+0x7f/0x90
>  [<ffffffff8147a184>] ? do_trap+0xc4/0x160
>  [<ffffffffa01c5330>] ? ioat2_timer_event+0x0/0x270 [ioatdma]
>  [<ffffffffa01c5330>] ? ioat2_timer_event+0x0/0x270 [ioatdma]
>  [<ffffffff8100ce55>] ? do_invalid_op+0x95/0xb0
>  [<ffffffffa01c558c>] ? ioat2_timer_event+0x25c/0x270 [ioatdma]
>  [<ffffffff8105ff11>] ? vprintk+0x1d1/0x4f0
>  [<ffffffff81028e89>] ? native_send_call_func_single_ipi+0x39/0x40
>  [<ffffffff8109c081>] ? generic_exec_single+0xb1/0xc0
>  [<ffffffff8100befb>] ? invalid_op+0x1b/0x20
>  [<ffffffffa01c5330>] ? ioat2_timer_event+0x0/0x270 [ioatdma]
>  [<ffffffffa01c558c>] ? ioat2_timer_event+0x25c/0x270 [ioatdma]
>  [<ffffffffa01c5579>] ? ioat2_timer_event+0x249/0x270 [ioatdma]
>  [<ffffffff810128e9>] ? read_tsc+0x9/0x20
>  [<ffffffff81071ea7>] ? run_timer_softirq+0x197/0x340
>  [<ffffffff810676a1>] ? __do_softirq+0xc1/0x1d0
>  [<ffffffff8100c26c>] ? call_softirq+0x1c/0x30
>  <EOI>  [<ffffffff8100dea5>] ? do_softirq+0x65/0xa0
>  [<ffffffff81067fe8>] ? local_bh_enable_ip+0x98/0xa0
>  [<ffffffff814798fb>] ? _spin_unlock_bh+0x1b/0x20
>  [<ffffffffa01c486f>] ? ioat2_cleanup_tasklet+0x8f/0xa0 [ioatdma]
>  [<ffffffffa01c3743>] ? ioat2_is_complete+0x83/0xd0 [ioatdma]
>  [<ffffffff8141c38f>] ? tcp_recvmsg+0x75f/0xe90
>  [<ffffffff81476f75>] ? thread_return+0x4e/0x779
>  [<ffffffff8143c55c>] ? inet_recvmsg+0x5c/0x90
>  [<ffffffff813d53b3>] ? sock_recvmsg+0x133/0x160
>  [<ffffffff81086100>] ? autoremove_wake_function+0x0/0x40
>  [<ffffffff8109810e>] ? futex_wake+0x10e/0x120
>  [<ffffffff8109a071>] ? do_futex+0x121/0xb00
>  [<ffffffff8104ed13>] ? perf_event_task_sched_out+0x33/0x80
>  [<ffffffff81168779>] ? fget_light+0x9/0x90
>  [<ffffffff813d570e>] ? sys_recvfrom+0xee/0x180
>  [<ffffffff810097ac>] ? __switch_to+0x1ac/0x320
>  [<ffffffff81476f75>] ? thread_return+0x4e/0x779
>  [<ffffffff8109aacb>] ? sys_futex+0x7b/0x170
>  [<ffffffff8100c5d5>] ? math_state_restore+0x45/0x60
>  [<ffffffff8100b132>] ? system_call_fastpath+0x16/0x1b
> ------------[ cut here ]------------
> kernel BUG at drivers/dma/ioat/dma_v2.c:315!
> 
> In my sources that line is in ioat2_timer_event and it looks like it
> thinks a setup problem happened elsewhere.
> 
>     /* when halted due to errors check for channel
>      * programming errors before advancing the completion state
>      */
>     if (is_ioat_halted(status)) {
>         u32 chanerr;
>
>         chanerr = readl(chan->reg_base + IOAT_CHANERR_OFFSET);
>         dev_err(to_dev(chan), "%s: Channel halted (%x)\n",
>                 __func__, chanerr);
>         if (test_bit(IOAT_RUN, &chan->state))
>             BUG_ON(is_ioat_bug(chanerr));
>         else /* we never got off the ground */
>             return;
>     }
> 
> Thanks much,
> 
> 
> 


_______________________________________________
E1000-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit
http://communities.intel.com/community/wired
