Re: [E1000-devel] Memory Corruption with e1000

2013-06-07 Thread Peter LaDow

Sorry, I noticed that on the earlier post I only did a single reply,
rather than a reply-all.

On Thu, Jun 6, 2013 at 4:37 PM, Ronciak, John john.ronc...@intel.com wrote:
 We have some ideas and are working on a patch for you to try.  Since we won't
 really be able to test it, can you do that if we get it to you?  Do you know
 how to patch a driver?  Or should we send you the whole thing (a complete
 new driver like you would get off of our SF site)?

I can apply the patch.

Will the patch be based upon the 7.3.21 that is in 3.0.80?  Or the newer 8.0.35?

Thanks,
Pete

--
How ServiceNow helps IT people transform IT departments:
1. A cloud service to automate IT design, transition and operations
2. Dashboards that offer high-level views of enterprise services
3. A single system of record for all IT processes
http://p.sf.net/sfu/servicenow-d2d-j
___
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit
http://communities.intel.com/community/wired

Re: [E1000-devel] Memory Corruption with e1000

2013-06-06 Thread Peter P Waskiewicz Jr

On 06/05/2013 08:34 PM, Peter LaDow wrote:
 On 6/5/13, Ronciak, John john.ronc...@intel.com wrote:
 So I have a couple of questions.  Does this happen with a non-preemptive
 kernel?  I understand that you probably need to use a preemptive kernel but
 for testing purposes it would be good to know.  We don't always test with
 preemptive kernels.
 Hmmm... If you mean no RT patches, then yes. On a vanilla 3.0.80 kernel.

What about the pre-emption behavior of the kernel?  Namely Processor
type and Features -> Preemption Model.  Are you using no preemption, or
forced preemption?

-PJ


 When doing the up/down transitions is there system under test?  I mean
 sending and receiving packets?  If it is what is the load like?  Does
 changing the load make a difference?  Does stopping the network traffic
 first make a difference in the outcome?
 Yes, the load makes a difference. On a silent network (or no link at
 all) this does not occur. Our network is quite busy. It isn't sending
 much (perhaps DHCP discovers and some IPv6 stuff).

 Thanks,
 Pete

Re: [E1000-devel] Memory Corruption with e1000

2013-06-06 Thread Peter LaDow

On 6/6/13, Peter P Waskiewicz Jr peter.p.waskiewicz...@intel.com wrote:
 What about the pre-emption behavior of the kernel?  Namely Processor
 type and Features -> Preemption Model.  Are you using no preemption, or
 forced preemption?

It is PREEMPT_FULL. I'll turn it off and give it a spin.

Thanks,
Pete

Re: [E1000-devel] Memory Corruption with e1000

2013-06-06 Thread Peter LaDow

On Thu, Jun 6, 2013 at 12:30 AM, Peter P Waskiewicz Jr
peter.p.waskiewicz...@intel.com wrote:
 What about the pre-emption behavior of the kernel?  Namely Processor type
 and Features -> Preemption Model.  Are you using no preemption, or forced
 preemption?

Ok.  I've done testing.  Yes, we were building with PREEMPT_FULL.
I've done some further testing and can re-create the problem on
vanilla, non-preempt kernels.  See below.

# uname -a
Linux (none) 3.0.80-rt108 #2 Thu Jun 6 16:09:35 UTC 2013 ppc GNU/Linux

And I still get the slab corruption leading up to the kernel panic:

Slab corruption: size-2048 start=ee2b2070, len=2048
Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
Last user: [c0208514](skb_release_data+0xb4/0xc8)
020: 6b 6b ff ff ff ff ff ff 00 0d ed 47 d9 87 81 00
030: 00 f2 08 06 00 01 08 00 06 04 00 01 00 0d ed 47
040: d9 87 0a f1 0a ea 00 00 00 00 00 00 0a f1 0a ea
050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
060: 00 00 09 81 d2 0f 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Next obj: start=ee2b2888, len=2048
Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
Last user: [c0209b8c](__netdev_alloc_skb+0x28/0x60)
000: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
010: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Slab corruption: size-2048 start=ed401480, len=2048
Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
Last user: [c0208514](skb_release_data+0xb4/0xc8)
020: 6b 6b ff ff ff ff ff ff e0 db 55 e4 ce f9 08 00
030: 45 00 01 3e 3e 1a 00 00 80 11 ca c0 0a ca 0d 42
040: 0a ca 0d ff 00 8a 00 8a 01 2a a5 96 11 0e af 81
050: 0a ca 0d 42 00 8a 01 14 00 00 20 45 42 45 4f 45
060: 45 46 43 45 4c 45 50 45 44 45 49 45 4f 45 43 43
070: 41 43 41 43 41 43 41 43 41 41 41 00 20 46 44 45
Prev obj: start=ed400c68, len=2048
Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
Last user: [c0209b8c](__netdev_alloc_skb+0x28/0x60)
000: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
010: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Unable to handle kernel paging request for data at address 0x20454c45
Faulting instruction address: 0xc0062498
Oops: Kernel access of bad area, sig: 11 [#1]
SEL35xx Platform
Modules linked in:
NIP: c0062498 LR: c02084d8 CTR: c000cbbc
REGS: ee85bc60 TRAP: 0300   Not tainted  (3.0.80-rt108)
MSR: 9032 EE,ME,IR,DR  CR: 24008248  XER: 
DAR: 20454c45, DSISR: 2000
TASK = ef3e5830[4616] 'ifconfig' THREAD: ee85a000
GPR00:  ee85bd10 ef3e5830 20454c45 2d746baa 05f2 0002 
GPR08: c03b14e4 ed7471a8 ee85bcd0 5c26  10087a48 bfe0e41c 10064ae4
GPR16: 10064bc0 bfe0e40c  bfe0e3f4 0228  8914 c019a488
GPR24: c019a9cc ed70f4b0 005c ed70f340 ef063120  0001 ee62bd30
NIP [c0062498] put_page+0x0/0x34
LR [c02084d8] skb_release_data+0x78/0xc8
Call Trace:
[ee85bd20] [c020810c] __kfree_skb+0x18/0xbc
[ee85bd30] [c0195734] e1000_clean_rx_ring+0x10c/0x1a4
[ee85bd60] [c01957f4] e1000_clean_all_rx_rings+0x28/0x54
[ee85bd70] [c0198d40] e1000_close+0x30/0xb4
[ee85bd90] [c0212408] __dev_close_many+0xa0/0xe0
[ee85bda0] [c02141a0] __dev_close+0x2c/0x4c
[ee85bdc0] [c0210a58] __dev_change_flags+0xb8/0x140
[ee85bde0] [c0212324] dev_change_flags+0x1c/0x60
[ee85be00] [c0267594] devinet_ioctl+0x2a4/0x700
[ee85be60] [c026839c] inet_ioctl+0xc8/0xfc
[ee85be70] [c02006d4] sock_ioctl+0x260/0x2a0
[ee85be90] [c009145c] vfs_ioctl+0x2c/0x58
[ee85bea0] [c0091bc8] do_vfs_ioctl+0x610/0x698
[ee85bf10] [c0091ca8] sys_ioctl+0x58/0x88
[ee85bf40] [c000e674] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xff35a3c
LR = 0xff359a0
Instruction dump:
419e0018 3c80c006 38630180 38842abc 38a0 4bfffe65 80010014 bbc10008
38210010 7c0803a6 4e800020 4b54
8003 7c691b78 700bc000 41a20008
Kernel panic - not syncing: Fatal exception
Call Trace:
[ee85bb90] [c0007b80] show_stack+0x58/0x154 (unreliable)
[ee85bbd0] [c001c3a8] panic+0xa8/0x1cc
[ee85bc20] [c000b1f0] die+0x178/0x19c
[ee85bc40] [c0011a44] bad_page_fault+0xe8/0xfc
[ee85bc50] [c000eb14] handle_page_fault+0x7c/0x80
--- Exception: 300 at put_page+0x0/0x34
LR = skb_release_data+0x78/0xc8
[ee85bd10] []   (null) (unreliable)
[ee85bd20] [c020810c] __kfree_skb+0x18/0xbc
[ee85bd30] [c0195734] e1000_clean_rx_ring+0x10c/0x1a4
[ee85bd60] [c01957f4] e1000_clean_all_rx_rings+0x28/0x54
[ee85bd70] [c0198d40] e1000_close+0x30/0xb4
[ee85bd90] [c0212408] __dev_close_many+0xa0/0xe0
[ee85bda0] [c02141a0] __dev_close+0x2c/0x4c
[ee85bdc0] [c0210a58] __dev_change_flags+0xb8/0x140
[ee85bde0] [c0212324] dev_change_flags+0x1c/0x60
[ee85be00] [c0267594] devinet_ioctl+0x2a4/0x700
[ee85be60] [c026839c] inet_ioctl+0xc8/0xfc
[ee85be70] [c02006d4] sock_ioctl+0x260/0x2a0
[ee85be90] [c009145c] vfs_ioctl+0x2c/0x58
[ee85bea0] [c0091bc8] do_vfs_ioctl+0x610/0x698
[ee85bf10] [c0091ca8] sys_ioctl+0x58/0x88
[ee85bf40] [c000e674] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xff35a3c
LR = 0xff359a0

And with a vanilla, no-preempt kernel:

# uname -a
Linux (none) 3.0.80 #5 Thu Jun 6 16:26:15 UTC 2013 ppc GNU/Linux

slab error in

Re: [E1000-devel] Memory Corruption with e1000

2013-06-06 Thread Jesse Brandeburg

On Thu, 6 Jun 2013 09:38:50 -0700
Peter LaDow pet...@gocougs.wsu.edu wrote:

 On Thu, Jun 6, 2013 at 12:30 AM, Peter P Waskiewicz Jr
 peter.p.waskiewicz...@intel.com wrote:
  What about the pre-emption behavior of the kernel?  Namely Processor type
  and Features -> Preemption Model.  Are you using no preemption, or forced
  preemption?
 
 Ok.  I've done testing.  Yes, we were building with PREEMPT_FULL.
 I've done some further testing and can re-create the problem on
 vanilla, non-preempt kernels.  See below.
 
 # uname -a
 Linux (none) 3.0.80-rt108 #2 Thu Jun 6 16:09:35 UTC 2013 ppc GNU/Linux
 
 And I still get the slab corruption leading up to the kernel panic:
 
 Slab corruption: size-2048 start=ee2b2070, len=2048
 Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
 Last user: [c0208514](skb_release_data+0xb4/0xc8)
 020: 6b 6b ff ff ff ff ff ff 00 0d ed 47 d9 87 81 00

That is quite clearly a broadcast frame, and it looks to me like a VLAN-tagged
packet (EtherType 0x8100), maybe for VLAN 0xf2?

So this means that the receive unit of the e1000 is not being stopped
completely (or is being restarted by something), and that the memory of the DMA
buffer (the 2kB allocation) is being freed and then still DMA'd to.

 030: 00 f2 08 06 00 01 08 00 06 04 00 01 00 0d ed 47
 040: d9 87 0a f1 0a ea 00 00 00 00 00 00 0a f1 0a ea
 050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 060: 00 00 09 81 d2 0f 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
 Next obj: start=ee2b2888, len=2048
 Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
 Last user: [c0209b8c](__netdev_alloc_skb+0x28/0x60)
 000: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
 010: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
 Slab corruption: size-2048 start=ed401480, len=2048
 Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
 Last user: [c0208514](skb_release_data+0xb4/0xc8)
 020: 6b 6b ff ff ff ff ff ff e0 db 55 e4 ce f9 08 00
 030: 45 00 01 3e 3e 1a 00 00 80 11 ca c0 0a ca 0d 42

same thing here, but this is an IP packet.

this is clearly a network adapter putting frames into memory that has
been freed.

I will see if someone here can reproduce this issue, but it seems quite
clear what is happening, we just need to figure out why.
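
The two header readings above can be double-checked with a few lines of code. What follows is only an illustrative Python sketch, not anything taken from the driver: decode_eth is a made-up helper name, and it assumes that the bytes right after the two leading 0x6b poison bytes in each dump are the start of an Ethernet frame.

```python
import struct

def decode_eth(hex_bytes):
    """Decode MAC addresses, an optional 802.1Q tag and the EtherType."""
    raw = bytes.fromhex(hex_bytes)
    dst, src = raw[0:6], raw[6:12]
    (ethertype,) = struct.unpack("!H", raw[12:14])
    vlan = None
    if ethertype == 0x8100:
        # 802.1Q: a 2-byte TCI follows, then the encapsulated EtherType.
        tci, ethertype = struct.unpack("!HH", raw[14:18])
        vlan = tci & 0x0FFF  # the low 12 bits of the TCI are the VLAN ID
    return {
        "dst": ":".join("%02x" % b for b in dst),
        "src": ":".join("%02x" % b for b in src),
        "vlan": vlan,
        "ethertype": "0x%04x" % ethertype,
    }

# Bytes copied from the first dump, starting after the two 0x6b bytes.
first = decode_eth("ff ff ff ff ff ff 00 0d ed 47 d9 87 81 00 00 f2 08 06")
# Bytes copied from the second dump, starting after the two 0x6b bytes.
second = decode_eth("ff ff ff ff ff ff e0 db 55 e4 ce f9 08 00")
print(first)
print(second)
```

Under those assumptions the first frame decodes as a broadcast tagged for VLAN 0xf2 (242) carrying EtherType 0x0806 (ARP), and the second as an untagged broadcast with EtherType 0x0800 (IPv4), which agrees with the reading above.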


 040: 0a ca 0d ff 00 8a 00 8a 01 2a a5 96 11 0e af 81
 050: 0a ca 0d 42 00 8a 01 14 00 00 20 45 42 45 4f 45
 060: 45 46 43 45 4c 45 50 45 44 45 49 45 4f 45 43 43
 070: 41 43 41 43 41 43 41 43 41 41 41 00 20 46 44 45
 Prev obj: start=ed400c68, len=2048
 Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
 Last user: [c0209b8c](__netdev_alloc_skb+0x28/0x60)
 000: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
 010: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
 Unable to handle kernel paging request for data at address 0x20454c45
 Faulting instruction address: 0xc0062498
 Oops: Kernel access of bad area, sig: 11 [#1]
 SEL35xx Platform
 Modules linked in:
 NIP: c0062498 LR: c02084d8 CTR: c000cbbc
 REGS: ee85bc60 TRAP: 0300   Not tainted  (3.0.80-rt108)
 MSR: 9032 EE,ME,IR,DR  CR: 24008248  XER: 
 DAR: 20454c45, DSISR: 2000
 TASK = ef3e5830[4616] 'ifconfig' THREAD: ee85a000
 GPR00:  ee85bd10 ef3e5830 20454c45 2d746baa 05f2 0002 
 GPR08: c03b14e4 ed7471a8 ee85bcd0 5c26  10087a48 bfe0e41c 10064ae4
 GPR16: 10064bc0 bfe0e40c  bfe0e3f4 0228  8914 c019a488
 GPR24: c019a9cc ed70f4b0 005c ed70f340 ef063120  0001 ee62bd30
 NIP [c0062498] put_page+0x0/0x34
 LR [c02084d8] skb_release_data+0x78/0xc8
 Call Trace:
 [ee85bd20] [c020810c] __kfree_skb+0x18/0xbc
 [ee85bd30] [c0195734] e1000_clean_rx_ring+0x10c/0x1a4
 [ee85bd60] [c01957f4] e1000_clean_all_rx_rings+0x28/0x54
 [ee85bd70] [c0198d40] e1000_close+0x30/0xb4
 [ee85bd90] [c0212408] __dev_close_many+0xa0/0xe0
 [ee85bda0] [c02141a0] __dev_close+0x2c/0x4c
 [ee85bdc0] [c0210a58] __dev_change_flags+0xb8/0x140
 [ee85bde0] [c0212324] dev_change_flags+0x1c/0x60
 [ee85be00] [c0267594] devinet_ioctl+0x2a4/0x700
 [ee85be60] [c026839c] inet_ioctl+0xc8/0xfc
 [ee85be70] [c02006d4] sock_ioctl+0x260/0x2a0
 [ee85be90] [c009145c] vfs_ioctl+0x2c/0x58
 [ee85bea0] [c0091bc8] do_vfs_ioctl+0x610/0x698
 [ee85bf10] [c0091ca8] sys_ioctl+0x58/0x88
 [ee85bf40] [c000e674] ret_from_syscall+0x0/0x38
 --- Exception: c01 at 0xff35a3c
 LR = 0xff359a0
 Instruction dump:
 419e0018 3c80c006 38630180 38842abc 38a0 4bfffe65 80010014 bbc10008
 38210010 7c0803a6 4e800020 4b54
 8003 7c691b78 700bc000 41a20008
 Kernel panic - not syncing: Fatal exception
 Call Trace:
 [ee85bb90] [c0007b80] show_stack+0x58/0x154 (unreliable)
 [ee85bbd0] [c001c3a8] panic+0xa8/0x1cc
 [ee85bc20] [c000b1f0] die+0x178/0x19c
 [ee85bc40] [c0011a44] bad_page_fault+0xe8/0xfc
 [ee85bc50] [c000eb14] handle_page_fault+0x7c/0x80
 --- Exception: 300 at put_page+0x0/0x34
 LR = skb_release_data+0x78/0xc8
 [ee85bd10] []   (null) (unreliable)
 [ee85bd20] [c020810c] __kfree_skb+0x18/0xbc
 [ee85bd30] [c0195734] e1000_clean_rx_ring+0x10c/0x1a4
 [ee85bd60] [c01957f4] e1000_clean_all_rx_rings+0x28/0x54

Re: [E1000-devel] Memory Corruption with e1000

2013-06-06 Thread Peter LaDow

On Thu, Jun 6, 2013 at 11:23 AM, Ronciak, John john.ronc...@intel.com wrote:
 I agree with Jesse but this driver has been in the field for a very long time 
 with no reports like this coming to us.  Can you send us the dmesg when this 
 is happening?  I want to see if there are messages from the driver like if 
 the down is being delayed somehow.  Or re-enabled.

I stripped out the up/down messages.  But yes, there are sometimes up
messages.  At the end is the complete dmesg output.  I've tweaked the
script to print whenever the interface is changed.  It appears that
the slab errors occur when the interface comes down:

Bringing eth2 up...
e1000: eth2 NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX
ADDRCONF(NETDEV_UP): eth2: link is not ready
ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
Tearing eth2 down...
slab error in verify_redzone_free(): cache `size-2048': memory outside
object was overwritten
Call Trace:
[ee275c70] [c0007b80] show_stack+0x58/0x154 (unreliable)
[ee275cb0] [c007bb0c] __slab_error+0x2c/0x3c
[ee275cc0] [c007c0d0] cache_free_debugcheck+0x184/0x274
[ee275cf0] [c007c36c] kfree+0x90/0x10c
[ee275d10] [c02079e4] skb_release_data+0xb4/0xc8
[ee275d20] [c02075dc] __kfree_skb+0x18/0xbc
[ee275d30] [c0194d50] e1000_clean_rx_ring+0x10c/0x1a4
[ee275d60] [c0194e10] e1000_clean_all_rx_rings+0x28/0x54
[ee275d70] [c019835c] e1000_close+0x30/0xb4
[ee275d90] [c02118d8] __dev_close_many+0xa0/0xe0
[ee275da0] [c0213670] __dev_close+0x2c/0x4c
[ee275dc0] [c020ff28] __dev_change_flags+0xb8/0x140
[ee275de0] [c02117f4] dev_change_flags+0x1c/0x60
[ee275e00] [c02669b4] devinet_ioctl+0x2a4/0x700
[ee275e60] [c02677bc] inet_ioctl+0xc8/0xfc
[ee275e70] [c01ffba4] sock_ioctl+0x260/0x2a0
[ee275e90] [c0090a80] vfs_ioctl+0x2c/0x58
[ee275ea0] [c00911ec] do_vfs_ioctl+0x610/0x698
[ee275f10] [c00912cc] sys_ioctl+0x58/0x88
[ee275f40] [c000e674] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xff35a3c
LR = 0xff359a0
ee2a97b8: redzone 1:0x300574f524b4752, redzone 2:0xd84156c5635688c0.
slab error in verify_redzone_free(): cache `size-2048': memory outside
object was overwritten
Call Trace:
[ee275c70] [c0007b80] show_stack+0x58/0x154 (unreliable)
[ee275cb0] [c007bb0c] __slab_error+0x2c/0x3c
[ee275cc0] [c007c0d0] cache_free_debugcheck+0x184/0x274
[ee275cf0] [c007c36c] kfree+0x90/0x10c
[ee275d10] [c02079e4] skb_release_data+0xb4/0xc8
[ee275d20] [c02075dc] __kfree_skb+0x18/0xbc
[ee275d30] [c0194d50] e1000_clean_rx_ring+0x10c/0x1a4
[ee275d60] [c0194e10] e1000_clean_all_rx_rings+0x28/0x54
[ee275d70] [c019835c] e1000_close+0x30/0xb4
[ee275d90] [c02118d8] __dev_close_many+0xa0/0xe0
[ee275da0] [c0213670] __dev_close+0x2c/0x4c
[ee275dc0] [c020ff28] __dev_change_flags+0xb8/0x140
[ee275de0] [c02117f4] dev_change_flags+0x1c/0x60
[ee275e00] [c02669b4] devinet_ioctl+0x2a4/0x700
[ee275e60] [c02677bc] inet_ioctl+0xc8/0xfc
[ee275e70] [c01ffba4] sock_ioctl+0x260/0x2a0
[ee275e90] [c0090a80] vfs_ioctl+0x2c/0x58
[ee275ea0] [c00911ec] do_vfs_ioctl+0x610/0x698
[ee275f10] [c00912cc] sys_ioctl+0x58/0x88
[ee275f40] [c000e674] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xff35a3c
LR = 0xff359a0
ee2a8fa0: redzone 1:0xd84156c5635688c0, redzone 2:0x534c4f545c42524f.
Bringing eth2 up...
e1000: eth2 NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX
ADDRCONF(NETDEV_UP): eth2: link is not ready
ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
Tearing eth2 down...
Unable to handle kernel paging request for data at address 0x
Faulting instruction address: 0xc0061c64
Oops: Kernel access of bad area, sig: 11 [#1]
SEL35xx Platform
Modules linked in:
NIP: c0061c64 LR: c02079a8 CTR: c000cbbc
REGS: ee2c3c60 TRAP: 0300   Not tainted  (3.0.80)
MSR: 9032 EE,ME,IR,DR  CR: 24008248  XER: 
DAR: , DSISR: 2000
TASK = ed56dba0[4730] 'ifconfig' THREAD: ee2c2000
GPR00:  ee2c3d10 ed56dba0  2e6a2e2a 05f2 0002 
GPR08: ef3d8da0 ee6a3428 0800 f04d  10087a48 bfd6bb1c 10064ae4
GPR16: 10064bc0 bfd6bb0c  bfd6baf4 0228  8914 c0199aa4
GPR24: c0199fe8 ed70f4b0 0059 ed70f340 ef063120  0001 ee75e818
NIP [c0061c64] put_page+0x0/0x34
LR [c02079a8] skb_release_data+0x78/0xc8
Call Trace:
[ee2c3d20] [c02075dc] __kfree_skb+0x18/0xbc
[ee2c3d30] [c0194d50] e1000_clean_rx_ring+0x10c/0x1a4
[ee2c3d60] [c0194e10] e1000_clean_all_rx_rings+0x28/0x54
[ee2c3d70] [c019835c] e1000_close+0x30/0xb4
[ee2c3d90] [c02118d8] __dev_close_many+0xa0/0xe0
[ee2c3da0] [c0213670] __dev_close+0x2c/0x4c
[ee2c3dc0] [c020ff28] __dev_change_flags+0xb8/0x140
[ee2c3de0] [c02117f4] dev_change_flags+0x1c/0x60
[ee2c3e00] [c02669b4] devinet_ioctl+0x2a4/0x700
[ee2c3e60] [c02677bc] inet_ioctl+0xc8/0xfc
[ee2c3e70] [c01ffba4] sock_ioctl+0x260/0x2a0
[ee2c3e90] [c0090a80] vfs_ioctl+0x2c/0x58
[ee2c3ea0] [c00911ec] do_vfs_ioctl+0x610/0x698
[ee2c3f10] [c00912cc] sys_ioctl+0x58/0x88
[ee2c3f40] [c000e674] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xff35a3c
LR = 0xff359a0
Instruction dump:
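
A side note on the two "redzone" lines in the log above. With SLAB debugging, each object is bracketed by 8-byte redzone words that should hold one of two magic values from include/linux/poison.h: RED_INACTIVE (0x09f911029d74e35b) or RED_ACTIVE (0xd84156c5635688c0). The Python sketch below shows what the damaged words contain. The helper name describe_redzone is hypothetical, and big-endian byte order is assumed because this is a PowerPC system.

```python
RED_INACTIVE = 0x09F911029D74E35B  # redzone magic while the object is free
RED_ACTIVE = 0xD84156C5635688C0    # redzone magic while the object is in use

def describe_redzone(value):
    """Classify a 64-bit redzone word; show damaged ones as ASCII."""
    if value == RED_INACTIVE:
        return "intact (free)"
    if value == RED_ACTIVE:
        return "intact (in use)"
    raw = value.to_bytes(8, "big")  # assumption: big-endian memory layout
    text = "".join(chr(b) if 32 <= b < 127 else "." for b in raw)
    return "overwritten: " + text

# The four redzone values reported in the log above.
for word in (0x300574f524b4752, 0xd84156c5635688c0,
             0xd84156c5635688c0, 0x534c4f545c42524f):
    print("0x%016x  %s" % (word, describe_redzone(word)))
```

The two damaged words decode to "..WORKGR" and "SLOT\BRO", that is, printable text rather than random garbage. That looks like payload from traffic on the wire landing on the redzones, but this is an inference from the byte values, not something the log itself states.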

Re: [E1000-devel] Memory Corruption with e1000

2013-06-06 Thread Ronciak, John

OK, so a couple of things stand out.  What interface is the e1000 on?
eth0?  That's not being called out, or you filtered it out from the dmesg.
Early on eth2 is the e1000 interface but later it's one of the Gianfar
interfaces.  Can you clear this up for us?

Also, it looks like you have a bonding configuration.  What interfaces are 
being bonded?  You also have a Gianfar NIC with 2 interfaces.  Is this still 
happening when no bonding is configured?  Does the problem occur when the 
Gianfar interfaces are down/inactive?  I'm just trying to narrow things down a 
bit.  I'd like this to be tried with just the e1000 driver being active to see 
if it's happening then.

Can you send the entire dmesg?  Is it too big to email?

Cheers,
John


Re: [E1000-devel] Memory Corruption with e1000

2013-06-06 Thread Peter LaDow

On Thu, Jun 6, 2013 at 1:10 PM, Ronciak, John john.ronc...@intel.com wrote:
 OK, so a couple of things stand out.  What interface is the e1000 on?
 eth0?  That's not being called out, or you filtered it out from the dmesg.
 Early on eth2 is the e1000 interface but later it's one of the Gianfar
 interfaces.  Can you clear this up for us?

The interfaces do get renamed early in the boot process.  We use
ifrename to force the e1000 interface to eth2.  The gianfar interfaces
are on eth0 and eth1.

 Also, it looks like you have a bonding configuration.  What interfaces are 
 being bonded?  You also have a Gianfar NIC with 2 interfaces.  Is this still 
 happening when no bonding is configured?  Does the problem occur when the 
 Gianfar interfaces are down/inactive?  I'm just trying to narrow things down 
 a bit.  I'd like this to be tried with just the e1000 driver being active to 
 see if it's happening then.

Currently, there is no bonding configured at all.  While we do allow
bonding, there are currently no bonded interfaces.

I tried the up/down loop with the gianfar devices, and I do not get
the failure.  They are connected to the same network, and no problem.

I shut down the gianfar adapters (eth0 and eth1) and re-ran the up/down
loop.  I still get the same panic.

 Can you send the entire dmesg?  Is it too big to email?

That was the entire dmesg output.

Thanks,
Pete

Re: [E1000-devel] Memory Corruption with e1000

2013-06-06 Thread Ronciak, John

Hi Peter,

We have some ideas and are working on a patch for you to try.  Since we won't
really be able to test it, can you do that if we get it to you?  Do you know how
to patch a driver?  Or should we send you the whole thing (a complete new
driver like you would get off of our SF site)?

Cheers,
John


[E1000-devel] Memory Corruption with e1000

2013-06-05 Thread Peter LaDow

We are running a PPC system with an 82540EP that is causing kernel
panics when there is heavy traffic and the interface is brought up
and/or down (we aren't sure which yet).

We are running 3.0.57-rt82, but we can re-create this issue reliably
with 3.0.80 and 3.0.80-rt108 using the e1000 driver version included in
the kernel (which is 7.3.21-k8-NAPI).  However, I've also tried 8.0.35,
and get the same failure.

We've narrowed it down to this case and can reliably re-create the
issue with a tight loop, such as:

while :
do
  ip link set eth2 up
  sleep 10
  ip link set eth2 down
  sleep 10
done
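
For anyone who wants to reproduce this with markers in the output, here is a rough Python equivalent of the loop above. It is a sketch under assumptions: the function names are invented for illustration, the "Bringing ... up..." and "Tearing ... down..." strings are just example markers, and the run, log and sleep parameters exist only so the loop can be dry-run without touching a real interface.

```python
import subprocess
import time

def toggle_steps(iface):
    """Return (marker message, command) pairs for one up/down cycle."""
    return [
        ("Bringing %s up..." % iface, ["ip", "link", "set", iface, "up"]),
        ("Tearing %s down..." % iface, ["ip", "link", "set", iface, "down"]),
    ]

def run_loop(iface, cycles, delay=10, run=subprocess.check_call,
             log=print, sleep=time.sleep):
    """Toggle iface up and down `cycles` times, logging before each change."""
    for _ in range(cycles):
        for message, command in toggle_steps(iface):
            log(message)   # marker first, so any oops shows up after it
            run(command)   # needs root when run is subprocess.check_call
            sleep(delay)   # same 10 second pause as the shell loop
```

Calling run_loop("eth2", cycles=1000) as root mirrors the shell loop; passing a stub for run lets you check the command sequence without root or a real eth2.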

I'm not sure where to look and any help would be appreciated.

In this loop we can reliably generate a kernel panic such as:

Unable to handle kernel paging request for data at address 0x20454a46
Faulting instruction address: 0xc0069924
Oops: Kernel access of bad area, sig: 11 [#1]
PREEMPT PPC Platform
Modules linked in:
NIP: c0069924 LR: c021cce0 CTR: c000cecc
REGS: ed4f1c60 TRAP: 0300   Not tainted  (3.0.80-rt108)
MSR: 9032 EE,ME,IR,DR  CR: 24008248  XER: 
DAR: 20454a46, DSISR: 2000
TASK = eda46780[3106] 'ifconfig' THREAD: ed4f
GPR00:  ed4f1d10 eda46780 20454a46 2d6fcc2a 05f2 0002 
GPR08: eda46780 ed6fd228 ed4f1cd0 90b1  10084718 bfcceaec 10062044
GPR16: 10062120 bfcceadc  bfcceac4 0228  8914 c01ac398
GPR24: c01ac8c8 ed066520 0061 ed0663a0 ef0448f0  0001 ed575580
NIP [c0069924] put_page+0x0/0x34
LR [c021cce0] skb_release_data+0x78/0xc8
Call Trace:
[ed4f1d20] [c021c914] __kfree_skb+0x18/0xbc
[ed4f1d30] [c01a7620] e1000_clean_rx_ring+0x10c/0x1a4
[ed4f1d60] [c01a76e0] e1000_clean_all_rx_rings+0x28/0x54
[ed4f1d70] [c01aac50] e1000_close+0x30/0xb4
[ed4f1d90] [c0226e2c] __dev_close_many+0xa0/0xe0
[ed4f1da0] [c0228c64] __dev_close+0x2c/0x4c
[ed4f1dc0] [c0225224] __dev_change_flags+0xb8/0x140
[ed4f1de0] [c0226d48] dev_change_flags+0x1c/0x60
[ed4f1e00] [c027e7f8] devinet_ioctl+0x2a4/0x700
[ed4f1e60] [c027f450] inet_ioctl+0xc8/0xfc
[ed4f1e70] [c02147b0] sock_ioctl+0x260/0x2a0
[ed4f1e90] [c009b468] vfs_ioctl+0x2c/0x58
[ed4f1ea0] [c009bc44] do_vfs_ioctl+0x64c/0x6d4
[ed4f1f10] [c009bd24] sys_ioctl+0x58/0x88
[ed4f1f40] [c000e954] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xff35a3c
LR = 0xff359a0
Instruction dump:
7c0802a6 3c80c007 3884a500 90010024 38a10008 3800 90010008 4b0d
80010024 38210020 7c0803a6 4e800020
8003 7c691b78 700bc000 41a20008
Kernel panic - not syncing: Fatal exception
Call Trace:
[ed4f1b90] [c0007ccc] show_stack+0x58/0x154 (unreliable)
[ed4f1bd0] [c001d744] panic+0xb0/0x1d8
[ed4f1c20] [c000b4b8] die+0x1ac/0x1d0
[ed4f1c40] [c0011e38] bad_page_fault+0xe8/0xfc
[ed4f1c50] [c000edf4] handle_page_fault+0x7c/0x80
--- Exception: 300 at put_page+0x0/0x34
LR = skb_release_data+0x78/0xc8
[ed4f1d10] []   (null) (unreliable)
[ed4f1d20] [c021c914] __kfree_skb+0x18/0xbc
[ed4f1d30] [c01a7620] e1000_clean_rx_ring+0x10c/0x1a4
[ed4f1d60] [c01a76e0] e1000_clean_all_rx_rings+0x28/0x54
[ed4f1d70] [c01aac50] e1000_close+0x30/0xb4
[ed4f1d90] [c0226e2c] __dev_close_many+0xa0/0xe0
[ed4f1da0] [c0228c64] __dev_close+0x2c/0x4c
[ed4f1dc0] [c0225224] __dev_change_flags+0xb8/0x140
[ed4f1de0] [c0226d48] dev_change_flags+0x1c/0x60
[ed4f1e00] [c027e7f8] devinet_ioctl+0x2a4/0x700
[ed4f1e60] [c027f450] inet_ioctl+0xc8/0xfc
[ed4f1e70] [c02147b0] sock_ioctl+0x260/0x2a0
[ed4f1e90] [c009b468] vfs_ioctl+0x2c/0x58
[ed4f1ea0] [c009bc44] do_vfs_ioctl+0x64c/0x6d4
[ed4f1f10] [c009bd24] sys_ioctl+0x58/0x88
[ed4f1f40] [c000e954] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xff35a3c
LR = 0xff359a0

When I turn on SLAB checks, I see:

Slab corruption: size-16384 start=ed4ec000, len=16384
690: 6b 6b ff ff ff ff ff ff b8 ac 6f 99 bf 8b 08 00
6a0: 45 00 00 24 3f 34 00 00 80 11 ca cf 0a ca 0d 33
6b0: 0a ca 0d ff 06 cc 06 cf 00 10 bc 1d c5 0b 40 01
6c0: 00 10 00 33 00 00 00 00 00 00 00 00 00 00 3f dd
6d0: ed f8 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
ea0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b ff ff ff ff ff ff
Slab corruption: size-2048 start=ed4e6570, len=2048
Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
Last user: [  (null)](0x0)
0c0: 6b 6b ff ff ff ff ff ff 5c 26 0a 41 81 27 08 00
0d0: 45 00 00 4e 7d 44 00 00 80 11 8c 79 0a ca 0d 4f
0e0: 0a ca 0d ff 00 89 00 89 00 3a b5 a7 be 71 01 10
0f0: 00 01 00 00 00 00 00 00 20 45 4c 45 43 45 50 46
100: 49 43 41 43 41 43 41 43 41 43 41 43 41 43 41 43
110: 41 43 41 43 41 43 41 41 41 00 00 20 00 01 02 5a
Next obj: start=ed4e6d88, len=2048
Redzone: 0xd84156c5635688c0/0xd84156c5635688c0.
Last user: [c021e294](__netdev_alloc_skb+0x28/0x60)
000: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
010: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
Slab corruption: size-2048 start=ed54eb48, len=2048
Redzone: 0x9f911029d74e35b/0x9f911029d74e35b.
Last user: [c021cd1c](skb_release_data+0xb4/0xc8)
020: 6b 6b ff ff ff ff ff ff 18 03 73 e4 64 18 08 00
030: 45 00 00 44 61 b8 00 00 80 11 a7 c9 0a ca 0d 95
040: 0a ca
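
The non-poison bytes in these dumps look like network frames. Below is a minimal Python sketch that decodes three rows of the size-16384 dump. It assumes that 0x6b is the slab allocator's free-poison value (POISON_FREE in include/linux/poison.h) and that the bytes following the intact poison form an Ethernet/IPv4/UDP frame. The helper names are made up for illustration, and the interpretation is one reading of the dump, not something stated in the thread.

```python
import ipaddress
import struct

POISON_FREE = 0x6B  # slab poison value written into freed objects

# Three rows copied verbatim from the size-16384 dump.
DUMP = """\
690: 6b 6b ff ff ff ff ff ff b8 ac 6f 99 bf 8b 08 00
6a0: 45 00 00 24 3f 34 00 00 80 11 ca cf 0a ca 0d 33
6b0: 0a ca 0d ff 06 cc 06 cf 00 10 bc 1d c5 0b 40 01
"""


def parse_dump(text):
    """Return (offset of the first row, bytes of all rows concatenated)."""
    rows = [line.split(":") for line in text.splitlines() if line.strip()]
    start = int(rows[0][0], 16)
    data = bytes(int(b, 16) for _, row in rows for b in row.split())
    return start, data


def mac(raw):
    """Format six raw bytes as a colon-separated MAC address."""
    return ":".join("%02x" % b for b in raw)


def ip_checksum_ok(header):
    """A valid IPv4 header sums to 0xFFFF in 16-bit one's-complement arithmetic."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total > 0xFFFF:
        total = (total & 0xFFFF) + (total >> 16)
    return total == 0xFFFF


def decode_frame(text):
    """Decode the Ethernet, IPv4 and UDP headers that follow the poison bytes."""
    start, data = parse_dump(text)
    skip = 0
    # Skip the poison bytes that are still intact in front of the overwrite.
    while skip < len(data) and data[skip] == POISON_FREE:
        skip += 1
    frame = data[skip:]
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    ip = frame[14:34]  # 20-byte IPv4 header (IHL = 5 in this dump)
    sport, dport, udp_len = struct.unpack("!HHH", frame[34:40])
    return {
        "offset": start + skip,
        "dst_mac": mac(dst),
        "src_mac": mac(src),
        "ethertype": ethertype,
        "protocol": ip[9],
        "src_ip": str(ipaddress.IPv4Address(ip[12:16])),
        "dst_ip": str(ipaddress.IPv4Address(ip[16:20])),
        "src_port": sport,
        "dst_port": dport,
        "udp_len": udp_len,
        "checksum_ok": ip_checksum_ok(ip),
    }


if __name__ == "__main__":
    for key, value in decode_frame(DUMP).items():
        print(key, value)
```

Under those assumptions the overwrite starts at offset 0x692 and holds a UDP broadcast from 10.202.13.51 to 10.202.13.255 with a valid IPv4 header checksum. That would be consistent with received network traffic being written into the object after it was freed.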

Re: [E1000-devel] Memory Corruption with e1000

2013-06-05 Thread Peter LaDow

On Wed, Jun 5, 2013 at 3:01 PM, Peter LaDow pet...@gocougs.wsu.edu wrote:
 After some more digging, I'm wondering if this is indeed a timing
 issue.  Is there a problem with bringing up an interface too soon
 after taking it down?  If I change my loop to use a 30 second delay
 between interface bringup/teardown, I don't get the panic.

Scratch that.  A 30-second delay didn't eliminate the problem; it
only delayed it.  I eventually got a similar failure.  I then increased
the delay further and got another, slightly different failure:

------------[ cut here ]------------
WARNING: at include/linux/skbuff.h:1468
Modules linked in:
NIP: c0219bf8 LR: c01abaec CTR: c01aba74
REGS: ed6dbcf0 TRAP: 0700   Not tainted  (3.0.57-rt82)
MSR: 00029032 EE,ME,CE,IR,DR  CR: 42048044  XER: 
TASK = ed7afb60[3120] 'irq/20-eth2' THREAD: ed6da000
GPR00: 0001 ed6dbda0 ed7afb60 ed6c7800  0001 ed05e000 3b9ac9ff
GPR08: ed7afb60 c035 ed6dbce0 c03d2748 42048044 1001aa90 ed6dbe78 c0352c54
GPR16: c03f ed05e520 ed05e000 05f4  ef047000 ef047060 05f2
GPR24: ef078320 ef078320 00ba 00bc f3241740 ed05e3a0 ed6c7800 0001
NIP [c0219bf8] skb_trim+0x18/0x34
LR [c01abaec] e1000_alloc_rx_buffers+0x78/0x374
Call Trace:
[ed6dbda0] [ef078320] 0xef078320 (unreliable)
[ed6dbdf0] [c01ab714] e1000_clean_rx_irq+0x35c/0x3ac
[ed6dbe60] [c01ac2cc] e1000_clean+0x340/0x4ec
[ed6dbec0] [c022799c] net_rx_action+0xc4/0x208
[ed6dbef0] [c0023410] __do_softirq_common+0xa4/0x13c
[ed6dbf30] [c0023adc] local_bh_enable+0x88/0xe8
[ed6dbf50] [c0059acc] irq_forced_thread_fn+0x5c/0x74
[ed6dbf70] [c005a954] irq_thread+0xe4/0x1ec
[ed6dbfa0] [c0038ce4] kthread+0x78/0x7c
[ed6dbff0] [c000d608] kernel_thread+0x4c/0x68
Instruction dump:
7c1d492e 80010024 bba10014 38210020 7c0803a6 4e800020 8003004c 7f802040
4c9d0020 80030050 2f80 41be000c 0fe0 4e800020 800300a4 9083004c
---[ end trace 0002 ]---

This was again followed by a bad paging request.

I'm still at a loss as to what is causing this corruption.

Pete
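
The up/down loop described in this message could be sketched roughly as follows. The actual script is not shown in the thread, so everything here is an assumption: the interface name eth2 is taken from the 'irq/20-eth2' task in the trace, the use of ifconfig is inferred from the ioctl path (sys_ioctl, devinet_ioctl, dev_change_flags) in the earlier call trace, and the function names are hypothetical.

```python
import subprocess
import time


def link_commands(iface, cycles):
    """Build the down/up command sequence for the given number of cycles."""
    cmds = []
    for _ in range(cycles):
        cmds.append(["ifconfig", iface, "down"])
        cmds.append(["ifconfig", iface, "up"])
    return cmds


def cycle_link(iface, cycles, delay_s, run=subprocess.check_call, sleep=time.sleep):
    """Run each transition, then wait delay_s seconds before the next one.

    run and sleep are injectable so the sequence can be checked without
    touching a real interface.
    """
    for cmd in link_commands(iface, cycles):
        run(cmd)
        sleep(delay_s)
```

Calling cycle_link("eth2", cycles, 30) as root, with a cycle count of your choosing, on a box attached to a busy network would approximate the 30-second variant of the test. Passing stand-ins for run and sleep gives a dry run.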


Re: [E1000-devel] Memory Corruption with e1000

2013-06-05 Thread Peter LaDow

On 6/5/13, Ronciak, John john.ronc...@intel.com wrote:
 So I have a couple of questions.  Does this happen with a non-preemptive
 kernel?  I understand that you probably need to use a preemptive kernel but
 for testing purposes it would be good to know.  We don't always test with
 preemptive kernels.

Hmmm... If you mean no RT patches, then yes. On a vanilla 3.0.80 kernel.

 When doing the up/down transitions is there system under test?  I mean
 sending and receiving packets?  If it is what is the load like?  Does
 changing the load make a difference?  Does stopping the network traffic
 first make a difference in the outcome?

Yes, the load makes a difference. On a silent network (or no link at
all) this does not occur. Our network is quite busy. It isn't sending
much (perhaps DHCP discovers and some IPv6 stuff).

Thanks,
Pete


Re: [E1000-devel] Memory Corruption with e1000

2013-06-05 Thread Peter LaDow

Quick follow-up: by "not sending much" I meant the adapter, not
the network. The network is very busy, but there is hardly any
outgoing traffic from the box.

On 6/5/13, Peter LaDow pet...@gocougs.wsu.edu wrote:
 On 6/5/13, Ronciak, John john.ronc...@intel.com wrote:
 So I have a couple of questions.  Does this happen with a non-preemptive
 kernel?  I understand that you probably need to use a preemptive kernel but
 for testing purposes it would be good to know.  We don't always test with
 preemptive kernels.

 Hmmm... If you mean no RT patches, then yes. On a vanilla 3.0.80 kernel.

 When doing the up/down transitions is there system under test?  I mean
 sending and receiving packets?  If it is what is the load like?  Does
 changing the load make a difference?  Does stopping the network traffic
 first make a difference in the outcome?

 Yes, the load makes a difference. On a silent network (or no link at
 all) this does not occur. Our network is quite busy. It isn't sending
 much (perhaps DHCP discovers and some IPv6 stuff).

 Thanks,
 Pete


