On 06/26/2017 04:56 AM, 王志克 wrote:
Hi Joe,
I will try to figure out how to send the patch, maybe tomorrow, since I am quite
busy now.
Regarding the crash, I can reproduce it even with official OVS, e.g. OVS 2.6.0
(I just run check-kmod in a loop until the kernel panics), so it is not related
to the new fix.
Br,
Wang Zhike
I've been running 'make check-kmod' in a continuous loop on 3 virtual machines
since this morning. So far no kernel splats, but plenty of errors:
This is on the Ubuntu machine running a 4.0 kernel:
ERROR: 66 tests were run,
24 failed unexpectedly.
23 tests were skipped.
## -------------------------------------- ##
## system-kmod-testsuite.log was created. ##
## -------------------------------------- ##
Please send `tests/system-kmod-testsuite.log' and all information you think
might help:
To: <[email protected]>
Subject: [openvswitch 2.7.90] system-kmod-testsuite: 16 17 35 57 58 59 60
61 62 63 70 71 72 75 76 81 82 83 84 85 86 87 88 89 failed
CentOS 7.2 running a 4.9.24 kernel:
## ------------- ##
## Test results. ##
## ------------- ##
ERROR: 76 tests were run,
34 failed unexpectedly.
13 tests were skipped.
## -------------------------------------- ##
## system-kmod-testsuite.log was created. ##
## -------------------------------------- ##
Please send `tests/system-kmod-testsuite.log' and all information you think
might help:
To: <[email protected]>
Subject: [openvswitch 2.7.90] system-kmod-testsuite: 2 14 15 20 21 22 23
24 25 26 27 28 29 30 31 32 47 48 49 50 51 57 59 60 61 62 70 71 75 76 84 85 86
87 failed
CentOS 7.2 running a 4.10.17 kernel:
## ------------- ##
## Test results. ##
## ------------- ##
ERROR: 74 tests were run,
34 failed unexpectedly.
15 tests were skipped.
## -------------------------------------- ##
## system-kmod-testsuite.log was created. ##
## -------------------------------------- ##
Please send `tests/system-kmod-testsuite.log' and all information you think
might help:
To: <[email protected]>
Subject: [openvswitch 2.7.90] system-kmod-testsuite: 2 14 15 20 21 22 23
24 25 26 27 28 29 30 31 32 47 48 49 50 51 57 59 60 61 62 70 71 75 76 84 85 86
87 failed
I confess to not spending a lot of time running check-kmod. I certainly intend
to in the future.
- Greg
-----Original Message-----
From: Joe Stringer [mailto:[email protected]]
Sent: June 24, 2017 5:15
To: 王志克
Cc: [email protected]
Subject: Re: Re: [ovs-dev] [PATCH] pkt reassemble: fix kernel panic for ovs
reassemble
Hi Wang Zhike,
I'd like it if others such as Greg could take a look as well, since this code is
delicate; the more review it gets, the better. It seems that the version of your
email that goes to the list does not include the attachment. Perhaps you could
try sending the patch using git send-email, or put the patch on GitHub and link
to it here.
For what it's worth, I did run your patch for a while and it seemed OK, but
when I tried again today on an Ubuntu Trusty (Linux
3.13.0-119-generic) box, running make check-kmod, I saw an issue with
get_next_timer_interrupt():
[181250.892557] BUG: unable to handle kernel paging request at ffffffffa03317e0
[181250.892557] IP: [<ffffffff81079606>] get_next_timer_interrupt+0x86/0x250
[181250.892557] PGD 1c11067 PUD 1c12063 PMD 1381a2067 PTE 0
[181250.892557] Oops: 0000 [#1] SMP
[181250.892557] Modules linked in: nf_nat_ipv6 nf_nat_ipv4 nf_nat gre(-) nf_conntrack_ipv6 nf_conntrack_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_netlink nfnetlink nf_conntrack bonding 8021q garp stp mrp llc veth nfsd auth_rpcgss nfs_acl nfs lockd sunrpc fscache dm_crypt kvm_intel kvm serio_raw netconsole configfs crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse floppy ahci libahci [last unloaded: libcrc32c]
[181250.892557] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G OX 3.13.0-119-generic #166-Ubuntu
[181250.892557] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[181250.892557] task: ffffffff81c15480 ti: ffffffff81c00000 task.ti: ffffffff81c00000
[181250.892557] RIP: 0010:[<ffffffff81079606>] [<ffffffff81079606>] get_next_timer_interrupt+0x86/0x250
[181250.892557] RSP: 0018:ffffffff81c01e00 EFLAGS: 00010002
[181250.892557] RAX: ffffffffa03317c8 RBX: 0000000102b245da RCX: 00000000000000db
[181250.892557] RDX: ffffffff81ebac58 RSI: 00000000000000db RDI: 0000000102b245db
[181250.892557] RBP: ffffffff81c01e48 R08: 0000000000c88c1c R09: 0000000000000000
[181250.892557] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000142b245d9
[181250.892557] R13: ffffffff81eb9e80 R14: 0000000102b245da R15: 0000000000cd63e8
[181250.892557] FS:  0000000000000000(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000
[181250.892557] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[181250.892557] CR2: ffffffffa03317e0 CR3: 000000003707f000 CR4: 00000000000006f0
[181250.892557] Stack:
[181250.892557]  0000000000000000 ffffffff81c01e30 ffffffff810a3af5 ffff88013fc13bc0
[181250.892557]  ffff88013fc0dce0 0000000102b245da 0000000000000000 00000063ae154000
[181250.892557]  0000000000cd63e8 ffffffff81c01ea8 ffffffff810da655 0000a4d8c2cb6200
[181250.892557] Call Trace:
[181250.892557]  [<ffffffff810a3af5>] ? set_next_entity+0x95/0xb0
[181250.892557]  [<ffffffff810da655>] tick_nohz_stop_sched_tick+0x1e5/0x340
[181250.892557]  [<ffffffff810da851>] __tick_nohz_idle_enter+0xa1/0x160
[181250.892557]  [<ffffffff810dab4d>] tick_nohz_idle_enter+0x3d/0x70
[181250.892557]  [<ffffffff810c2af7>] cpu_startup_entry+0x87/0x2b0
[181250.892557]  [<ffffffff8171b387>] rest_init+0x77/0x80
[181250.892557]  [<ffffffff81d34f6a>] start_kernel+0x432/0x43d
[181250.892557]  [<ffffffff81d34941>] ? repair_env_string+0x5c/0x5c
[181250.892557]  [<ffffffff81d34120>] ? early_idt_handler_array+0x120/0x120
[181250.892557]  [<ffffffff81d345ee>] x86_64_start_reservations+0x2a/0x2c
[181250.892557]  [<ffffffff81d34733>] x86_64_start_kernel+0x143/0x152
[181250.892557] Code: 8b 7d 10 4d 8b 75 18 4c 39 f7 78 5c 40 0f b6 cf 89 ce 48 63 c6 48 c1 e0 04 49 8d 54 05 00 48 8b 42 28 48 83 c2 28 48 39 d0 74 0e <f6> 40 18 01 74 24 48 8b 00 48 39 d0 75 f2 83 c6 01 40 0f b6 f6
[181250.892557] RIP [<ffffffff81079606>] get_next_timer_interrupt+0x86/0x250
[181250.892557]  RSP <ffffffff81c01e00>
[181250.892557] CR2: ffffffffa03317e0
It seems that a fragment timer registered by OVS may still be pending when the
OVS module is unloaded, so when it fires it attempts to clean up an entry using
OVS code that has already been unloaded. This might be related to the IPv6
cvlan test - that seems to be where my VM froze and went to 100% CPU - but I
would think that the IPv6 fragmentation cleanup test is more likely to cause
this, since it leaves fragments behind in the cache after the test finishes.
I've only hit this when running all of the tests in make check-kmod.
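To make that suspicion concrete, here is a standalone sketch of the failure mode
(made-up names, not the actual OVS compat code, and using the pre-4.15 timer API
to match the 3.13 kernel in the trace above): if a timer whose callback lives in
a module is still armed when that module is unloaded, the next time it fires the
CPU jumps into freed module text, so the exit path has to del_timer_sync() every
such timer before the module goes away.

/*
 * Hypothetical sketch only: a module that arms a timer whose callback
 * lives in its own text, and therefore must purge every pending timer
 * in module_exit() before that text is freed.
 */
#include <linux/module.h>
#include <linux/timer.h>
#include <linux/slab.h>
#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/jiffies.h>

struct demo_frag_queue {
	struct list_head list;
	struct timer_list timer;	/* callback points into this module */
};

static LIST_HEAD(demo_queues);
static DEFINE_SPINLOCK(demo_lock);

/* Old-style (pre-4.15) timer callback; its code lives in module text. */
static void demo_frag_expire(unsigned long data)
{
	struct demo_frag_queue *q = (struct demo_frag_queue *)data;

	pr_debug("demo frag queue %p expired\n", q);
}

static int __init demo_init(void)
{
	struct demo_frag_queue *q = kzalloc(sizeof(*q), GFP_KERNEL);

	if (!q)
		return -ENOMEM;

	setup_timer(&q->timer, demo_frag_expire, (unsigned long)q);
	spin_lock_bh(&demo_lock);
	list_add(&q->list, &demo_queues);
	spin_unlock_bh(&demo_lock);
	/* Arm a long timeout, like a reassembly queue waiting for fragments. */
	mod_timer(&q->timer, jiffies + 60 * HZ);
	return 0;
}

static void __exit demo_exit(void)
{
	struct demo_frag_queue *q;

	spin_lock_bh(&demo_lock);
	while (!list_empty(&demo_queues)) {
		q = list_first_entry(&demo_queues, struct demo_frag_queue,
				     list);
		list_del(&q->list);
		spin_unlock_bh(&demo_lock);
		/* Waits for a concurrently running callback to finish, too. */
		del_timer_sync(&q->timer);
		kfree(q);
		spin_lock_bh(&demo_lock);
	}
	spin_unlock_bh(&demo_lock);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");

If a queue were left out of that exit loop, its timer could fire after rmmod and
produce exactly the kind of paging fault at a module address seen above.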
Cheers,
Joe
On 22 June 2017 at 17:53, 王志克 <[email protected]> wrote:
> Hi Joe,
>
> Please check the attachment. Thanks.
>
> Br,
> Wang Zhike
>
> -----Original Message-----
> From: Joe Stringer [mailto:[email protected]]
> Sent: June 23, 2017 8:20
> To: 王志克
> Cc: [email protected]
> Subject: Re: [ovs-dev] [PATCH] pkt reassemble: fix kernel panic for ovs
> reassemble
>
> On 21 June 2017 at 18:54, 王志克 <[email protected]> wrote:
>> OVS and the kernel stack would add frag_queue entries to the same
>> netns_frags list. As a result, OVS and the kernel may access a
>> frag_queue without holding the correct lock. Also, struct ipq may have
>> a different layout on kernels older than 4.3, which leads to invalid
>> pointer access.
>>
>> The fix creates specific netns_frags for ovs.
>>
>> Signed-off-by: wangzhike <[email protected]>
>> ---
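For readers following along: the approach described in that commit message - a
netns_frags instance owned by the OVS module itself, so OVS frag queues never
land on the kernel's own per-netns lists - would look roughly like the sketch
below. The names and the pernet wiring are hypothetical, and the inet_frag
helpers the real code would call change signatures between kernel versions, so
treat it purely as an illustration of the structure, not as the actual patch.

/*
 * Hypothetical sketch only: a netns_frags private to the openvswitch
 * module, one instance per network namespace, kept apart from the
 * kernel's own net->ipv4.frags / net->ipv6.frags state.
 */
#include <linux/module.h>
#include <linux/jiffies.h>
#include <net/net_namespace.h>
#include <net/netns/generic.h>
#include <net/inet_frag.h>

/* Per-netns private state for the OVS module (hypothetical name). */
struct ovs_frag_net {
	struct netns_frags frags;	/* OVS-only reassembly queues */
};

/* Older kernels declare pernet ids as plain 'int'. */
static unsigned int ovs_frag_net_id;

static int __net_init ovs_frag_init_net(struct net *net)
{
	struct ovs_frag_net *fn = net_generic(net, ovs_frag_net_id);

	fn->frags.timeout = 30 * HZ;		/* like IP_FRAG_TIME */
	fn->frags.high_thresh = 4 * 1024 * 1024;
	fn->frags.low_thresh = 3 * 1024 * 1024;
	/*
	 * The real code would now call the kernel's inet_frag init helper
	 * (its name/signature varies by version) so that queues hashed
	 * into fn->frags never touch net->ipv4.frags.
	 */
	return 0;
}

static void __net_exit ovs_frag_exit_net(struct net *net)
{
	/*
	 * ... and here it would evict every remaining OVS queue,
	 * including their timers, before the namespace or the module
	 * goes away (compare the module-unload crash discussed above).
	 */
}

static struct pernet_operations ovs_frag_net_ops = {
	.init = ovs_frag_init_net,
	.exit = ovs_frag_exit_net,
	.id   = &ovs_frag_net_id,
	.size = sizeof(struct ovs_frag_net),
};

static int __init ovs_frag_demo_init(void)
{
	return register_pernet_subsys(&ovs_frag_net_ops);
}

static void __exit ovs_frag_demo_exit(void)
{
	unregister_pernet_subsys(&ovs_frag_net_ops);
}

module_init(ovs_frag_demo_init);
module_exit(ovs_frag_demo_exit);
MODULE_LICENSE("GPL");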
>
> Hi,
>
> It looks like the whitespace has been corrupted in this version of the patch
> that you sent, so I cannot apply it. Your email client probably mangles it
> when sending the email out. A reliable way to send patches correctly via
> email is to use the command-line client 'git send-email'; this is the
> preferred method. If you are unable to set that up, consider attaching the
> patch to the email (or sending a pull request on GitHub).
>
> Cheers,
> Joe
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev