On Wed, 2018-10-17 at 14:08 +0200, Daniel Borkmann wrote:
> Hi Dan,
>
> On 10/17/2018 03:08 AM, Dan Siemon wrote:
> > Anyone seen this before? 4.17.6
> >
> > I've seen this on a few boxes now.
> >
> > (If there is a better place to post this (LKML?) please let me
> > know.)
> >
> > Oct 08 21:21:02 kernel: BUG: unable to handle kernel paging
> > request at
> > ffffc48f1f42d500
> > Oct 08 21:21:02 kernel: PGD 0 P4D 0
> > Oct 08 21:21:02 kernel: Oops: 0000 [#1] SMP PTI
> > Oct 08 21:21:02 kernel: Modules linked in: sch_htb bpwd_drv(OE)
> > lanner_drv(OE) cls_bpf sch_ingress ip6t_rpfilter ip6t_REJECT
> > nf_reject_ipv6 xt_conntrack ip_set nf>
> > Oct 08 21:21:02 kernel: crc32c_intel igb i2c_algo_bit dca
> > Oct 08 21:21:02 kernel: CPU: 5 PID: 27247 Comm: preseem-netdev-
> > Tainted: G OE 4.17.6-200.fc28.x86_64 #1
> > Oct 08 21:21:02 kernel: Hardware name: Default string Default
> > string/SKYBAY, BIOS 5.12 08/15/2017
> > Oct 08 21:21:02 kernel: RIP: 0010:memcpy_erms+0x6/0x10
> > Oct 08 21:21:02 kernel: RSP: 0018:ffffbc8ec1f37e40 EFLAGS:
> > 00010286
> > Oct 08 21:21:02 kernel: RAX: ffff9eca25ae0ab0 RBX:
> > ffff9eca25ae0ab0
> > RCX: 0000000000000010
> > Oct 08 21:21:02 kernel: RDX: 0000000000000010 RSI:
> > ffffc48f1f42d500
> > RDI: ffff9eca25ae0ab0
> > Oct 08 21:21:02 kernel: RBP: 0000000000000010 R08:
> > 0000000000400000
> > R09: 0000000000000001
> > Oct 08 21:21:02 kernel: R10: 000000002e523fcb R11:
> > 0000000000000000
> > R12: ffff9ec9eb450600
> > Oct 08 21:21:02 kernel: R13: 0000000000141ca6 R14:
> > ffff9eca464a3700
> > R15: ffff9eca25ae0ed0
> > Oct 08 21:21:02 kernel: FS: 00007f3262ffd700(0000)
> > GS:ffff9eca9ed40000(0000) knlGS:0000000000000000
> > Oct 08 21:21:02 kernel: CS: 0010 DS: 0000 ES: 0000 CR0:
> > 0000000080050033
> > Oct 08 21:21:02 kernel: CR2: ffffc48f1f42d500 CR3:
> > 0000000422b82003
> > CR4: 00000000003606e0
> > Oct 08 21:21:02 kernel: DR0: 0000000000000000 DR1:
> > 0000000000000000
> > DR2: 0000000000000000
> > Oct 08 21:21:02 kernel: DR3: 0000000000000000 DR6:
> > 00000000fffe0ff0
> > DR7: 0000000000000400
> > Oct 08 21:21:02 kernel: Call Trace:
> > Oct 08 21:21:02 kernel: htab_map_get_next_key+0x237/0x280
> > Oct 08 21:21:02 kernel: map_get_next_key+0xed/0x1f0
> > Oct 08 21:21:02 kernel: __x64_sys_bpf+0x16b/0x260
> > Oct 08 21:21:02 kernel: do_syscall_64+0x5b/0x160
> > Oct 08 21:21:02 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > Oct 08 21:21:02 kernel: RIP: 0033:0x47fa54
> > Oct 08 21:21:02 kernel: RSP: 002b:000000c428409278 EFLAGS:
> > 00000246
> > ORIG_RAX: 0000000000000141
> > Oct 08 21:21:02 kernel: RAX: ffffffffffffffda RBX:
> > 0000000000000000
> > RCX: 000000000047fa54
> > Oct 08 21:21:02 kernel: RDX: 0000000000000020 RSI:
> > 000000c428409300
> > RDI: 0000000000000004
> > Oct 08 21:21:02 kernel: RBP: 000000c4284093a8 R08:
> > 0000000000000000
> > R09: 0000000000000000
> > Oct 08 21:21:02 kernel: R10: 0000000000000000 R11:
> > 0000000000000246
> > R12: 0000000000001e00
> > Oct 08 21:21:02 kernel: R13: 0000000000000000 R14:
> > 000000000045d560
> > R15: 0000000000000000
> > Oct 08 21:21:02 kernel: Code: 90 90 90 90 90 eb 1e 0f 1f 00 48 89
> > f8
> > 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44
> > 00 00
> > 48 89 f8 48 89 d1 <f3>
> > Oct 08 21:21:02 kernel: RIP: memcpy_erms+0x6/0x10 RSP:
> > ffffbc8ec1f37e40
> > Oct 08 21:21:02 kernel: CR2: ffffc48f1f42d500
> > Oct 08 21:21:02 kernel: ---[ end trace ebebe46b9c6091a1 ]---
> > -- Reboot --
>
> Thanks a lot for the report! Do you happen to have a reproducer? Are
> you
> running into this with concurrent i) updates, ii) deletes, iii) both
> or
> iv) while destroying the whole htab? If i-iii) is it out of BPF
> datapath
> or from syscall path? Is your hashtab preallocated? Any more
> elaborate info
> would help.
Sadly I don't have a reproducer other than I've seen it happen on
several distinct boxes and deployments.
We use several hash maps and I don't know how to tell from the OOPs
which one caused this but the two largest (100-600k entries) work like:
- eBPF program creates entries as it sees flows on the network and
counts metrics (bytes, packets, etc) into each entry and updates a
timestamp on each update
- User space iterates over the map every 10s and pulls out the metrics
for further processing. If the timestamp on the entry is over a
threshold, the user space process deletes the entry (kernel side never
deletes)
bpftool map output:
22: hash flags 0x0
key 12B value 8B max_entries 1213 memlock 122880B
23: hash flags 0x0
key 16B value 256B max_entries 2102717 memlock 739983360B
24: hash flags 0x0
key 12B value 2B max_entries 2102717 memlock 218505216B
25: hash flags 0x0
key 20B value 48B max_entries 2102717 memlock 319438848B
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#1499): https://lists.iovisor.org/g/iovisor-dev/message/1499
Mute This Topic: https://lists.iovisor.org/mt/27375713/21656
Group Owner: [email protected]
Unsubscribe: https://lists.iovisor.org/g/iovisor-dev/unsub
[[email protected]]
-=-=-=-=-=-=-=-=-=-=-=-