Re: [iovisor-dev] Hashmap OOPS

Daniel Borkmann Wed, 17 Oct 2018 07:18:20 -0700

On 10/17/2018 03:48 PM, Dan Siemon wrote:
> On Wed, 2018-10-17 at 14:08 +0200, Daniel Borkmann wrote:
>> Hi Dan,
>>
>> On 10/17/2018 03:08 AM, Dan Siemon wrote:
>>> Anyone seen this before? 4.17.6
>>>
>>> I've seen this on a few boxes now.
>>>
>>> (If there is a better place to post this (LKML?) please let me
>>> know.)
>>>
>>> Oct 08 21:21:02  kernel: BUG: unable to handle kernel paging
>>> request at
>>> ffffc48f1f42d500
>>> Oct 08 21:21:02  kernel: PGD 0 P4D 0 
>>> Oct 08 21:21:02  kernel: Oops: 0000 [#1] SMP PTI
>>> Oct 08 21:21:02  kernel: Modules linked in: sch_htb bpwd_drv(OE)
>>> lanner_drv(OE) cls_bpf sch_ingress ip6t_rpfilter ip6t_REJECT
>>> nf_reject_ipv6 xt_conntrack ip_set nf>
>>> Oct 08 21:21:02  kernel:  crc32c_intel igb i2c_algo_bit dca
>>> Oct 08 21:21:02  kernel: CPU: 5 PID: 27247 Comm: preseem-netdev-
>>> Tainted: G           OE     4.17.6-200.fc28.x86_64 #1
>>> Oct 08 21:21:02  kernel: Hardware name: Default string Default
>>> string/SKYBAY, BIOS 5.12 08/15/2017
>>> Oct 08 21:21:02  kernel: RIP: 0010:memcpy_erms+0x6/0x10
>>> Oct 08 21:21:02  kernel: RSP: 0018:ffffbc8ec1f37e40 EFLAGS:
>>> 00010286
>>> Oct 08 21:21:02  kernel: RAX: ffff9eca25ae0ab0 RBX:
>>> ffff9eca25ae0ab0
>>> RCX: 0000000000000010
>>> Oct 08 21:21:02  kernel: RDX: 0000000000000010 RSI:
>>> ffffc48f1f42d500
>>> RDI: ffff9eca25ae0ab0
>>> Oct 08 21:21:02  kernel: RBP: 0000000000000010 R08:
>>> 0000000000400000
>>> R09: 0000000000000001
>>> Oct 08 21:21:02  kernel: R10: 000000002e523fcb R11:
>>> 0000000000000000
>>> R12: ffff9ec9eb450600
>>> Oct 08 21:21:02  kernel: R13: 0000000000141ca6 R14:
>>> ffff9eca464a3700
>>> R15: ffff9eca25ae0ed0
>>> Oct 08 21:21:02  kernel: FS:  00007f3262ffd700(0000)
>>> GS:ffff9eca9ed40000(0000) knlGS:0000000000000000
>>> Oct 08 21:21:02  kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
>>> 0000000080050033
>>> Oct 08 21:21:02  kernel: CR2: ffffc48f1f42d500 CR3:
>>> 0000000422b82003
>>> CR4: 00000000003606e0
>>> Oct 08 21:21:02  kernel: DR0: 0000000000000000 DR1:
>>> 0000000000000000
>>> DR2: 0000000000000000
>>> Oct 08 21:21:02  kernel: DR3: 0000000000000000 DR6:
>>> 00000000fffe0ff0
>>> DR7: 0000000000000400
>>> Oct 08 21:21:02  kernel: Call Trace:
>>> Oct 08 21:21:02  kernel:  htab_map_get_next_key+0x237/0x280
>>> Oct 08 21:21:02  kernel:  map_get_next_key+0xed/0x1f0
>>> Oct 08 21:21:02  kernel:  __x64_sys_bpf+0x16b/0x260
>>> Oct 08 21:21:02  kernel:  do_syscall_64+0x5b/0x160
>>> Oct 08 21:21:02  kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>> Oct 08 21:21:02  kernel: RIP: 0033:0x47fa54
>>> Oct 08 21:21:02  kernel: RSP: 002b:000000c428409278 EFLAGS:
>>> 00000246
>>> ORIG_RAX: 0000000000000141
>>> Oct 08 21:21:02  kernel: RAX: ffffffffffffffda RBX:
>>> 0000000000000000
>>> RCX: 000000000047fa54
>>> Oct 08 21:21:02  kernel: RDX: 0000000000000020 RSI:
>>> 000000c428409300
>>> RDI: 0000000000000004
>>> Oct 08 21:21:02  kernel: RBP: 000000c4284093a8 R08:
>>> 0000000000000000
>>> R09: 0000000000000000
>>> Oct 08 21:21:02  kernel: R10: 0000000000000000 R11:
>>> 0000000000000246
>>> R12: 0000000000001e00
>>> Oct 08 21:21:02  kernel: R13: 0000000000000000 R14:
>>> 000000000045d560
>>> R15: 0000000000000000
>>> Oct 08 21:21:02  kernel: Code: 90 90 90 90 90 eb 1e 0f 1f 00 48 89
>>> f8
>>> 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44
>>> 00 00
>>> 48 89 f8 48 89 d1 <f3>
>>> Oct 08 21:21:02  kernel: RIP: memcpy_erms+0x6/0x10 RSP:
>>> ffffbc8ec1f37e40
>>> Oct 08 21:21:02  kernel: CR2: ffffc48f1f42d500
>>> Oct 08 21:21:02  kernel: ---[ end trace ebebe46b9c6091a1 ]---
>>> -- Reboot --
>>
>> Thanks a lot for the report! Do you happen to have a reproducer? Are
>> you
>> running into this with concurrent i) updates, ii) deletes, iii) both
>> or
>> iv) while destroying the whole htab? If i-iii) is it out of BPF
>> datapath
>> or from syscall path? Is your hashtab preallocated? Any more
>> elaborate info
>> would help.
> 
> Sadly I don't have a reproducer other than I've seen it happen on
> several distinct boxes and deployments.
> 
> We use several hash maps and I don't know how to tell from the OOPs
> which one caused this but the two largest (100-600k entries) work like:
> 
> - eBPF program creates entries as it sees flows on the network and
> counts metrics (bytes, packets, etc) into each entry and updates a
> timestamp on each update


Does that happen via map update helper from kernel side in your prog?

> - User space iterates over the map every 10s and pulls out the metrics
> for further processing. If the timestamp on the entry is over a
> threshold, the user space process deletes the entry (kernel side never
> deletes)

And the map iterate + delete, are there multiple threads walking it
at the same time or just single one (just making sure ...)?

> bpftool map output:
> 
> 22: hash  flags 0x0

Ok, you are using map prealloc here. How often does the BUG trigger
in general on your machines? Was there an older kernel where you
haven't been able to trigger it?

>       key 12B  value 8B  max_entries 1213  memlock 122880B
> 23: hash  flags 0x0
>       key 16B  value 256B  max_entries 2102717  memlock 739983360B
> 24: hash  flags 0x0
>       key 12B  value 2B  max_entries 2102717  memlock 218505216B
> 25: hash  flags 0x0
>       key 20B  value 48B  max_entries 2102717  memlock 319438848B
> 
> 
> 


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#1500): https://lists.iovisor.org/g/iovisor-dev/message/1500
Mute This Topic: https://lists.iovisor.org/mt/27375713/21656
Group Owner: [email protected]
Unsubscribe: https://lists.iovisor.org/g/iovisor-dev/unsub  
[[email protected]]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [iovisor-dev] Hashmap OOPS

Reply via email to