From: Jiri Olsa <jo...@redhat.com>
Date: Tue, 6 Nov 2018 21:42:55 +0100

> I pushed that fix in perf/fixes branch, but I'm still occasionally
> hitting the namespace crash.. working on it ;-)

Jiri, how can this new scheme work without setting copy_on_queue for
the queued_events we use here?

I don't see copy_on_queue being set, and that means the queued event
structures reference the event memory directly in the mmaps, even
after the mmap thread has released that memory back to the queue.

That means new events can come in to the mmap ring and overwrite what
was there previously, maybe even while deliver_event() is in the
middle of parsing the event.

Setting copy_on_queue for data[0] and data[1] makes all of the crashes
go away for me.

I get a lot of "[unknown]" shared objects shortly after perf top
starts up during a full workload.

I've been wondering about one side effect of how the mmap queues are
processed. Consider the following:

        cpu 0                   cpu 1
                                exec
                                create new mmap2 events
                                scheduled to cpu 0 for whatever reason
        sample 1
        sample 2

And let's say that perf top is backlogged processing the mmap ring of
events generated for cpu 0, and sees sample 1 and sample 2 before
getting to any of cpu 1's events.

This means the thread, map and symbol objects won't exist yet, so
we'll get those "[unknown]" histogram entries, and they won't go away.

When it finally stops looping over the mmap ring for cpu 0's events,
it gets to cpu 1's mmap ring and sees the exec and mmap2 events, but
at that point it's far too late.

I surmise from what I see with perf top right now that this happens a
lot.
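
To make the ordering scenario above concrete, here is a toy model in
Python. It is not perf code: the Event type, the timestamps, pid 42
and the helper names are all made up for illustration. It only shows
that draining one CPU's ring completely before the next one delivers
the samples ahead of the mmap2 event that would resolve them, whereas
delivering in timestamp order does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    time: int   # hypothetical timestamp, smaller means earlier
    cpu: int    # CPU whose ring recorded the event
    kind: str   # "exec", "mmap2" or "sample"
    pid: int


def count_unresolved(events):
    """Deliver events in the given order and count the samples that
    arrive before any mmap2 event for the same pid."""
    mapped = set()
    unresolved = 0
    for ev in events:
        if ev.kind == "mmap2":
            mapped.add(ev.pid)
        elif ev.kind == "sample" and ev.pid not in mapped:
            unresolved += 1
    return unresolved


def drain_per_cpu(rings):
    """Process every event of cpu 0's ring, then cpu 1's, and so on."""
    out = []
    for cpu in sorted(rings):
        out.extend(rings[cpu])
    return out


def merge_by_time(rings):
    """Process events from all rings in global timestamp order."""
    all_events = [ev for ring in rings.values() for ev in ring]
    return sorted(all_events, key=lambda ev: ev.time)


# The scenario from the mail: exec and mmap2 land in cpu 1's ring,
# then the task moves to cpu 0 and the samples land in cpu 0's ring.
rings = {
    0: [Event(30, 0, "sample", 42), Event(40, 0, "sample", 42)],
    1: [Event(10, 1, "exec", 42), Event(20, 1, "mmap2", 42)],
}

per_cpu_unresolved = count_unresolved(drain_per_cpu(rings))
by_time_unresolved = count_unresolved(merge_by_time(rings))
print(per_cpu_unresolved, by_time_unresolved)  # prints "2 0"
```

In this toy setup both samples stay unresolved when cpu 0's ring is
drained first, and neither does when the events are merged by
timestamp. Whether the real backlog behaves this way depends on how
far behind the reader is, which the model does not capture.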