Hi! These patches implement the (S)RCU based proposal to optimize uprobes.

On my c^Htrusty old IVB-EP -- where each (of the 40) CPU calls 'func' in a
tight loop:

  perf probe -x ./uprobes test=func
  perf stat -ae probe_uprobe:test -- sleep 1

  perf probe -x ./uprobes test=func%return
  perf stat -ae probe_uprobe:test__return -- sleep 1

PRE:

  4,038,804      probe_uprobe:test
  2,356,275      probe_uprobe:test__return

POST:

  7,216,579      probe_uprobe:test
  6,744,786      probe_uprobe:test__return

(copy-paste FTW, I didn't do new numbers because the fast paths didn't change --
 and quick test run shows similar numbers)

Patches also available here:

  git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/uprobes

Changes since last time:
 - better split with intermediate inc_not_zero()
 - fix UPROBE_HANDLER_REMOVE
 - restored the lost rcu_assign_pointer()
 - avoid lockdep for uretprobe_srcu
 - add missing put_uprobe() -> srcu_read_unlock() conversion
 - actually initialize return_instance::has_ref
 - a few comments
 - things I don't remember