Hi!

These patches implement the (S)RCU based proposal to optimize uprobes.

On my c^Htrusty old IVB-EP -- where each of the 40 CPUs calls 'func' in a
tight loop:

  perf probe -x ./uprobes test=func
  perf stat -ae probe_uprobe:test  -- sleep 1

  perf probe -x ./uprobes test=func%return
  perf stat -ae probe_uprobe:test__return -- sleep 1

PRE:

  4,038,804      probe_uprobe:test
  2,356,275      probe_uprobe:test__return

POST:

  7,216,579      probe_uprobe:test
  6,744,786      probe_uprobe:test__return

(copy-paste FTW, I didn't do new numbers because the fast paths didn't change --
 and a quick test run shows similar numbers)
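
The './uprobes' test program itself is not part of this posting. A minimal
sketch of that kind of workload, assuming one spinning thread per online CPU
and relying on the scheduler (rather than explicit pinning) to spread them
out, could look like the following. Everything except the name 'func' is
hypothetical: 'spin_all_cpus', 'struct worker' and the stop flag are made up
for illustration.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stop flag shared by all workers. */
static atomic_bool stop;

/*
 * Probe target. noinline keeps a real call and a 'func' symbol for
 * 'perf probe'; the empty asm keeps the body from being optimized away.
 */
__attribute__((noinline)) void func(void)
{
	__asm__ volatile("" ::: "memory");
}

struct worker {
	pthread_t tid;
	unsigned long calls;
};

static void *worker_fn(void *arg)
{
	struct worker *w = arg;
	unsigned long calls = 0;	/* local, to avoid false sharing */

	while (!atomic_load_explicit(&stop, memory_order_relaxed)) {
		func();
		calls++;
	}
	w->calls = calls;
	return NULL;
}

/*
 * Spin one thread per online CPU for 'ms' milliseconds and return the
 * total number of func() calls made by the threads that were started.
 */
unsigned long spin_all_cpus(unsigned int ms)
{
	struct timespec ts = {
		.tv_sec = ms / 1000,
		.tv_nsec = (ms % 1000) * 1000000L,
	};
	long ncpus = sysconf(_SC_NPROCESSORS_ONLN);
	unsigned long total = 0;
	struct worker *w;
	long i, started = 0;

	if (ncpus < 1)
		ncpus = 1;
	w = calloc(ncpus, sizeof(*w));
	if (!w)
		return 0;

	atomic_store(&stop, false);
	for (i = 0; i < ncpus; i++) {
		if (pthread_create(&w[started].tid, NULL, worker_fn, &w[started]))
			break;
		started++;
	}

	nanosleep(&ts, NULL);
	atomic_store(&stop, true);

	/* pthread_join() orders each worker's final store before the read. */
	for (i = 0; i < started; i++) {
		pthread_join(w[i].tid, NULL);
		total += w[i].calls;
	}
	free(w);
	return total;
}
```

A main() that calls spin_all_cpus() with a duration longer than the
'perf stat ... -- sleep 1' window, built with something like
'gcc -O2 -pthread', would stand in for the './uprobes' binary used above.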

Patches also available here:

  git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/uprobes


Changes since last time:
 - better split with intermediate inc_not_zero()
 - fix UPROBE_HANDLER_REMOVE
 - restored the lost rcu_assign_pointer()
 - avoid lockdep for uretprobe_srcu
 - add missing put_uprobe() -> srcu_read_unlock() conversion
 - actually initialize return_instance::has_ref
 - a few comments
 - things I don't remember


