I see, that makes sense. Thanks for the dyninst pointer, I'll take a look at that in the meantime.

Regards,
Frederik

On Mon, Jul 17, 2017 at 11:19 PM, Y Song <[email protected]> wrote:
> Frederik,
>
> The issue you are facing with prohibitive uprobe performance (even for
> sampling) has been discussed in this forum. The idea is to perhaps
> adopt a mechanism similar to dyninst or kerninst. We (me, Brenden, and
> maybe others) are starting to explore this space.
>
> Yonghong
>
> On Mon, Jul 17, 2017 at 4:37 PM, Frederik Deweerdt via iovisor-dev
> <[email protected]> wrote:
>> Hello,
>>
>> I'm trying to use uprobes in order to be able to keep track of how
>> long it takes to execute a given function in a shared library used by
>> tens of thousands of threads on a 32-core machine. Unfortunately, I'm
>> seeing a 3x slowdown when setting the uprobes (even ones that only do
>> a `return 0;` in the body).
>>
>> Looking at a perf record, it looks like lock contention is the
>> culprit: queued_spin_lock_slowpath accounts for more than 20% of the
>> workload.
>>
>> I'm wondering if there's a way around that. For my use case, sampling
>> would be an option, but I haven't found a way to do so without
>> actually entering the probe (which has a prohibitive cost by itself).
>>
>> Any thoughts?
>> Frederik
>> _______________________________________________
>> iovisor-dev mailing list
>> [email protected]
>> https://lists.iovisor.org/mailman/listinfo/iovisor-dev
