Hey all, while I got used to using perf report from the CLI with its quirky interface, I always wanted to open perf results in kcachegrind. There are some converters out there, but all of them seem to be quite complicated, and none are shipped with perf itself.
Since I wanted to play with the python bindings anyways, I decided to write a converter there. It's currently at less than 100 lines of code and seems to work reasonably well.

Usage:

    perf record ...
    perf script -s ./perf-callgrind.py > callgrind.out
    kcachegrind callgrind.out

Screenshot: http://imgur.com/QHttohs

Caveats:
- I did not do extensive tests
- addr2line using event/sym/begin isn't implemented yet. I have something locally already, but it breaks the callgraph, i.e. I do something wrong. I'll fix this and then send this file in as a proper patch to be included in perf itself.
- missing support for multiple events in a single

Future work:
- extend the python binding to get access to the header data, esp. the command line
- estimate the callcount. VTune does this as well, if anyone has some papers or input on that topic I'd be interested. Essentially, my idea currently is to look at how the callgraph changes. If the leaf is exited and then reenters a known function, we can be sure it was called again.

Feedback welcome!

-- 
Milian Wolff | milian.wo...@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts
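The call-count idea from the future-work list could be sketched roughly as follows. This is not part of the posted script and not how VTune does it; it is only an illustration under stated assumptions: stacks are plain lists of function names ordered from the outermost caller to the leaf, all samples come from a single thread, and `estimate_call_counts` is a hypothetical helper name. Frames beyond the prefix shared with the previous sample must have been entered again, so each such frame counts as one more call. The result is a lower bound, since calls that start and finish between two samples are never seen.

```python
from collections import defaultdict


def estimate_call_counts(stacks):
    """Estimate a lower bound on call counts from consecutive sampled stacks.

    stacks: iterable of call stacks, each a list of function names ordered
    from the outermost caller to the leaf (an assumption of this sketch).
    Returns a dict mapping (caller, callee) to the estimated number of calls.
    """
    calls = defaultdict(int)
    previous = []
    for stack in stacks:
        # length of the prefix this sample shares with the previous one
        common = 0
        for old, new in zip(previous, stack):
            if old != new:
                break
            common += 1
        # every frame past the shared prefix was entered anew;
        # depth 0 has no caller, so start at 1 at the earliest
        for depth in range(max(common, 1), len(stack)):
            calls[(stack[depth - 1], stack[depth])] += 1
        previous = stack
    return dict(calls)


# hand-written samples: b is left and re-entered from a, then c is called
samples = [
    ["main", "a", "b"],
    ["main", "a"],
    ["main", "a", "b"],
    ["main", "c"],
]
counts = estimate_call_counts(samples)
```

With these samples, the edge from `a` to `b` is counted twice because the leaf `b` disappears in the second sample and shows up again in the third. Two identical consecutive stacks add nothing, which is why the estimate can only undercount.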
# perf script event handlers, generated by perf script -g python
# Licensed under the terms of the GNU GPL License version 2
# The common_* event handler fields are the most useful fields common to
# all events. They don't necessarily correspond to the 'common_*' fields
# in the format files. Those fields not available as handler params can
# be retrieved using Python functions of the form common_*(context).
# See the perf-trace-python Documentation for the list of available functions.
import os
import sys
import json
from collections import defaultdict
sys.path.append(os.environ["PERF_EXEC_PATH"] + "/scripts/python/Perf-Trace-Util/lib/Perf/Trace")
from perf_trace_context import *
class Function:
    def __init__(self, dso, name, sym):
        self.cost = 0
        self.dso = dso
        self.name = name
        self.sym = sym
        self.callees = defaultdict(lambda: 0)

class DSO:
    def __init__(self):
        self.functions = dict()

# a map of all encountered dso's and the functions therein
# this is done to prevent name clashes
dsos = defaultdict(lambda: DSO())
def addFunction(dsoName, name, sym):
    global dsos
    dso = dsos[dsoName]
    function = dso.functions.get(name, None)
    # create function if it's not yet known
    if not function:
        function = Function(dsoName, name, sym)
        dso.functions[name] = function
    return function

# write the callgrind file format to stdout
def trace_end():
    global dsos
    print("version: 1")
    print("creator: perf-callgrind 0.1")
    print("part: 1")
    # TODO: get access to command line, it's in the perf data header
    # but not accessible to the scripting backend, is it?
    print("events: Samples")
    # items() instead of iteritems() so this runs on both Python 2 and 3
    for name, dso in dsos.items():
        print("ob=%s" % name)
        for sym, function in dso.functions.items():
            print("fn=%s" % sym)
            print("0 %d" % function.cost)
            for callee, cost in function.callees.items():
                print("cob=%s" % callee.dso)
                print("cfn=%s" % callee.name)
                print("calls=1 0")
                print("0 %d" % cost)
            print("")

def process_event(event):
    caller = None
    if not event["callchain"]:
        # only add the single symbol where we got the sample, without a backtrace
        dsoName = event.get("dso", "???")
        name = event.get("symbol", "???")
        caller = addFunction(dsoName, name, None)
    else:
        # add a function for every frame in the callchain
        for item in reversed(event["callchain"]):
            dsoName = item.get("dso", "???")
            name = "???"
            if "sym" in item:
                name = item["sym"]["name"]
            function = addFunction(dsoName, name, item.get("sym", None))
            # add current frame to parent's callee list
            if caller is not None:
                caller.callees[function] += 1
            caller = function
    # increase the self cost of the last frame
    # all other frames include it now and kcachegrind will automatically
    # take care of adapting their inclusive cost
    if caller is not None:
        caller.cost += 1
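
To see what the accounting in process_event amounts to, here is a small standalone sketch that runs the same bookkeeping on hand-written call chains. It makes several simplifying assumptions: function names are plain strings instead of perf's per-frame dicts, DSOs are ignored, chains are ordered leaf-first (as the reversed() call in the script implies), and `aggregate` is a hypothetical name that does not exist in the script.

```python
from collections import defaultdict


def aggregate(callchains):
    """Return (self_cost, edges) for a list of leaf-first call chains.

    self_cost maps a function name to the number of samples taken in it.
    edges maps (caller, callee) to the number of samples seen below that call.
    """
    self_cost = defaultdict(int)
    edges = defaultdict(int)
    for chain in callchains:
        caller = None
        # walk from the outermost caller down to the sampled leaf
        for name in reversed(chain):
            if caller is not None:
                edges[(caller, name)] += 1
            caller = name
        # only the leaf gets self cost; callers get it via their edges
        if caller is not None:
            self_cost[caller] += 1
    return dict(self_cost), dict(edges)


# three samples: two taken in b (called from a), one taken in a itself
samples = [
    ["b", "a", "main"],
    ["a", "main"],
    ["b", "a", "main"],
]
self_cost, edges = aggregate(samples)
```

Here `main` has no self cost at all, but its edge to `a` carries all three samples. That per-edge number is what the script writes on the cost line after each `calls=` line, and kcachegrind adds it to the caller's inclusive cost.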
