My 2c for Mali/Panfrost --

For us, capturing GPU perf counters is orthogonal to rendering. The
expected workflow (e.g. with Arm's tools) is to capture them from a
separate process. Neither Mesa nor the DDK should require custom
instrumentation for the low-level data. Fahien's gfx-pps already handles
this correctly for Panfrost + Perfetto. So for us I don't see the value in
modifying Mesa for tracing.

On Fri, Feb 12, 2021 at 01:34:51PM -0800, John Bates wrote:
> (responding from correct address this time)
> 
> On Fri, Feb 12, 2021 at 12:03 PM Mark Janes <mark.a.ja...@intel.com> wrote:
> 
> > I've recently been using GPUVis to look at trace events.  On Intel
> > platforms, GPUVis incorporates ftrace events from the i915 driver,
> > performance metrics from igt-gpu-tools, and userspace ftrace markers
> > that I locally hack up in Mesa.
> >
> 
> GPUVis is great. I would love to see that data combined with
> userspace events without any need for local hacks. Perfetto provides
> on-demand trace events with lower overhead than ftrace, so it is
> acceptable, for example, to ship production trace instrumentation that can
> be captured without dev builds. Doing that with ftrace may require a way
> to enable and disable the ftrace file writes, to avoid the overhead when
> tracing is not in use. This is what Android does with systrace/atrace: it
> uses Binder to notify processes about trace sessions. Perfetto
> does that in a more portable way.
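
To make the enable/disable idea above concrete, here is a minimal sketch of a
userspace marker writer that skips the file write entirely while tracing is
off. The helper names (marker_start, marker_stop, marker_write) are
hypothetical and not taken from Mesa, GPUVis, or Android; only open/write/close
and the tracefs path /sys/kernel/tracing/trace_marker are real. How the
process learns that a session started (Binder on Android, traced with
Perfetto) is left out.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helpers for illustration only. */
static atomic_bool marker_enabled = false;
static int marker_fd = -1;

/* Open the marker file (normally /sys/kernel/tracing/trace_marker) and
 * enable writes. Returns false if the file cannot be opened. */
static bool marker_start(const char *path)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return false;
    marker_fd = fd;
    atomic_store(&marker_enabled, true);
    return true;
}

/* Disable writes and close the file. This sketch assumes start/stop are
 * not called concurrently with marker_write from other threads. */
static void marker_stop(void)
{
    atomic_store(&marker_enabled, false);
    if (marker_fd >= 0) {
        close(marker_fd);
        marker_fd = -1;
    }
}

/* Returns the number of bytes written, 0 when tracing is disabled,
 * or -1 if the write itself failed. */
static ssize_t marker_write(const char *msg)
{
    assert(msg != NULL);
    /* Cheap check first, so a disabled tracer costs one load and a branch. */
    if (!atomic_load_explicit(&marker_enabled, memory_order_relaxed))
        return 0;
    return write(marker_fd, msg, strlen(msg));
}
```

A caller would invoke marker_start() when it is told a trace session began and
marker_stop() when it ends; instrumentation sites only ever call
marker_write().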
> 
> 
> >
> > It is very easy to compile the GPUVis UI.  Userspace instrumentation
> > requires a single C/C++ header.  You don't have to access an external
> > web service to analyze trace data (a big no-no for devs working on
> > preproduction hardware).
> >
> > Is it possible to build and run the Perfetto UI locally?
> 
> 
> Yes, local UI builds are possible
> <https://github.com/google/perfetto/blob/5ff758df67da94d17734c2e70eb6738c4902953e/ui/README.md>.
> Also confirmed with the perfetto team <https://discord.gg/35ShE3A> that
> trace data is not uploaded unless you use the 'share' feature.
> 
> 
> >   Can it display
> > arbitrary trace events that are written to
> > /sys/kernel/tracing/trace_marker ?
> 
> 
> Yes, I believe it does support that via linux.ftrace data source
> <https://perfetto.dev/docs/quickstart/linux-tracing>. We use that for
> example to overlay CPU sched data to show what process is on each core
> throughout the timeline. There are many ftrace event types
> <https://github.com/google/perfetto/tree/5ff758df67da94d17734c2e70eb6738c4902953e/protos/perfetto/trace/ftrace>
> in the perfetto protos.
> 
> 
> > Can it be extended to show i915 and
> > i915-perf-recorder events?
> >
> 
> It can be extended to consume custom data sources. One way this is done is
> via a bridge daemon, such as traced_probes which is responsible for
> capturing data from ftrace and /proc during a trace session and sending it
> to traced. traced is the main perfetto tracing daemon that notifies all
> trace data sources to start/stop tracing and communicates with user tracing
> requests via the 'perfetto' command.
> 
> 
> 
> >
> > John Bates <jba...@chromium.org> writes:
> >
> > > I recently opened issue 4262
> > > <https://gitlab.freedesktop.org/mesa/mesa/-/issues/4262> to begin the
> > > discussion on integrating perfetto into mesa.
> > >
> > > *Background*
> > >
> > > System-wide tracing is an invaluable tool for developers to find and fix
> > > performance problems. The perfetto project enables a combined view of trace
> > > data from kernel ftrace, GPU driver and various manually-instrumented
> > > tracepoints throughout the application and system. This helps developers
> > > quickly answer questions like:
> > >
> > >    - How long are frames taking?
> > >    - What caused a particular frame drop?
> > >    - Is it CPU bound or GPU bound?
> > >    - Did a CPU core frequency drop cause something to go slower than usual?
> > >    - Is something else running that is stealing CPU or GPU time? Could I
> > >    fix that with better thread/context priorities?
> > >    - Are all CPU cores being used effectively? Do I need sched_setaffinity
> > >    to keep my thread on a big or little core?
> > >    - What’s the latency between CPU frame submit and GPU start?
> > >
> > > *What Does Mesa + Perfetto Provide?*
> > >
> > > Mesa is in a unique position to produce GPU trace data for several GPU
> > > vendors without requiring the developer to build and install additional
> > > tools like gfx-pps <https://gitlab.freedesktop.org/Fahien/gfx-pps>.
> > >
> > > The key is making it easy for developers to use. Ideally, perfetto is
> > > eventually available by default in mesa so that if your system has perfetto
> > > traced running, you just need to run perfetto (perhaps along with setting
> > > an environment variable) with the mesa categories to see:
> > >
> > >    - GPU processing timeline events.
> > >    - GPU counters.
> > >    - CPU events for potentially slow functions in mesa like shader compiles.
> > >
> > > Example of what this data might look like (with fake GPU events):
> > > [image: percetto-gpu-example.png]
> > >
> > > *Runtime Characteristics*
> > >
> > >    - ~500KB additional binary size. Even with using only the basic features
> > >    of perfetto, it will increase the binary size of mesa by about 500KB.
> > >    - Background thread. Perfetto uses a background thread for communication
> > >    with the system tracing daemon (traced) to advertise trace data and get
> > >    notification of trace start/stop.
> > >    - Runtime overhead when disabled is designed to be optimal with one
> > >    predicted branch, typically a few CPU cycles
> > >    <https://perfetto.dev/docs/instrumentation/track-events#performance>
> > >    per event. While enabled, the overhead can be around 1 us per event.
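
As a rough illustration of the "one predicted branch when disabled" point in
the list above, the sketch below shows the general macro pattern. It is not
the perfetto or percetto implementation: TRACE_COUNTER, trace_enabled,
record_event and events_recorded are all hypothetical names, and
__builtin_expect is a GCC/Clang builtin. The thing to notice is that the value
expression is not evaluated at all while tracing is off.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the perfetto/percetto API. */
static atomic_int trace_enabled;   /* 0 = disabled (the common case) */
static uint64_t events_recorded;   /* stand-in for a real trace buffer */

/* Slow path: only reached while a trace session is active. */
static void record_event(const char *name, int64_t value)
{
    assert(name != NULL);
    (void)value;
    events_recorded++;
}

/* Fast path: one relaxed load plus a branch hinted as not taken.
 * value_expr sits inside the branch, so it costs nothing when disabled. */
#define TRACE_COUNTER(name, value_expr)                                  \
    do {                                                                 \
        if (__builtin_expect(                                            \
                atomic_load_explicit(&trace_enabled,                     \
                                     memory_order_relaxed) != 0, 0))     \
            record_event((name), (value_expr));                          \
    } while (0)
```

A real SDK would flip trace_enabled from its background thread when the
tracing daemon starts or stops a session.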
> > >
> > > *Integration Challenges*
> > >
> > >    - The perfetto SDK is C++ and designed around macros, lambdas, inline
> > >    templates, etc. There are ongoing discussions on providing an official
> > >    perfetto C API, but it is not yet clear when this will land on the
> > >    perfetto roadmap.
> > >    - The perfetto SDK is an amalgamated .h and .cc that adds up to 100K
> > >    lines of code.
> > >    - Anything that includes perfetto.h takes a long time to compile.
> > >    - The current Perfetto SDK design is incompatible with being a shared
> > >    library behind a C API.
> > >
> > > *Percetto*
> > >
> > > The percetto library <https://github.com/olvaffe/percetto> was recently
> > > implemented to provide an interim C API for perfetto. It provides efficient
> > > support for scoped trace events, multiple categories, counters, custom
> > > timestamps, and debug data annotations. Percetto also provides some
> > > features that are important to mesa, but not available yet with perfetto
> > > SDK:
> > >
> > >    - Trace events from multiple perfetto instances in separate shared
> > >    libraries (like mesa and virglrenderer) show correctly in a single
> > >    process and thread view.
> > >    - Counter tracks and macro API.
> > >
> > > Percetto is missing API for perfetto's GPU DataSource and counter support,
> > > but that feature could be implemented next if it is important for mesa.
> > > With the existing percetto API mesa could present GPU trace data as named
> > > 'slice' events and int64_t counters with custom timestamps as shown in the
> > > image above (based on this sample
> > > <https://github.com/olvaffe/percetto/blob/main/examples/timestamps.c>).
> > >
> > > *Mesa Integration Alternatives*
> > >
> > > Note: we have some pressing needs for performance analysis in Chrome OS, so
> > > I'm intentionally leaving out the alternative of waiting for an official
> > > perfetto C API. Of course, once that C API is available it would become an
> > > option to migrate to it from any of the alternatives below.
> > >
> > > Ordered by difficulty with easiest first:
> > >
> > >    1. Statically link with percetto as an optional external dependency
> > >    (virglrenderer now has this approach
> > >    <https://gitlab.freedesktop.org/virgl/virglrenderer/-/merge_requests/480>).
> > >       - Pros: API already supports most common tracing needs. Tested and
> > >       used by an increasing number of CrOS components.
> > >       - Cons: External dependency for optional mesa build option.
> > >    2. Embed Perfetto SDK + a Percetto fork/copy.
> > >       - Pros: API already supports most common tracing needs. No added
> > >       external dependency for mesa.
> > >       - Cons: Percetto code divergence, bug fixes need to land in two
> > >       trees.
> > >    3. Embed Perfetto SDK + custom C wrapper.
> > >       - Pros: Tailored API for mesa's needs.
> > >       - Cons: Nontrivial development efforts and maintenance.
> > >    4. Generate C stubs for the Perfetto protobuf and reimplement the
> > >    Perfetto SDK in C.
> > >       - Pros: Tailored API for mesa's needs. Possible smaller binary
> > >       impact from simpler implementation.
> > >       - Cons: Significant development efforts and maintenance.
> > >
> > > Regardless of the integration direction, I expect we would disable perfetto
> > > in the default build for now to minimize disruption.
> > >
> > > I like #1, because there are some nontrivial subtleties to the C wrapper
> > > that provide both API conveniences and runtime performance that would need
> > > to be reimplemented or maintained with the other options. I will also
> > > volunteer to do #1 or #2, but I'm not sure I have time for #3 or #4 :D.
> > >
> > > Any other thoughts on how best to integrate perfetto into mesa?
> > >
> > > -jb
> > > _______________________________________________
> > > mesa-dev mailing list
> > > mesa-dev@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
