Hi Team,

On the long flight back from Taipei I was brainstorming the best
possible way to leverage all of the existing debugging/perf tools to
solve some of the outstanding issues we still face (e.g. transient
events in memory, power, checkerboarding and general perf as well as
answering "why is X slow?").  I think I've come up with a design that
will allow us to get a whole lot more out of our existing tools.

Here is what I'm thinking in rough priority order:

1.  Unify event signaling for all sources of "interesting" events.

Right now we have a bunch of "interesting" perf-related events (e.g.
checkerboarding, memory pressure, zRAM caching, power consumption
spikes, slow event handling, etc.) that are reported in different ways.
For instance, Ben Kelly showed us how to spot memory pressure events in
the log output.  We learned that the graphics pipeline knows when
checkerboarding occurs and could report that.  We saw how the powertool
can see when power consumption spikes and could report it.  zRAM caching
events are reported from B2G, and RIL, WiFi, GPS, etc. events are all
reported from B2G in the logs.

The first step I think we should take is figuring out a way to report
interesting events via a unified mechanism.  For events generated on the
device, I think Profiler::addMarker is a good way to go.  Not only
would it be a single, central reporting mechanism available from both
C++ and JS, it would also get those events into the profiling capture
buffer.  The profiling capture mechanism has a small circular buffer for
storing stack traces and markers.  There's no reason we couldn't have a
thread transmitting event markers back to the host over a socket in
realtime.
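
As a rough sketch of the idea above (a small circular buffer plus a
thread that forwards each marker as it arrives), something like the
following could work.  It is Python purely for illustration; the
MarkerBuffer class and its method names are hypothetical and are not the
profiler's actual API, and the `send` callable stands in for a socket
write.

```python
import collections
import queue
import threading


class MarkerBuffer:
    """Hypothetical stand-in for the profiler's marker storage."""

    def __init__(self, capacity):
        # Circular buffer: once full, the oldest marker is dropped.
        self._ring = collections.deque(maxlen=capacity)
        # Unbounded outbox so the forwarding thread sees every marker.
        self._outbox = queue.Queue()
        self._lock = threading.Lock()

    def add_marker(self, name):
        with self._lock:
            self._ring.append(name)
        self._outbox.put(name)

    def snapshot(self):
        """Return the markers currently held in the circular buffer."""
        with self._lock:
            return list(self._ring)

    def close(self):
        # None is the sentinel that tells the forwarder to stop.
        self._outbox.put(None)

    def forward(self, send):
        """Run in a thread: hand each marker to send() in arrival order."""
        while True:
            name = self._outbox.get()
            if name is None:
                break
            send(name)
```

The point of the split is that the capture buffer stays small and
lossy, while the forwarder still transmits every marker to the host.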

For events generated off of the device (e.g. powertool events,
eideticker events, orangutan input driver events), they could also be
reported over a socket to the event consumer.

2. Build a single consumer of the event stream.

Once we get event reporting over a socket, the next step is to build a
centralized consumer of the events.  NodeJS is probably a good choice
for this tool.  It should probably function as a daemon on the host
machine and be capable of detecting devices and executing adb commands
to set up the port forwarding automagically.  The event consumer will
listen on a single port--like a web server--and accept multiple incoming
connections from event sources.
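
A minimal sketch of such a consumer, assuming one JSON object per line
on each connection (that framing is an assumption, not something decided
here).  It is written in Python for illustration even though NodeJS is
the likely choice for the real tool; EventConsumer, EventHandler,
start_consumer and report are all hypothetical names.  Device detection
and adb port forwarding are left out.

```python
import json
import queue
import socket
import socketserver
import threading

# Every event from every connection lands in this one queue.
events = queue.Queue()


class EventHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # One handler instance per connection; read until the source closes.
        for line in self.rfile:
            line = line.strip()
            if line:
                events.put(json.loads(line))


class EventConsumer(socketserver.ThreadingTCPServer):
    # One thread per connection, so several sources can report at once.
    allow_reuse_address = True
    daemon_threads = True


def start_consumer(host="127.0.0.1", port=0):
    """Start listening in the background; port 0 lets the OS pick a port."""
    server = EventConsumer((host, port), EventHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def report(port, subject, body="", timestamp=0, host="127.0.0.1"):
    """Event-source side: open a connection, send one event, close."""
    record = {"timestamp": timestamp, "subject": subject, "body": body}
    with socket.create_connection((host, port)) as conn:
        conn.sendall((json.dumps(record) + "\n").encode("utf-8"))
```

An off-device source such as the powertool would call something like
report() for each event, while the daemon drains the events queue.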

3. Build a configurable dispatch mechanism for the event consumer.

The last piece involves building a configurable dispatch mechanism for
hooking up actions to certain events.  The first--and maybe the
only--action we will want is to gather the profiling data for the last N
seconds whenever we see an interesting event.

For instance, when I'm optimizing idle power draw, a power spike event
from the powertool should trigger dumping the profiling data for the
last N seconds.  I'll also want the profiling data when I see
transient memory spikes and checkerboarding events, and...

Now that I think about it, grabbing the profiler data in response to a
marker added from JS would be tremendously helpful for debugging JS
event ordering and dispatching issues (e.g. in the gaia window manager).
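
The dispatch piece could be as small as a table from subject to a list
of actions.  The sketch below is illustrative Python; Dispatcher and
capture_last are hypothetical names, the timestamp is assumed to be in
microseconds, and the "action" only records which time window should be
pulled from the profiler rather than actually talking to it.

```python
import collections


class Dispatcher:
    """Maps an event subject to the actions configured for it."""

    def __init__(self):
        self._actions = collections.defaultdict(list)

    def on(self, subject, action):
        self._actions[subject].append(action)

    def dispatch(self, event):
        """Run every action registered for the event's subject.

        Returns the number of actions that ran.
        """
        actions = self._actions.get(event["subject"], [])
        for action in actions:
            action(event)
        return len(actions)


def capture_last(seconds, requests):
    """Build an action that requests the last `seconds` of profile data.

    The request is appended to `requests` as (subject, start_us, end_us);
    a real action would ask the profiler for that window instead.
    """
    def action(event):
        end_us = event["timestamp"]
        start_us = end_us - seconds * 1_000_000
        requests.append((event["subject"], start_us, end_us))
    return action
```

Configuring the idle power case would then be a single call such as
dispatcher.on("power-spike", capture_last(5, requests)).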

Proposal:

To keep this simple enough that it could be implemented quickly, I
propose a very simple event transmission mechanism.  The data on the
socket will flow in only one direction--from the event source to the
consumer.  Events will contain JSON-encoded data: a timestamp with
sub-ms resolution, a subject string, and a body string.  The subject is
what triggers an action; the body is used for any optional/extended
data.  We'll use TCP.
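
To make the format concrete, here is one possible encoding, sketched in
Python.  The field names ("timestamp", "subject", "body"), the use of
integer microseconds since the epoch, and the one-object-per-line
framing are all assumptions for illustration; only the three fields
themselves come from the proposal.

```python
import json
import time


def encode_event(subject, body="", timestamp_us=None):
    """Serialize one event as a single line of JSON, ready for the socket."""
    if timestamp_us is None:
        # Integer microseconds gives the sub-ms resolution asked for.
        timestamp_us = time.time_ns() // 1000
    record = {"timestamp": timestamp_us, "subject": subject, "body": body}
    # json.dumps escapes newlines inside strings, so each event stays on
    # one line and "\n" can act as the record separator on the stream.
    line = json.dumps(record, separators=(",", ":")) + "\n"
    return line.encode("utf-8")


def decode_event(line):
    """Parse one line from the socket into (timestamp_us, subject, body)."""
    record = json.loads(line.decode("utf-8"))
    return record["timestamp"], record["subject"], record["body"]
```

Because TCP is a byte stream with no message boundaries, some framing
like the newline separator here is needed so the consumer can tell where
one event ends and the next begins.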

Goal:

The ultimate goal is to ease the debugging and fixing of transient
events as well as giving us a more complete picture in Cleopatra by
graphing interesting events along with stack traces and memory and power
stats.

Thoughts?  I would really like to get some feedback on this.  If we
think this is a good idea, it will dominate the perf dev-tools effort
for the foreseeable future as it would be a tool useful for both Mozilla
engineers (e.g. fxos-perf team, gaia team, etc) and 3rd party app
developers.

-dave

_______________________________________________
dev-b2g mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-b2g