*Mission Control showing content crashes per 1k usage hours.*

As the data platform & tools team, we provide other teams with core tools
for working with data. This spans Firefox Telemetry, data storage and
analysis, and some central data viewing tools. To make new developments
more visible, we publish a quarterly update on our work.

In the last quarter we continued focusing on decreasing data latency,
supporting analytics and experimentation workflows, improving stability and
building Mission Control.

Let's go faster

To enable faster decision making, we worked on improving latency for
important use-cases.

Most notable is that main pings, which power most of our main dashboards
and analyses, now arrive much faster. The new rule of thumb is that 95% of
the main ping data
<https://chuttenblog.wordpress.com/2017/09/12/two-days-or-how-long-until-the-data-is-in/>
is available within 2 days, measured from activity in the browser to being
available for analysis.
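
As a rough illustration of how such a rule of thumb can be derived, the
sketch below computes the delay by which a given fraction of pings has
arrived. The function name and the sample delays are hypothetical and made
up for this example; they are not taken from the actual pipeline or its data.

```python
import math


def delay_until_fraction(delays_hours, fraction=0.95):
    """Return the smallest observed delay by which `fraction` of pings arrived.

    Uses the nearest-rank percentile method on the observed delays.
    """
    if not delays_hours:
        raise ValueError("need at least one delay")
    ordered = sorted(delays_hours)
    # Nearest-rank: the ping at this 1-based rank completes the fraction.
    rank = math.ceil(fraction * len(ordered))
    return ordered[rank - 1]


# Made-up delays, in hours, between browser activity and availability.
sample = [1, 2, 3, 5, 8, 12, 20, 26, 30, 36,
          40, 44, 46, 47, 48, 48, 48, 48, 48, 90]
print(delay_until_fraction(sample) / 24, "days")
```

The nearest-rank method is only one way to define a percentile; an
interpolating method would give slightly different numbers on small samples.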

In Firefox Telemetry we can now record new probes from add-ons without
having to ride the trains, which greatly reduces shipping times for
instrumentation. This is available first with events
<https://medium.com/georg-fritzsche/recording-new-telemetry-from-add-ons-61d194568212>
in 56 and scalars
<https://www.a2p.it/wordpress/tech-stuff/mozilla/recording-telemetry-scalars-from-add-ons/>
in 58.

The new update ping
<https://chuttenblog.wordpress.com/2017/10/04/anatomy-of-a-firefox-update/>
provides a lower-latency signal for when updates are staged and
successfully applied. It is queryable through the telemetry_update_parquet
dataset
<https://docs.telemetry.mozilla.org/datasets/batch_view/update/reference.html>
.

Similarly, the new-profile ping
<https://www.a2p.it/wordpress/tech-stuff/mozilla/getting-firefox-data-faster-introducing-the-new-profile-ping/>
is a signal for new profiles and installations, which is now queryable
through the telemetry_new_profile_parquet dataset
<https://docs.telemetry.mozilla.org/datasets/batch_view/new_profile/reference.html>
.

The new first-shutdown ping
<https://firefox-source-docs.mozilla.org/toolkit/components/telemetry/telemetry/data/first-shutdown-ping.html>
helps us to better understand users that churn after the first session, by
sending a user's first-session data immediately on Firefox shutdown.

Enabling experimentation

This year saw a lot of cross-team work on enabling experimentation
workflows in Firefox. A focus was on enabling various SHIELD studies
<https://medium.com/@mgrimes/shield-studies-go-faster-bet-smarter-1010ae8d8e>
.

Here the experiments viewer <https://moz-experiments-viewer.herokuapp.com/>,
which provides a front-end view for inspecting how various metrics perform
in an experiment, saw a lot of improvements.

An experiments dataset is now available in Redash and Spark, which includes
data for SHIELD opt-in experiments and is based on the main_summary dataset.

The experiment_aggregates dataset now includes metadata about the
experiment, and its reliability and speed have improved significantly.

Other use-cases can build on the ping data from most experiments using
experiment annotations
<https://firefox-source-docs.mozilla.org/toolkit/components/telemetry/telemetry/collection/experiments.html>.
This data is available within 15 minutes in the telemetry-cohort data source
<https://gist.github.com/mreid-moz/7c0b32c9b9519f53d372a6c7b7af765d>.

Tools for exploring data

Our data tools make it easier to access and query the data we have. Here
our Redash installation at sql.telemetry.mozilla.org saw many improvements
including:

   - Query revision control and reversion.
   - Better security and usability for templated queries.
   - Schema browser and autocomplete usability and performance improvements.
   - Better support for Athena data sources.


Mission Control <https://wlach.github.io/blog/2017/10/mission-control/> is
a new tool, which makes key measures of a Firefox release, like crash
counts, available with low latency. An early version of it is now available
here <https://data-missioncontrol.dev.mozaws.net/#/>.
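
To make a measure like "crashes per 1k usage hours" concrete, here is a
minimal sketch of the normalization. The function name and the numbers are
hypothetical; this is not Mission Control's actual implementation.

```python
def crashes_per_1k_hours(crash_count, usage_hours):
    """Normalize a crash count to crashes per 1,000 usage hours."""
    if usage_hours <= 0:
        raise ValueError("usage_hours must be positive")
    # Multiply before dividing to keep round inputs exact.
    return crash_count * 1000 / usage_hours


# Made-up example: 30 crashes observed over 12,000 usage hours.
print(crashes_per_1k_hours(30, 12000))
```

Normalizing by usage hours keeps the measure comparable between releases
and channels with very different population sizes.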

On the Firefox side, about:telemetry got a major redesign
<https://chuttenblog.wordpress.com/2017/08/30/the-photonization-of-abouttelemetry/>,
which makes it easier to navigate, adds a global search and aligns it with
the Photon design.

Powering data analysis

To make analysis more effective, two new datasets were added:

   - clients_daily
     <https://github.com/mreid-moz/firefox-data-docs/blob/6d5076cd2746639874704d8b00af431f4e60c5e7/datasets/mozetl/clients_daily/intro.md>,
     which summarizes main ping data into one row per client and day.
   - heavy_users
     <https://mail.mozilla.org/pipermail/fx-data-dev/2017-October/000071.html>
     (docs
     <https://docs.telemetry.mozilla.org/concepts/choosing_a_dataset.html#heavyusers>),
     which has a similar format, but contains only clients that match our
     definition of "heavy users".
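
The idea of "one row per client and day" can be sketched in plain Python as
follows. The field names (client_id, date, hours) and the chosen aggregates
are assumptions made for illustration; they do not reflect the actual
clients_daily schema or the job that produces it.

```python
from collections import defaultdict


def summarize_daily(pings):
    """Collapse ping records into one summary row per (client, day)."""
    rows = defaultdict(lambda: {"pings": 0, "hours": 0.0})
    for ping in pings:
        key = (ping["client_id"], ping["date"])
        rows[key]["pings"] += 1
        rows[key]["hours"] += ping["hours"]
    return dict(rows)


# Made-up pings: client "a" sent two pings on the same day.
pings = [
    {"client_id": "a", "date": "2017-10-01", "hours": 1.5},
    {"client_id": "a", "date": "2017-10-01", "hours": 0.5},
    {"client_id": "b", "date": "2017-10-01", "hours": 3.0},
    {"client_id": "a", "date": "2017-10-02", "hours": 2.0},
]
daily = summarize_daily(pings)
print(daily[("a", "2017-10-01")])
```

Analyses that only need per-day activity can then read far fewer rows than
when working from individual pings.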


For analysis jobs run through ATMO <http://analysis.telemetry.mozilla.org/>,
reliability was greatly improved, which resulted in a big decrease in job
failures.

Also, support was added for using different EMR versions with different
ATMO installations, allowing us to test changes to our EMR configuration
much more thoroughly prior to deployment.

What is up next?

Some of the things that we will work on in the next months include:

   - Firefox 56 saw data preference changes in the UI; we will follow up to
     align some Telemetry behavior
     <https://docs.google.com/document/d/12U_s9zHvpt7iGEMF5DazSzzx3whGbCnVWEsZTyclta0/>.
   - Databricks is being actively evaluated, with the goal of improving
     analysis productivity and reliability.
   - Further usability improvements to the current Experiments Viewer, and
     significant work on a ground-up rewrite
     <https://github.com/mozilla/firefox-test-tube>.
   - Providing a dataset for "one day retention" analysis.
   - A generic HTTP endpoint, moz_ingest, will be available to accept
     non-telemetry data. Data can be posted in any format, but if it is JSON
     it can automatically tie into our schema validation capabilities.
   - Collaboration with the Activity Stream team on bringing our event
     pipelines together.
   - Activity Stream is cross-checking & augmenting their experiment
     pipeline’s results with the *experiments_summary* dataset.

Get in touch

Please reach out to us with any questions or concerns.

   - You can find us on IRC in #telemetry and #datapipeline.
   - We are available on Slack in #fx-metrics.
   - The main mailing list for data topics is fx-data-dev
     <https://mail.mozilla.org/listinfo/fx-data-dev>.
   - Bugs can be filed in one of these components
     <https://wiki.mozilla.org/Telemetry#Filing_Bugs>.
   - You can also find us on Twitter as @MozTelemetry
     <https://twitter.com/moztelemetry>.
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
