The data platform and tools teams work on our core Telemetry system and
the data pipeline, provide core datasets, and maintain some central
data viewing tools.

To make new work more visible, we intend to provide quarterly updates.

What's new in the last few months?

On the data collection side, scalars
<https://gecko.readthedocs.io/en/latest/toolkit/components/telemetry/telemetry/collection/scalars.html>
are now supported through the pipeline. New flag and count histograms
are therefore disallowed on Desktop in favour of boolean and uint scalars.

Event Telemetry
<https://gecko.readthedocs.io/en/latest/toolkit/components/telemetry/telemetry/collection/events.html>
is now ready for adoption. A general events table
<https://sql.telemetry.mozilla.org/queries/3415/source#table> is available,
a sync events table is coming up, and further uses are being looked at.

For documentation, we re-worked the guide for adding new Telemetry probes
<https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Adding_a_new_Telemetry_probe>
and extended the detailed data collection documentation
<https://gecko.readthedocs.io/en/latest/toolkit/components/telemetry/telemetry/collection/>.

The prototype for making probe history
<https://georgf.github.io/fx-data-explorer/> more discoverable now has
daily updates and supports Nightly too.

For filing or finding bugs, there is now a new Data Platform and Tools
<https://bugzilla.mozilla.org/describecomponents.cgi?product=Data%20Platform%20and%20Tools>
product. Note that client-side bugs still go into the separate
Toolkit::Telemetry component
<https://bugzilla.mozilla.org/enter_bug.cgi?product=Toolkit&component=Telemetry>.

The data pipeline work powers results for re:dash
<https://sql.telemetry.mozilla.org/> and custom analysis
<https://analysis.telemetry.mozilla.org/> among other things.

Notable recent work here includes:

   - Providing efficient lookup of client histories using HBase
     <https://python-moztelemetry.readthedocs.io/en/stable/userguide.html#module-moztelemetry.hbase>.
   - Experimental support for Zeppelin
     <https://mail.mozilla.org/pipermail/fhr-dev/2017-March/001210.html>, a
     new notebook type that improves on Jupyter.
   - The Telemetry dashboard
     <https://telemetry.mozilla.org/new-pipeline/dist.html> is now faster
     through a dedicated read replica and client-side caching.
   - The Dataset API now has a select method
     <http://python-moztelemetry.readthedocs.io/en/stable/userguide.html#moztelemetry.dataset.Dataset.select>
     to return a subset of fields.
   - Providing a framework for testable Python ETL jobs
     <https://github.com/mozilla/python_etl> generated from a template
     <https://github.com/harterrt/cookiecutter-python-etl>.
   - Direct-to-parquet
     <https://mozilla-services.github.io/lua_sandbox_extensions/parquet/sandboxes/heka/output/s3_parquet.html>
     is in production, making it easier to build datasets from incoming pings.


The data tools work powers tools that make data analysis more accessible
across Mozilla.

Updates here are:

   - For re:dash <https://sql.telemetry.mozilla.org/>, the UI improved to
     make the dashboard list more accessible.
   - re:dash query issues were reduced by handling failing queries using
     exponential back-off.
   - There is also a Python re:dash client
     <https://github.com/mozilla/redash_client> (h/t to emtwo), allowing
     programmatic generation of queries and dashboards.
   - The distribution viewer <https://gauss.telemetry.mozilla.org/> is now
     live, making distributions of a set of important Firefox metrics available.
   - The analysis service <http://analysis.telemetry.mozilla.org/> gained
     features
     <https://github.com/mozilla/telemetry-analysis-service/blob/master/WHATSNEW.md>
     like persistent cluster storage and the ability to extend cluster lifetimes.
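
For readers unfamiliar with the exponential back-off mentioned above for
failing re:dash queries, here is a minimal Python sketch of the general
technique. It is not the actual re:dash change: the function name
run_with_backoff, its parameters and the default delays are all assumptions
made up for illustration.

```python
import time


def run_with_backoff(run_query, max_attempts=5, base_delay=1.0,
                     factor=2.0, max_delay=60.0, sleep=time.sleep):
    """Call run_query(), retrying failures with exponentially growing waits.

    All names and defaults here are hypothetical, not taken from re:dash.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return run_query()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)  # wait before the next try
            # Grow the wait geometrically, but never beyond max_delay.
            delay = min(delay * factor, max_delay)
```

The sleep function is passed in as a parameter so the waits can be replaced
with a stub in tests. Capping the delay keeps a long run of failures from
producing unbounded waits.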


Coming soon

For the next few months, interesting projects in the pipeline include:

   - Work to decrease data latency, by sending the last ping of a Firefox
     session immediately. We will also start sending timely pings for new users
     and updates.
   - Rebooting documentation
     <https://docs.google.com/presentation/d/1zWbzDCNkM5tzR9K6WgO4vR7fpiuJDP-JBNLrYDsbeUA/edit#slide=id.g1d58c03b5b_0_1>,
     providing guidance as well as tying existing documentation together.
   - Supporting new data collection from add-ons in Telemetry, starting
     with events.


Contact us

Please reach out to us with any questions or concerns.

You can find us on IRC in #telemetry and #datapipeline.

The main mailing list for data topics is fhr-dev
<https://mail.mozilla.org/listinfo/fhr-dev>.

Bugs can be filed in one of these components
<https://wiki.mozilla.org/Telemetry#Filing_Bugs>.

You can also find us on Twitter as @MozTelemetry
<https://twitter.com/moztelemetry>.
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform