Because of ongoing data-quality issues in Firefox Health Report (FHR),
we plan to change the way data is sent to and stored on the server.

Currently Firefox and the server try to cooperate in storing only one
record for each browser instance; they do this by creating a new upload
ID on each upload and removing the document stored under the prior
upload ID. In practice this system is not working: backups, profile
copying, machine imaging, and similar scenarios are causing "orphan"
documents in the FHR dataset, skewing many kinds of statistics that FHR
was designed to collect.
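
To make the orphaning failure concrete, here is a minimal Python sketch
of the rotating-ID scheme described above. It is not the real client or
server code: the Server and Client classes, the in-memory dict, and the
payload shape are hypothetical stand-ins, and restoring a backup is just
one illustrative way an orphan can arise.

```python
import uuid


class Server:
    """Hypothetical in-memory stand-in for the FHR document store."""

    def __init__(self):
        self.documents = {}  # upload ID -> payload

    def upload(self, new_id, payload, old_id=None):
        # Store the new document, then remove the one it supersedes.
        self.documents[new_id] = payload
        if old_id is not None:
            self.documents.pop(old_id, None)  # no-op if already gone


class Client:
    """Hypothetical client that remembers only its last upload ID."""

    def __init__(self):
        self.last_upload_id = None

    def upload(self, server, payload):
        new_id = str(uuid.uuid4())  # fresh ID on every upload
        server.upload(new_id, payload, old_id=self.last_upload_id)
        self.last_upload_id = new_id


def demo_orphan():
    server = Server()
    client = Client()

    client.upload(server, {"day": 1})
    backup_id = client.last_upload_id   # profile is backed up here

    client.upload(server, {"day": 2})   # replaces the day-1 document

    client.last_upload_id = backup_id   # profile restored from backup
    client.upload(server, {"day": 3})   # tries to delete day-1 again

    # The day-2 document is never removed: it is now an orphan.
    return len(server.documents)
```

In this sketch a single browser instance ends up with two documents on
the server, because the restored profile no longer knows the ID of the
document uploaded after the backup was taken.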

We are changing this system so that every upload from a browser
instance uses a single, stable identifier.

The original varying document ID was developed to improve user privacy.
In reality it doesn't help: to associate FHR data with a particular
person you would have to obtain the ID from their client, but the client
already stores all of the relevant data. The rotating server ID
therefore provides no additional privacy benefit.

For a transition period, we are going to add the stable identifier to
the FHR payload: this will allow us to reliably measure the orphaning
problem and give us time to add error-handling and logging code to the
collection servers. After we've verified that the stable ID isn't
causing new problems, we will switch the client to upload using the
stable ID.

In the case of profile copying and machine imaging, we may end up in a
state where multiple Firefox profiles are uploading data to the same
identifier. This may not be a problem in practice, but if we do measure
this happening, we have a plan to address the issue:

On upload, the server will compare the new data with the existing data
for that ID. If the data doesn't match, the server will log the affected
documents and mark the ID as inactive. The next time any client tries
to upload to an inactive ID, the server will instruct the client to
generate and switch to a new random ID.
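
The conflict-handling plan above could look roughly like the following
Python sketch. Everything in it is an assumption made for illustration:
the status strings, the StableIdServer class, the client_upload helper,
and in particular the histories_match rule (the plan does not define
what counts as a "match"; the sketch assumes a payload is a dict keyed
by day and that days present in both documents must be identical). The
sketch also assumes a conflicting upload is not stored.

```python
import logging
import uuid

log = logging.getLogger("fhr.sketch")


def histories_match(existing, new):
    # Assumed rule: any day present in both payloads must agree.
    shared_days = existing.keys() & new.keys()
    return all(existing[day] == new[day] for day in shared_days)


class StableIdServer:
    """Hypothetical in-memory model of the collection server."""

    def __init__(self):
        self.documents = {}    # stable ID -> payload
        self.inactive = set()  # IDs that saw conflicting uploads

    def upload(self, stable_id, payload):
        if stable_id in self.inactive:
            # Tell the client to generate and switch to a new ID.
            return "regenerate_id"
        existing = self.documents.get(stable_id)
        if existing is not None and not histories_match(existing, payload):
            log.warning("conflicting uploads for %s: %r vs %r",
                        stable_id, existing, payload)
            self.inactive.add(stable_id)
            return "marked_inactive"
        self.documents[stable_id] = payload
        return "stored"


def client_upload(server, state, payload):
    # state is a hypothetical per-profile dict holding the stable ID.
    status = server.upload(state["id"], payload)
    if status == "regenerate_id":
        state["id"] = str(uuid.uuid4())
        status = server.upload(state["id"], payload)
    return status
```

With two copied profiles that share an ID, the first mismatching upload
marks the ID inactive, and each profile's next upload moves it to its
own fresh random ID, so the two histories stop overwriting each other.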
The client-side bug for the initial stage of this work is bug 968419.
Please direct any questions or concerns to the fhr-dev mailing list.
--BDS
_______________________________________________
governance mailing list
https://lists.mozilla.org/listinfo/governance