Thanks, Dan! I just expanded it even more (correcting my earlier error
confusing the privacy policy and data retention guidelines). See what you
think of this:

*Since this raw data identifies the location of individual editors, we keep
it for only 90 days, in accordance with our data retention guidelines
<https://meta.wikimedia.org/wiki/Data_retention_guidelines>. Data older
than 90 days is continuously purged from the source cu_changes table, but
since we regenerate the Data Lake's editing data every month
<https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster#Edit_data>,
we instead keep data in mediawiki_private_cu_changes and geoeditors_daily
for the two latest calendar months (the month of the latest
mediawiki_history
<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits/Mediawiki_history>
snapshot and the previous). Older data may be temporarily available before
it is purged, but you should not rely on this.*
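
In case it helps to make the two-month window concrete, here is a rough sketch of how I read that rule. It is only an illustration: the function names are made up, it is not the actual purge job, and it assumes the latest mediawiki_history snapshot can be identified by any date within its month.

```python
from datetime import date
from typing import List


def retained_months(snapshot_month: date) -> List[str]:
    """Return the two calendar months that are kept, as 'YYYY-MM' strings.

    Hypothetical helper: `snapshot_month` is any date inside the month of
    the latest mediawiki_history snapshot (an assumption of this sketch).
    """
    year, month = snapshot_month.year, snapshot_month.month
    # Step back one calendar month, wrapping over the year boundary.
    if month == 1:
        prev_year, prev_month = year - 1, 12
    else:
        prev_year, prev_month = year, month - 1
    return [
        f"{prev_year:04d}-{prev_month:02d}",
        f"{year:04d}-{month:02d}",
    ]


def is_retained(row_date: date, snapshot_month: date) -> bool:
    """True if a row dated `row_date` falls inside the retained window."""
    return row_date.strftime("%Y-%m") in retained_months(snapshot_month)
```

For example, with a snapshot for October 2018, rows from September and October 2018 would count as retained, while anything from August or earlier might still be around for a while but should not be relied on.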

On Tue, 20 Nov 2018 at 12:43, Dan Andreescu <dandree...@wikimedia.org>
wrote:

> That's right, Neil, I just changed the language around a bit, thanks for
> updating that!
>
> On Tue, Nov 20, 2018 at 3:26 PM Neil Patel Quinn <nqu...@wikimedia.org>
> wrote:
>
>> Hey there!
>>
>> Could someone from Analytics clarify the purging schedule for
>> geoeditors_daily and add it on Wikitech
>> <https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits/Geoeditors#Generation>?
>> I've added some information based on my experience using the dataset, but
>> it may not be fully accurate.
>>
>> I wrote:
>> *Because these tables contain the countries of individual editors, we
>> only keep the data corresponding to the two most recent full months (the
>> month of the latest mediawiki_history snapshot and the previous).*
>> _______________________________________________
>> Analytics mailing list
>> Analytics@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>
