As per the IRC discussion, we won't recompute historical data; instead, we'll start computing the new values from deploy time onward. A new "version" field and associated documentation will also be provided, so that changes can be tracked over time.

Thanks for your input!

Best
On Mon, Feb 23, 2015 at 4:58 PM, Oliver Keyes <[email protected]> wrote:

> I think it should be fine-ish; it depends what we're calculating. When
> you say "geocoded information", what do you mean? Country? City? I
> wouldn't expect country to move about a lot in 60 days (which is the
> range of our data): I would expect city to.
>
> What's the status on getting an oozie job or similar to compute going
> forward? To me that's more of a priority than historical data.
>
> On 23 February 2015 at 10:53, Joseph Allemandou
> <[email protected]> wrote:
> > Hi,
> >
> > As part of my first assignment, I'll recompute our historical webrequest
> > dataset, adding client_ip and geocoded information.
> >
> > While it seems correct to compute historical client_ip based on the existing
> > ip and the x_forwarded_for, the use of the current state of the geocoded
> > maxmind library to compute historical data is more error-prone.
> >
> > I can either compute it anyway, knowing that there'll be some errors, or put
> > null values for data older than a given point in time.
> >
> > I'll launch the script to recompute the data as soon as max(a consensus is
> > found on this matter, operations gives me the right to run the script) :)
> >
> > Thanks
> > --
> > Joseph Allemandou
> > Data Engineer @ Wikimedia Foundation
> > IRC: joal
>
> --
> Oliver Keyes
> Research Analyst
> Wikimedia Foundation

--
*Joseph Allemandou*
Data Engineer @ Wikimedia Foundation
IRC: joal
_______________________________________________ Analytics mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/analytics
