Hi,

As part of my first assignment, I'll recompute our historical webrequest dataset, adding client_ip and geocoded information.

While it seems correct to compute historical client_ip from the existing ip and x_forwarded_for fields, using the current state of the MaxMind geocoding library to compute historical data is more error-prone, since IP-to-location mappings change over time. I can either compute it anyway, knowing that there will be some errors, or put null values for data older than a given point in time.

I'll launch the script to recompute the data as soon as max(a consensus is found on this matter, operations gives me the rights to run the script) :)

Thanks
-- 
*Joseph Allemandou*
Data Engineer @ Wikimedia Foundation
IRC: joal
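
To make the two options above concrete, here is a minimal Python sketch, not the actual recompute script. It assumes that client_ip is the right-most address in the x_forwarded_for chain that is not a trusted proxy, and that geocoding is simply skipped before a cutoff date. TRUSTED_PROXIES, GEO_CUTOFF, the treatment of "-" as an empty header, and the lookup callable are all hypothetical placeholders.

```python
import ipaddress
from datetime import date

# Hypothetical values: the real proxy ranges and cutoff date are still to be decided.
TRUSTED_PROXIES = [ipaddress.ip_network("10.0.0.0/8")]
GEO_CUTOFF = date(2015, 1, 1)


def _parse(addr):
    """Return an ip_address object, or None if addr is not a valid IP."""
    try:
        return ipaddress.ip_address(addr)
    except ValueError:
        return None


def client_ip(ip, x_forwarded_for):
    """Walk the proxy chain right to left and return the first address
    that is not a trusted proxy; fall back to the connecting ip."""
    raw = x_forwarded_for or ""
    # Assumption: "-" marks an empty header in the logs.
    chain = [p.strip() for p in raw.split(",") if p.strip() not in ("", "-")]
    chain.append(ip)
    for candidate in reversed(chain):
        parsed = _parse(candidate)
        if parsed is None:
            continue  # skip unparseable entries
        if not any(parsed in net for net in TRUSTED_PROXIES):
            return candidate
    return ip  # every hop was a trusted proxy


def geocode_or_null(record_date, ip, lookup):
    """Return lookup(ip), or None for records older than the cutoff.
    lookup is any callable wrapping the geocoding library."""
    if record_date < GEO_CUTOFF:
        return None
    return lookup(ip)
```

With this shape, moving the cutoff (or dropping it, if we agree to compute everything anyway) is a one-line change.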
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
