> Are there any plans to integrate the connection type binary? (Sorry to
> ask endless questions, but this is my jam :D)

Oliver, you are our end user, guide us!

> On Feb 23, 2015, at 15:13, Oliver Keyes <[email protected]> wrote:
>
> Neat! And those can then be accessed with, say,
> geocoded_data['country_code'] in Hive?
>
> Are there any plans to integrate the connection type binary? (Sorry to
> ask endless questions, but this is my jam :D)
>
> On 23 February 2015 at 15:00, Joseph Allemandou
> <[email protected]> wrote:
>> Oops sorry, I forgot to answer this question :)
>> A new map field named "geocoded_data" will contain, when available:
>>
>> continent
>> country
>> country_code
>> subdivision
>> postal_code
>> city
>> timezone
>> latitude
>> longitude
>>
>> For instance:
>> {"city":"Mukilteo","country_code":"US","longitude":"-122.3042","subdivision":"Washington","timezone":"America/Los_Angeles","postal_code":"98275","continent":"North America","latitude":"47.913","country":"United States"}
>>
>> Cheers
>> Joseph
>>
>> On Mon, Feb 23, 2015 at 8:24 PM, Oliver Keyes <[email protected]> wrote:
>>>
>>> Gotcha. So, for transparency...what are we calculating? Country? City? :D
>>>
>>> On 23 February 2015 at 13:59, Joseph Allemandou
>>> <[email protected]> wrote:
>>>> As per the IRC discussion, we won't recompute historical data, but start
>>>> computing new values from the deploy time onward.
>>>> A new "version" field and associated documentation will also be
>>>> provided, allowing changes to be followed over time.
>>>> Thanks for your input!
>>>> Best
>>>>
>>>> On Mon, Feb 23, 2015 at 4:58 PM, Oliver Keyes <[email protected]>
>>>> wrote:
>>>>>
>>>>> I think it should be fine-ish; it depends what we're calculating. When
>>>>> you say "geocoded information", what do you mean? Country? City? I
>>>>> wouldn't expect country to move about a lot in 60 days (which is the
>>>>> range of our data): I would expect city to.
>>>>>
>>>>> What's the status on getting an Oozie job or similar to compute going
>>>>> forward? To me that's more of a priority than historical data.
>>>>>
>>>>> On 23 February 2015 at 10:53, Joseph Allemandou
>>>>> <[email protected]> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> As part of my first assignment, I'll recompute our historical
>>>>>> webrequest dataset, adding client_ip and geocoded information.
>>>>>>
>>>>>> While it seems correct to compute historical client_ip based on the
>>>>>> existing ip and the x_forwarded_for, the use of the current state of
>>>>>> the MaxMind geocoding library to compute historical data is more
>>>>>> error-prone.
>>>>>>
>>>>>> I can either compute it anyway, knowing that there'll be some errors,
>>>>>> or put null values for data older than a given point in time.
>>>>>>
>>>>>> I'll launch the script to recompute the data as soon as max(a
>>>>>> consensus is found on this matter, operations gives me the right to
>>>>>> run the script) :)
>>>>>>
>>>>>> Thanks
>>>>>> --
>>>>>> Joseph Allemandou
>>>>>> Data Engineer @ Wikimedia Foundation
>>>>>> IRC: joal
>>>>>>
>>>>>> _______________________________________________
>>>>>> Analytics mailing list
>>>>>> [email protected]
>>>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>>>
>>>>> --
>>>>> Oliver Keyes
>>>>> Research Analyst
>>>>> Wikimedia Foundation
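For readers following along, the key lookup Oliver asks about (geocoded_data['country_code']) can be tried out locally on the example record quoted in the thread. This is only a minimal Python sketch, not the Hive implementation: the helper names get_geo and coordinates are hypothetical, and it assumes that every value in the map is a string (as in the quoted example) and that any key may be missing, since the fields are only filled "when available".

```python
import json

# Example record copied from the thread. All values are strings there,
# so this sketch assumes the map is string -> string.
EXAMPLE = (
    '{"city":"Mukilteo","country_code":"US","longitude":"-122.3042",'
    '"subdivision":"Washington","timezone":"America/Los_Angeles",'
    '"postal_code":"98275","continent":"North America",'
    '"latitude":"47.913","country":"United States"}'
)

# Keys listed in the thread for the geocoded_data map.
GEOCODED_KEYS = (
    "continent", "country", "country_code", "subdivision",
    "postal_code", "city", "timezone", "latitude", "longitude",
)


def get_geo(geocoded_data, key):
    """Hypothetical helper: return the value for a known key, or None if absent."""
    if key not in GEOCODED_KEYS:
        raise KeyError(f"unknown geocoded_data key: {key}")
    # Fields are only present "when available", so a missing key is not an error.
    return geocoded_data.get(key)


def coordinates(geocoded_data):
    """Hypothetical helper: return (latitude, longitude) as floats, or None."""
    lat = get_geo(geocoded_data, "latitude")
    lon = get_geo(geocoded_data, "longitude")
    if lat is None or lon is None:
        return None
    return float(lat), float(lon)


record = json.loads(EXAMPLE)
print(get_geo(record, "country_code"))
print(coordinates(record))
```

Returning None for an absent key mirrors the "when available" wording; code reading the real field would probably want the same tolerance for missing entries.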
