> Are there any plans to integrate the connection type binary? (Sorry to
> ask endless questions, but this is my jam :D)
Oliver, you are our end user, guide us!
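
For anyone reading this thread later, here is a minimal Python sketch of how
the geocoded_data map quoted below could be read once a row has been exported.
It is only an illustration: the helper names geo_field and coordinates are
hypothetical, and the JSON export format is an assumption. The record is the
example given in the thread. Because the fields are filled in only "when
available", the sketch treats every key as optional.

```python
import json

# Example record copied from the thread; every value is a string.
RAW = (
    '{"city":"Mukilteo","country_code":"US","longitude":"-122.3042",'
    '"subdivision":"Washington","timezone":"America/Los_Angeles",'
    '"postal_code":"98275","continent":"North America",'
    '"latitude":"47.913","country":"United States"}'
)


def geo_field(geocoded_data, key, default=None):
    """Return one geocoded value, or `default` if the map or key is missing."""
    if not geocoded_data:
        return default
    return geocoded_data.get(key, default)


def coordinates(geocoded_data):
    """Return (latitude, longitude) as floats, or None if either is absent."""
    lat = geo_field(geocoded_data, "latitude")
    lon = geo_field(geocoded_data, "longitude")
    if lat is None or lon is None:
        return None
    return float(lat), float(lon)


record = json.loads(RAW)
country_code = geo_field(record, "country_code")
```

Values such as latitude and longitude arrive as strings in the example, so
they need an explicit conversion before any numeric use.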



> On Feb 23, 2015, at 15:13, Oliver Keyes <[email protected]> wrote:
> 
> Neat! And those can then be accessed with say,
> geocoded_data['country_code'] in hive?
> 
> Are there any plans to integrate the connection type binary? (Sorry to
> ask endless questions, but this is my jam :D)
> 
> On 23 February 2015 at 15:00, Joseph Allemandou
> <[email protected]> wrote:
>> Oops sorry, I forgot to answer this question :)
>> A new map field named "geocoded_data" will contain, when available:
>> 
>> continent
>> country
>> country_code
>> subdivision
>> postal_code
>> city
>> timezone
>> latitude
>> longitude
>> 
>> For instance:
>> {"city":"Mukilteo","country_code":"US","longitude":"-122.3042","subdivision":"Washington","timezone":"America/Los_Angeles","postal_code":"98275","continent":"North America","latitude":"47.913","country":"United States"}
>> 
>> Cheers
>> Joseph
>> 
>> On Mon, Feb 23, 2015 at 8:24 PM, Oliver Keyes <[email protected]> wrote:
>>> 
>>> Gotcha. So, for transparency...what are we calculating? Country? City? :D
>>> 
>>> On 23 February 2015 at 13:59, Joseph Allemandou
>>> <[email protected]> wrote:
>>>> As per the IRC discussion, we won't recompute historical data, but will
>>>> start computing new values from the deploy time onward.
>>>> A new "version" field and associated documentation will also be provided,
>>>> making it possible to follow changes over time.
>>>> Thanks for your input!
>>>> Best
>>>> 
>>>> 
>>>> On Mon, Feb 23, 2015 at 4:58 PM, Oliver Keyes <[email protected]> wrote:
>>>>> 
>>>>> I think it should be fine-ish; it depends what we're calculating. When
>>>>> you say "geocoded information", what do you mean? Country? City? I
>>>>> wouldn't expect country to move about a lot in 60 days (which is the
>>>>> range of our data): I would expect city to.
>>>>> 
>>>>> What's the status on getting an oozie job or similar to compute going
>>>>> forward? To me that's more of a priority than historical data.
>>>>> 
>>>>> On 23 February 2015 at 10:53, Joseph Allemandou
>>>>> <[email protected]> wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> As part of my first assignment, I'll recompute our historical webrequest
>>>>>> dataset, adding client_ip and geocoded information.
>>>>>> 
>>>>>> While it seems correct to compute historical client_ip based on the
>>>>>> existing ip and the x_forwarded_for, using the current state of the
>>>>>> MaxMind geocoding library to compute historical data is more error-prone.
>>>>>> 
>>>>>> I can either compute it anyway, knowing that there'll be some errors, or
>>>>>> put null values for data older than a given point in time.
>>>>>> 
>>>>>> I'll launch the script to recompute the data as soon as max(a consensus
>>>>>> is found on this matter, operations gives me the right to run the script)
>>>>>> 
>>>>>> Thanks
>>>>>> --
>>>>>> Joseph Allemandou
>>>>>> Data Engineer @ Wikimedia Foundation
>>>>>> IRC: joal
>>>>>> 
>>>>>> _______________________________________________
>>>>>> Analytics mailing list
>>>>>> [email protected]
>>>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Oliver Keyes
>>>>> Research Analyst
>>>>> Wikimedia Foundation
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Joseph Allemandou
>>>> Data Engineer @ Wikimedia Foundation
>>>> IRC: joal
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Oliver Keyes
>>> Research Analyst
>>> Wikimedia Foundation
>>> 
>> 
>> 
>> 
>> 
>> --
>> Joseph Allemandou
>> Data Engineer @ Wikimedia Foundation
>> IRC: joal
>> 
>> 
> 
> 
> 
> -- 
> Oliver Keyes
> Research Analyst
> Wikimedia Foundation
> 


