(I'd defer to the Reading Web team, with Tilman, on whether country extracted
from the cookie would be sufficient.)

Adding to this, one thing to consider is DNT (Do Not Track): is there a way
to invoke EL so that such traffic is appropriately imputed, or otherwise
accounted for?

-Adam

On Thu, Jan 18, 2018 at 2:13 PM, Andrew Otto <[email protected]> wrote:

> >  In particular, will we be able to sort by country, OS, Browser, etc?
> OS, Browser, yes.  User Agent parsing is done by the EventLogging
> processors.
>
> Country not quite as easily, as EventLogging does not include client
> IP addresses.  We could consider putting this back in somehow, or, I’ve
> also heard that there is a geocoded country cookie that Varnish will set
> that the browser could send back as part of the event.  Is country enough
> geo detail?
>
>
>
> On Thu, Jan 18, 2018 at 2:30 PM, Olga Vasileva <[email protected]>
> wrote:
>
>> Hi all,
>>
>> I just want to confirm that the proposed method using EventLogging will
>> allow us to gather data in a similar fashion to the webrequest table.  In
>> particular, will we be able to sort by country, OS, Browser, etc?  Our goal
>> here is to be able to consider the new page interactions metric on the same
>> level and with the same depth as pageviews.
>>
>> Thanks!
>>
>> - Olga
>>
>> On Thu, Jan 18, 2018 at 12:46 PM Andrew Otto <[email protected]> wrote:
>>
>>> > the beacon puts the record into the webrequest table and from there
>>> it would only take some trivial preprocessing
>>> ‘Trivial’ preprocessing that has to look through 150K requests per
>>> second! This is a lot of work!
>>>
>>> > tracking of events is better done on an event based system and EL is
>>> such a system.
>>> I agree with this too.  We really want to discourage people from trying
>>> to measure things by searching through the huge haystack of all
>>> webrequests.  To measure something, you should emit an event if you can.
>>> If it were practical, I’d prefer that we did this for pageviews as well.
>>> Currently, we need a complicated definition of what a pageview is, which
>>> really only exists in the Java implementation in the Hadoop cluster.  It’d
>>> be much clearer if app developers had a way to define themselves what
>>> counts as a pageview, and emit that as an event.
>>>
>>> This should be the approach that people take when they want to measure
>>> something new.  Emit an event!  This event will get its own Kafka topic
>>> (you can consume this to do whatever you like with it), and be refined into
>>> its own Hive table.
>>>
>>> >  I don’t want to have to create that chart and export one dataset
>>> from pageviews and one dataset from eventlogging to do that.
>>> If you also design your schema nicely
>>> <https://wikitech.wikimedia.org/wiki/Analytics/Systems/EventLogging/Schema_Guidelines>,
>>> it will be easily importable into Druid and usable in Pivot and Superset,
>>> alongside pageviews.  We’re working on getting nice schemas automatically
>>> imported into Druid <https://gerrit.wikimedia.org/r/#/c/386882/>.
>>>
>>>
>>>
>>>
>>> On Thu, Jan 18, 2018 at 11:16 AM, Nuria Ruiz <[email protected]>
>>> wrote:
>>>
>>>> Gergo,
>>>>
>>>> >while EventLogging data gets stored in a different, unrelated way
>>>> Not really. This has changed quite a bit over the last two quarters.
>>>> EventLogging data now gets preprocessed and refined much as webrequest
>>>> data is preprocessed and refined. You can have a dashboard on top of some
>>>> EventLogging schemas in Superset in the same way you have a dashboard
>>>> that displays pageview data in Superset.
>>>>
>>>> See the dashboards on Superset (user account required):
>>>>
>>>> https://superset.wikimedia.org/superset/dashboard/7/?preselect_filters=%7B%7D
>>>>
>>>> And (again, user account required) EL data in Druid, the very same data
>>>> we are talking about, page previews:
>>>>
>>>> https://pivot.wikimedia.org/#tbayer_popups
>>>>
>>>>
>>>> >I was going to make the point that #2 already has a processing
>>>> pipeline established whereas #1 doesn't.
>>>> This is incorrect: we mark as "preview" the data that we want to exclude
>>>> from processing; see:
>>>> https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/PageviewDefinition.java#L144
>>>> The naming is unfortunate, but these previews are really "preloads":
>>>> requests that we make (and cache locally) and that may or may not be
>>>> shown to users.
>>>>
>>>>
>>>> But again, tracking of events is better done on an event based system
>>>> and EL is such a system.
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Analytics mailing list
>>>> [email protected]
>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>>
>>>>
>>
>>
>> --
>> Olga Vasileva // Product Manager // Reading Web Team
>> https://wikimediafoundation.org/
>>
>
