>Adding to this, one thing to consider is DNT - is there a way to invoke EL so that such traffic is appropriately imputed or something?
I am not sure what you are asking ...

On Thu, Jan 18, 2018 at 1:57 PM, Adam Baso <[email protected]> wrote:

> (I'd defer to the Readers Web team with Tilman on whether country
> extracted from the cookie would be sufficient.)
>
> Adding to this, one thing to consider is DNT - is there a way to invoke EL
> so that such traffic is appropriately imputed or something?
>
> -Adam
>
> On Thu, Jan 18, 2018 at 2:13 PM, Andrew Otto <[email protected]> wrote:
>
>> > In particular, will we be able to sort by country, OS, Browser, etc?
>> OS, Browser, yes. User Agent parsing is done by the EventLogging
>> processors.
>>
>> Country not quite as easily, as EventLogging does not include client
>> IP addresses. We could consider putting this back in somehow, or, I’ve
>> also heard that there is a geocoded country cookie that varnish will set
>> that the browser could send back as part of the event. Is country enough
>> geo detail?
>>
>> On Thu, Jan 18, 2018 at 2:30 PM, Olga Vasileva <[email protected]>
>> wrote:
>>
>>> Hi all,
>>>
>>> I just want to confirm that the proposed method using Eventlogging will
>>> allow us to gather data in a similar fashion to the web request table. In
>>> particular, will we be able to sort by country, OS, Browser, etc? Our goal
>>> here is to be able to consider the new page interactions metric on the same
>>> level and with the same depth as pageviews.
>>>
>>> Thanks!
>>>
>>> - Olga
>>>
>>> On Thu, Jan 18, 2018 at 12:46 PM Andrew Otto <[email protected]> wrote:
>>>
>>>> > the beacon puts the record into the webrequest table and from there
>>>> it would only take some trivial preprocessing
>>>> ‘Trivial’ preprocessing that has to look through 150K requests per
>>>> second! This is a lot of work!
>>>>
>>>> > tracking of events is better done on an event based system and EL is
>>>> such a system.
>>>> I agree with this too. We really want to discourage people from trying
>>>> to measure things by searching through the huge haystack of all
>>>> webrequests. To measure something, you should emit an event if you can.
>>>> If it were practical, I’d prefer that we did this for pageviews as well.
>>>> Currently, we need a complicated definition of what a pageview is, which
>>>> really only exists in the Java implementation in the Hadoop cluster. It’d
>>>> be much clearer if app developers had a way to define themselves what
>>>> counts as a pageview, and emit that as an event.
>>>>
>>>> This should be the approach that people take when they want to measure
>>>> something new. Emit an event! This event will get its own Kafka topic
>>>> (you can consume this to do whatever you like with it), and be refined into
>>>> its own Hive table.
>>>>
>>>> > I don’t want to have to create that chart and export one dataset
>>>> from pageviews and one dataset from eventlogging to do that.
>>>> If you also design your schema nicely
>>>> <https://wikitech.wikimedia.org/wiki/Analytics/Systems/EventLogging/Schema_Guidelines>,
>>>> it will be easily importable into Druid and usable in Pivot and Superset,
>>>> alongside of pageviews. We’re working on getting nice schemas
>>>> automatically
>>>> imported into druid <https://gerrit.wikimedia.org/r/#/c/386882/>.
>>>>
>>>> On Thu, Jan 18, 2018 at 11:16 AM, Nuria Ruiz <[email protected]>
>>>> wrote:
>>>>
>>>>> Gergo,
>>>>>
>>>>> >while EventLogging data gets stored in a different, unrelated way
>>>>> Not really, This has changed quite a bit as of the last two quarters.
>>>>> Eventlogging data as of recent gets preprocessed and refined similar to
>>>>> how
>>>>> webrequest data is preprocessed and refined. You can have a dashboard on
>>>>> top of some eventlogging schemas on superset in the same way you have a
>>>>> dashboard that displays pageview data on superset.
>>>>>
>>>>> See dashboards on superset (user required).
>>>>>
>>>>> https://superset.wikimedia.org/superset/dashboard/7/?preselect_filters=%7B%7D
>>>>>
>>>>> And (again, user required) EL data on druid, this very same data we
>>>>> are talking about, page previews:
>>>>>
>>>>> https://pivot.wikimedia.org/#tbayer_popups
>>>>>
>>>>> >I was going to make the point that #2 already has a processing
>>>>> pipeline established whereas #1 doesn't.
>>>>> This is incorrect, we mark as "preview" data that we want to exclude
>>>>> from processing, see:
>>>>> https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-core/src/main/java/org/wikimedia/analytics/refinery/core/PageviewDefinition.java#L144
>>>>> Naming is unfortunate but previews are really "preloads" as in
>>>>> requests we make (and cache locally) and maybe shown to users or not.
>>>>>
>>>>> But again, tracking of events is better done on an event based system
>>>>> and EL is such a system.
>>>>>
>>>>> _______________________________________________
>>>>> Analytics mailing list
>>>>> [email protected]
>>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>
>>> --
>>> Olga Vasileva // Product Manager // Reading Web Team
>>> https://wikimediafoundation.org/
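To make the country-cookie idea from the thread concrete, here is a rough sketch of how a client might attach a country code to an event before sending it. Everything named here is an assumption for illustration: the cookie name `GeoIP`, its `COUNTRY:...` value format, the `InteractionEvent` shape, and the schema name are hypothetical and not taken from EventLogging. The DNT check reflects only one possible reading of the question above (do not emit an event when Do Not Track is on); how such traffic should actually be accounted for is still open.

```typescript
// Hypothetical event shape; field names are illustrative only and do not
// correspond to a real EventLogging schema.
interface InteractionEvent {
  schema: string;
  action: string;
  country?: string;
}

// Pull a country code out of a cookie header string.
// Assumption: a cookie (here called "GeoIP") whose value begins with a
// country code followed by ":". Both the name and the format are guesses.
function countryFromCookie(cookieHeader: string, name: string = "GeoIP"): string | undefined {
  for (const part of cookieHeader.split(";")) {
    const [key, ...rest] = part.trim().split("=");
    if (key === name) {
      const value = decodeURIComponent(rest.join("="));
      const country = value.split(":")[0];
      return country.length > 0 ? country : undefined;
    }
  }
  return undefined;
}

// Build the event payload, or return null when Do Not Track is enabled.
// Assumption: DNT clients simply emit nothing.
function buildEvent(
  action: string,
  cookieHeader: string,
  doNotTrack: string | null
): InteractionEvent | null {
  if (doNotTrack === "1") {
    return null;
  }
  const country = countryFromCookie(cookieHeader);
  return {
    schema: "HypotheticalPageInteraction",
    action,
    ...(country ? { country } : {}),
  };
}
```

In a browser this would be called with something like `buildEvent("preview-shown", document.cookie, navigator.doNotTrack)`, and the result handed to whatever sends the event. Keeping the functions free of browser globals makes the cookie parsing easy to check in isolation.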
