Hi Adrian, webrequest_source is a column by which the data is partitioned (in addition to the year/month/day/hour columns).
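As a rough illustration of what partitioning by these columns means in practice, here is a minimal sketch that builds a query restricted to a single partition. The column names come from this thread; the table name wmf.webrequest, the integer type of the time columns, and the helper names partition_filter and build_query are assumptions made for illustration only.

```python
def partition_filter(source, year, month, day, hour=None):
    """Build a WHERE clause that restricts a query to specific partitions."""
    # webrequest_source is treated as a string partition column; the
    # year/month/day/hour columns are assumed to be integers.
    if "'" in source:
        raise ValueError("source must not contain quotes")
    parts = [
        f"webrequest_source = '{source}'",
        f"year = {int(year)}",
        f"month = {int(month)}",
        f"day = {int(day)}",
    ]
    if hour is not None:
        parts.append(f"hour = {int(hour)}")
    return " AND ".join(parts)


def build_query(table, source, year, month, day, hour=None):
    """Build a hypothetical row-count query limited to the given partition."""
    where = partition_filter(source, year, month, day, hour)
    return f"SELECT COUNT(*) AS requests FROM {table} WHERE {where}"


# Hypothetical example: one hour of the 'misc' partitions.
query = build_query("wmf.webrequest", "misc", 2018, 5, 14, hour=6)
print(query)
```

Filtering on every partition column lets the query engine read only the matching partition instead of scanning the whole table.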
To my knowledge, WDQS-related requests go into the 'misc' partitions in the webrequest <https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Webrequest> table (from which this subset was extracted), while general traffic-related requests like Wikipedia pageviews are in the webrequest_source = 'text' partitions.

- Mikhail

On Mon, May 14, 2018 at 6:07 AM, Adrian Bielefeldt <adrian.bielefe...@mailbox.tu-dresden.de> wrote:

> Thanks for the pointers. From what I can gather (especially
> wdqs_extract.hql) my next questions are:
> a) what exactly does "webrequest_source = 'misc'" mean and
> b) what source table this was extracted from
>
> On 08/05/18 09:22, Leila Zia wrote:
>
> A couple of pointers as Stas was not involved in the details of the
> extraction.
>
> Adrian: you can dig the history behind the extraction at
> https://phabricator.wikimedia.org/T146064
>
> Please also check the codes at https://gerrit.wikimedia.org/r/#/c/311964/
> for details, specifically wdqs_extract.hql .
>
> Best,
> Leila
>
> On Mon, May 7, 2018, 18:15 Andrew Otto <o...@wikimedia.org> wrote:
>
>> CCing Stas, he might know more.
>>
>> On Sun, May 6, 2018 at 9:58 AM, Adrian Bielefeldt <
>> adrian.bielefe...@mailbox.tu-dresden.de> wrote:
>>
>>> Hello everyone,
>>>
>>> I wanted to ask if anyone can tell me what wmf.wdqs_extract contains. I
>>> know generally that it is the query log of the SPARQL endpoint. However,
>>> I do not know if it is all requests, only uncached requests etc.
>>>
>>> If anyone knows or knows where I can read up on it that would be great.
>>>
>>> Greetings,
>>>
>>> Adrian
>>>
>>> _______________________________________________
>>> Analytics mailing list
>>> Analytics@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/analytics