Hi Adrian,

webrequest_source is a column by which the data is partitioned (in addition
to the year/month/day/hour columns).

To my knowledge, WDQS-related requests go into the 'misc' partitions of the
webrequest
<https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Webrequest>
table, from which this subset was extracted. General traffic-related
requests, such as Wikipedia pageviews, are in the webrequest_source = 'text'
partitions.
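
As a rough illustration of what filtering on those partition columns looks
like, here is a minimal sketch that builds the WHERE clause for a Hive query.
The helper name partition_predicate is hypothetical, and the use of
wmf.webrequest as the table name is an assumption based on the page linked
above; the actual extraction logic is in wdqs_extract.hql and may differ.

```python
def partition_predicate(source, year, month, day, hour=None):
    """Build a WHERE clause that touches only partition columns.

    Hypothetical helper for illustration; the column names follow the
    partitioning described above (webrequest_source, year, month, day, hour).
    """
    parts = [
        "webrequest_source = '{}'".format(source),
        "year = {:d}".format(year),
        "month = {:d}".format(month),
        "day = {:d}".format(day),
    ]
    # The hour partition is optional: omit it to scan a whole day.
    if hour is not None:
        parts.append("hour = {:d}".format(hour))
    return " AND ".join(parts)


# Assumed table name; restricting to 'misc' skips the much larger 'text' partitions.
query = "SELECT COUNT(*) FROM wmf.webrequest WHERE " + partition_predicate(
    "misc", 2018, 5, 14
)
print(query)
```

Restricting a query to partition columns like this lets Hive read only the
matching partitions instead of scanning the whole table.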

- Mikhail

On Mon, May 14, 2018 at 6:07 AM, Adrian Bielefeldt <
adrian.bielefe...@mailbox.tu-dresden.de> wrote:

> Thanks for the pointers. From what I can gather (especially
> wdqs_extract.hql) my next questions are:
> a) what exactly does "webrequest_source = 'misc'" mean and
> b) what source table this was extracted from
>
>
> On 08/05/18 09:22, Leila Zia wrote:
>
> A couple of pointers as Stas was not involved in the details of the
> extraction.
>
> Adrian: you can dig into the history behind the extraction at
> https://phabricator.wikimedia.org/T146064
>
> Please also check the code at https://gerrit.wikimedia.org/r/#/c/311964/
> for details, specifically wdqs_extract.hql.
>
> Best,
> Leila
>
>
>
> On Mon, May 7, 2018, 18:15 Andrew Otto <o...@wikimedia.org> wrote:
>
>> CCing Stas, he might know more.
>>
>> On Sun, May 6, 2018 at 9:58 AM, Adrian Bielefeldt <
>> adrian.bielefe...@mailbox.tu-dresden.de> wrote:
>>
>>> Hello everyone,
>>>
>>> I wanted to ask if anyone can tell me what wmf.wdqs_extract contains. I
>>> know generally that it is the query log of the SPARQL endpoint. However,
>>> I do not know if it is all requests, only uncached requests etc.
>>>
>>> If anyone knows or knows where I can read up on it that would be great.
>>>
>>> Greetings,
>>>
>>> Adrian
>>>
>>>
>>> _______________________________________________
>>> Analytics mailing list
>>> Analytics@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>