Thanks. Another question: For some countries, the result is "-", for example Germany:
Germany - en.wikipedia 1275634 Any idea why? (I modified the query a bit and added the "project" column. And yes, the fact that en.wikipedia is at the top in Germany is also quite odd.) -- Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי http://aharoni.wordpress.com “We're living in pieces, I want to live in peace.” – T. Moore 2018-07-09 15:17 GMT+03:00 Francisco Dans <[email protected]>: > I think as long as you put in a filter so that the minimum pageviews is > maybe 1000, you should be fine privacy wise. I can't speak too much to your > second question. > > On Mon, Jul 9, 2018 at 1:59 PM, Amir E. Aharoni < > [email protected]> wrote: > >> Thank you so much! In many countries it's >> >> A couple of questions: >> 1. Are any of the results of this query private? Or can I talk about them >> to people? >> 2. Is anything like this already published anywhere? If it isn't, it may >> be nice to publish such a thing, similarly to Google Zeitgeist. >> >> >> -- >> Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי >> http://aharoni.wordpress.com >> “We're living in pieces, >> I want to live in peace.” – T. Moore >> >> 2018-07-09 13:19 GMT+03:00 Francisco Dans <[email protected]>: >> >>> Hi Amir, >>> >>> As Tilman has suggested, your best bet is to query the pageview_hourly >>> table. I was going to be lazy and give you a query to just find out the >>> most viewed article for a given country, but then I made a few experiments >>> and this is the query I came up with to generate a list of countries and >>> their respective most viewed articles and view counts. It takes a few >>> minutes to run for a single day, so I'm sure someone here could suggest a >>> better approach. >>> >>> WITH articles_countries AS ( >>>> SELECT country, page_title, sum(view_count) AS views >>>> FROM pageview_hourly >>>> WHERE year=2018 AND month=3 AND day=15 >>>> GROUP BY country, page_title >>>> ) >>>> SELECT s.country as country, s.page_title as page_title, s.views as >>>> views >>>> FROM ( >>>> SELECT max(named_struct('views', views, 'country', country, >>>> 'page_title', page_title)) as s from articles_countries group by country >>>> ) t; >>> >>> >>> Cheers / see you in ZA, >>> Fran >>> >>> >>> On Mon, Jul 9, 2018 at 10:18 AM, Amir E. Aharoni < >>> [email protected]> wrote: >>> >>>> Hi, >>>> >>>> Is there a way to find what are the most popular articles per country? >>>> >>>> Finding the most popular articles per language is easy with the >>>> Pageviews tool, but languages and countries are of course not the same. >>>> >>>> One thing I tried is going to Turnilo, webrequest_sampled_128, and >>>> filtering by country. But here it gets troublesome: >>>> * Splitting can be done by Uri host, which is *more or less* the >>>> project, or by Uri path, which is *more or less* the article (but see >>>> below), and I couldn't find a convenient way to combine them. >>>> * Mobile (.m.) and desktop hosts are separate. It may actually >>>> sometimes be useful to see differences (or lack thereof) between desktop >>>> and mobile, but combining them is often useful, too. This can probably be >>>> done with regular expressions, but this brings us to the biggest problem: >>>> * Filtering by Uri path would be useful if it didn't have so many paths >>>> for images, beacons, etc. Filtering using the regular expression >>>> "\/wiki\/.+" may be the right thing functionally, but in practice it's very >>>> slow or doesn't work at all. >>>> * I don't know what exactly is logged in webrequest_sampled_128, but >>>> the name hints that it doesn't include everything. A sample may be OK for >>>> countries with a lot of traffic like U.S. or Spain, but for countries with >>>> smaller traffic this may start being a problem. >>>> >>>> Any better ideas? >>>> >>>> Thanks! >>>> >>>> -- >>>> Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי >>>> http://aharoni.wordpress.com >>>> “We're living in pieces, >>>> I want to live in peace.” – T. Moore >>>> >>>> _______________________________________________ >>>> Analytics mailing list >>>> [email protected] >>>> https://lists.wikimedia.org/mailman/listinfo/analytics >>>> >>>> >>> >>> >>> -- >>> *Francisco Dans* >>> Software Engineer, Analytics Team >>> Wikimedia Foundation >>> >>> _______________________________________________ >>> Analytics mailing list >>> [email protected] >>> https://lists.wikimedia.org/mailman/listinfo/analytics >>> >>> >> >> _______________________________________________ >> Analytics mailing list >> [email protected] >> https://lists.wikimedia.org/mailman/listinfo/analytics >> >> > > > -- > *Francisco Dans* > Software Engineer, Analytics Team > Wikimedia Foundation > > _______________________________________________ > Analytics mailing list > [email protected] > https://lists.wikimedia.org/mailman/listinfo/analytics > >
_______________________________________________ Analytics mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/analytics
