Hi all,

Ismael, thanks so much for reaching out about this. Unfortunately, I think
Dan is right when he says that this granularity of data carries a big
privacy cost. We're working hard to lower the release threshold for daily
unique visitors by country from 1000 to 90, but it seems likely that these
Wikivoyage itineraries have fewer than 90 daily unique visitors in most
countries. Either way, we're hoping to start releasing daily pageviews by
country in January, so you should check whether your pages are in the
dataset once that release is live.

If you want unfettered access to the data (split up by country), you should
pursue a research partnership. Besides that, you can likely use some
existing tools (like the Pageviews API
<https://wikimedia.org/api/rest_v1/#/Pageviews%20data/get_metrics_pageviews_top__project___access___year___month___day_>
or pageviews.wmcloud.org) to get a sense of the data. I'll be sure to reach
back out once the differentially-private data is released so that you might
be able to check on the relevant pages!
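
As a rough illustration of the API route, here is a minimal Python sketch that pulls daily totals for a single page from the public per-article Pageviews endpoint. These are totals across all countries, not a per-country split. The project string, the article title, and the User-Agent contact below are placeholders I made up for the example, so swap in your own pages.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Public per-article endpoint of the Wikimedia Pageviews REST API.
API_ROOT = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def build_per_article_url(project, article, start, end,
                          access="all-access", agent="user",
                          granularity="daily"):
    """Build the request URL; start and end are YYYYMMDD strings."""
    # Titles use underscores instead of spaces and must be URL-encoded.
    title = quote(article.replace(" ", "_"), safe="")
    return (f"{API_ROOT}/{project}/{access}/{agent}/"
            f"{title}/{granularity}/{start}/{end}")


def fetch_daily_views(project, article, start, end):
    """Return {timestamp: views} for one page (all countries combined)."""
    url = build_per_article_url(project, article, start, end)
    # Placeholder User-Agent: replace with your own tool name and contact.
    request = Request(url, headers={
        "User-Agent": "pageviews-sketch/0.1 (contact: you@example.org)"})
    with urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return {item["timestamp"]: item["views"] for item in payload["items"]}


if __name__ == "__main__":
    # "Example itinerary" is a hypothetical title, not one of the real pages.
    print(build_per_article_url("en.wikivoyage.org", "Example itinerary",
                                "20221201", "20221221"))
```

Calling fetch_daily_views with the same arguments would make the actual request. Comparing the EN and ES versions of each itinerary this way gives a per-language signal, which is only a loose proxy for country of origin.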

Thanks again for reaching out :)

Hal

On Wed, Dec 21, 2022 at 12:07 PM Dan Andreescu <[email protected]>
wrote:

> The only way is to help with the ongoing (and complex) differential
>>> privacy work <https://phabricator.wikimedia.org/T307245>
>>>
>>
>> I have systems background but probably this could be outside my skills.
>> How could I help?
>>
>
> Hm, it's some tricky programming work, I'm not 100% sure of the latest
> status or opportunities to get involved, but I'm cc-ing Hal Triedman to see
> if he has thoughts. (Hal see archive
> <https://lists.wikimedia.org/hyperkitty/list/[email protected]/thread/IKL3WOQ2UY7IMMCUTV7EYGT6PFVFLVCA/>
> )
>
> [1] https://meta.wikimedia.org/wiki/Research:Page_view#Resulting_format
>>>>
>>>
>>>  If you are indeed interested in pageviews, the definition you linked to
>>> talks about the data internally available.
>>>
>>
>> Oh!
>>
>>
>>>   Can I ask you to elaborate a bit more on why you need per-country data?
>>>
>>
>> Well, first, I've been looking for the most useful tools and sources
>> available (and found many of them very interesting[1]). Second, in this
>> particular case we are running a pilot project in which some academic
>> project results have been published as Wikivoyage itineraries (3 in EN and
>> 3 in ES). These are the articles we are interested in tracking now.
>>
>> As for the rationale, one of the biggest drivers nowadays is the well-known
>> link between heritage, tourism and sustainability (for example, the
>> Sustainable Development Goals), so there is a trend toward analyzing this
>> context more closely in order to study and plan. Tourist destinations
>> usually have very well-defined countries of origin for their visitors. The
>> better you know the origin, the better you can plan. There should also be
>> a positive impact for Wikimedia: new incentives for institutions to create
>> or translate articles into the relevant languages, always restricted to
>> the heritage domain. Here in Spain tourism is one of the main economic
>> sectors, and anything providing intelligence would help with better
>> planning and conservation.
>>
>> Also, we have identified a new potential area of activity: intelligence
>> analysis of trends in heritage (public interest, changes in institutional
>> focus, relevant new practices, etc.), not only for Spanish heritage but
>> worldwide. This is also a scientific institution, and it would find it very
>> useful to collect the most precise traces available (with absolute respect
>> for users' privacy) to look for signals they could use to
>> refocus/prioritize their institutional goals.
>>
>> So, this is it.
>>
>> [1]  https://toolhub.wikimedia.org/lists/277
>>
>
> This is indeed a very interesting use case and a chance for this data to
> be very helpful.  Unfortunately to my naive eyes, this granularity of data
> also carries a big privacy cost.  The only way to get to it would be a
> research collaboration, but there are *lots* of requests for those and not
> enough researchers to help facilitate.  I'm honestly not sure there's an
> easy way around this... but I'll keep thinking about it and I know it'll be
> useful for Hal to see this kind of request and add it to his back burner.
> Thanks for detailing!
>
_______________________________________________
Analytics mailing list -- [email protected]
To unsubscribe send an email to [email protected]
