Daniel,

Signing an NDA is not enough to get access to the data. You also need to
be part of a formal research collaboration with our research team. They
already have a number of those and are not likely to accept any more soon,
but you can contact them about it:
https://www.mediawiki.org/wiki/Wikimedia_Research/Formal_collaborations

Thanks,

Nuria



On Mon, Jul 24, 2017 at 6:37 AM, Daniel Oberski <[email protected]>
wrote:

> Dear list,
>
> I'm posting a recent conversation with Dan below, as well as a few
> follow-up questions.
>
> Dan was kind enough to point out this list. I apologize that the post is
> "backward" (in email-thread format) due to my ignorance; I will use this
> list from now on.
>
> Thanks, Daniel
>
>
> ----
>
> Hi Dan
>
>
> Thanks for getting back to me so quickly!
>
> >Thanks for writing.  In general these questions are best asked on our
> >public list, so other people can see and benefit from any answers:
> >https://lists.wikimedia.org/mailman/listinfo/analytics
>
> Thanks, I've joined this list and will ask subsequent questions there.
>
> >* pairs of pages: we have two datasets that are mentioned in this task
> >https://phabricator.wikimedia.org/T158972 which should be very interesting
> >for this purpose.  They aren't being updated right now, and the task is to
> >do just that.  We'll probably get to that within the next 3 months, but a
> >bunch of us are on paternity leave this summer, so things are a little
> >slower than normal
>
> This seems close to what I need. From the descriptions I gather the
> linkage is by session. Is there also a linkage by IP (with the IPs
> removed, of course)?
>
> >* country data for pageviews: for privacy reasons we only allow access to
> >this with an NDA.  We have good data on it, but you need to sign this NDA
> >and use our cluster to access it, being careful about what you report
> >about it to the world at large.  Here's information on that:
> >https://wikitech.wikimedia.org/wiki/Volunteer_NDA
>
> I've read this and am happy to sign an NDA. I understand it is best to be
> as specific as possible about the reasoning, intentions with the data, and
> permissions required. For me to figure this out it would be useful to know
> the relevant parts of the database schema, and perhaps a hint as to which
> data might be most interesting there. Would you be able to point me
> towards that?
>
> >Hope that helps, and feel free to write back to the public list in the
> >future.
>
> Definitely, very helpful and thank you!
>
> Best, Daniel
>
>
> On Wed, Jul 19, 2017 at 9:51 AM, Oberski, D.L. (Daniel) <[email protected]>
> wrote:
> Dear Dan,
>
>
> My name is Daniel Oberski, I'm an associate professor of data science
> methodology in the department of statistics at Utrecht University in the
> Netherlands.
>
> I've been using your incredibly useful pageviews API to study correlations
> between the amount of interest people show in a topic (pageviews) and
> other data such as political party preference over time. That has yielded
> some interesting results (which I have yet to write up).
>
> However, to do a better study it would be very helpful to have slightly
> more information than is in the API. Specifically, it would be very useful
> to be able to query, for each _pair_ of pages, how many people (or IPs)
> viewed _both_ of those pages. That way I can find out which pages are
> really indicative of interest in a specific common topic, rather than just
> correlated by accident. In addition, I've found it hard to figure out
> pageviews for specific pages by country rather than language.
>
> My question is, would you happen to know if there is any way to obtain
> this information? (It does not necessarily have to be through the API.) Or
> do you know if there are people to whom I might talk about this?
>
> Thanks for reading (to) the end and best regards,
>
> Daniel
>
>
>
> _______________________________________________
> Analytics mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
