Re: [Wiki-research-l] Geo-aggregation of Wikipedia page views: Maximizing geographic granularity while preserving privacy – a proposal

2015-01-13 Thread Oliver Keyes
I'm confused; John, could you point to the element of the collected data that isn't collected already by default in any Nginx or Apache setup? I agree that there might be a lack of user expectation, but 'silently capturing behavioral data' seems somewhat hyperbolic to describe what's actually

Re: [Wiki-research-l] preelminary results from the Wikipedia Gender Inequality Index project - comments welcome

2015-01-13 Thread Maximilian Klein
Thank you all for the feedback. I have taken away quite a few good ideas for further investigation. To summarize: Gerard - look at the ratios of those bios of a language which exist only in that language. Han Teng - male gaze hypothesis, create a by-profession cross-tabular analysis. Jane -

Re: [Wiki-research-l] How many links did TWL account recipients add to Wikipedia with their access?

2015-01-13 Thread Gerard Meijssen
Hoi, These same people may have added content to Wikidata ... Obviously it has not been considered. However, you can query for these people there. You can also query how many external references were added by bot. It may provide the groundwork going to Wikipedias and find who did it .. references

[Wiki-research-l] Fwd: [Wikimedia-l] Introducing WikiProject X

2015-01-13 Thread Pine W
Forwarding. Pine -- Forwarded message -- From: James Hare jamesmh...@gmail.com Date: Jan 13, 2015 2:27 PM Subject: [Wikimedia-l] Introducing WikiProject X To: l...@lists.wikimedia.org, gender...@lists.wikimedia.org, wikimedi...@lists.wikimedia.org,

Re: [Wiki-research-l] [Analytics] Geo-aggregation of Wikipedia page views: Maximizing geographic granularity while preserving privacy – a proposal

2015-01-13 Thread John Mark Vandenberg
On Wed, Jan 14, 2015 at 9:22 AM, Andrew Gray andrew.g...@dunelm.org.uk wrote: Fair enough - I don't use it, and I think I'd got entirely the wrong end of the stick on what it's for! If it's intended to stop tracking by third-party sites then it certainly seems to be of little relevance here.

Re: [Wiki-research-l] [Analytics] Geo-aggregation of Wikipedia page views: Maximizing geographic granularity while preserving privacy – a proposal

2015-01-13 Thread Andrew Gray
Hi Dario, Reid, This seems sensible enough and proposal #3 is clearly the better approach. An explicit opt-in/opt-out mechanism would not be worth the effort to build and would become yet another ignored preferences setting after a few weeks... A couple of thoughts: * I understand the reasoning

[Wiki-research-l] How many links did TWL account recipients add to Wikipedia with their access?

2015-01-13 Thread Jake Orlowitz
Hi all, There are 2000 editors who have received access to 20 different online databases. We know the usernames of these editors and the URL prefixes of the websites they were given access to. We need to know: - from July 18th 2014 to January 11th 2015 - on English Wikipedia - for the cohort of

Re: [Wiki-research-l] [Analytics] Geo-aggregation of Wikipedia page views: Maximizing geographic granularity while preserving privacy – a proposal

2015-01-13 Thread Aaron Halfaker
Andrew, I think it is reasonable to assume that the Do not track header isn't referring to this. From http://donottrack.us/ with emphasis added. Do Not Track is a technology and policy proposal that enables users to opt out of *tracking by websites they do not visit*, [...] Do not track is

Re: [Wiki-research-l] preelminary results from the Wikipedia Gender Inequality Index project - comments welcome

2015-01-13 Thread Asaf Bartov
Re politicians, a trivial observation that is nonetheless better explicit than implicit: While the World Forum stats are presumably snapshot stats of some current(ish) point in time, Wikipedias cover politicians past and present. This would, of course, skew results as heavily as the patriarchal

Re: [Wiki-research-l] preelminary results from the Wikipedia Gender Inequality Index project - comments welcome

2015-01-13 Thread Stuart A. Yeates
I have a question about the P21. Has any of the GND author sex information leaked into P21? Because that's known-bad data. It's bad because the GND in all its wisdom decided to assign sex to authors based on the apparent gender of the name published under, even for periods when many women were

Re: [Wiki-research-l] preelminary results from the Wikipedia Gender Inequality Index project - comments welcome

2015-01-13 Thread Jane Darnell
Interesting, Magnus, thanks! After working on lots of the female names in various databases, I can also say that it is pretty difficult to scrape enough information together to produce a Wikipedia-worthy stub on many of the women mentioned in those databases. As you point out, we don't have

Re: [Wiki-research-l] [Analytics] Geo-aggregation of Wikipedia page views: Maximizing geographic granularity while preserving privacy – a proposal

2015-01-13 Thread Andrew Gray
Fair enough - I don't use it, and I think I'd got entirely the wrong end of the stick on what it's for! If it's intended to stop tracking by third-party sites then it certainly seems to be of little relevance here. (It might be worth clarifying this in the proposal, in case a future