Re: [Wikitech-l] How can I get data to map our linguistic interconnectedness?

2011-06-16 Thread Platonides
Alec Conroy wrote: I think I can build you something if you give me appropriate values for the above definition. Cheers Excellent-- so striking while the iron is hot-- I see that [[Special:Statistics]] defines active as edited within the last 30 days. I'm open to however many users we

Re: [Wikitech-l] How can I get data to map our linguistic interconnectedness?

2011-06-16 Thread Diederik van Liere
Dear Alec, Maybe the Community Department can help you out with your question. We are doing a number of research sprints this summer to map out different aspects of the Wikipedia communities and this sounds like a great question and we have some researchers available to help write the queries. So

Re: [Wikitech-l] How can I get data to map our linguistic interconnectedness?

2011-06-16 Thread Platonides
I have added a small script at http://www.toolserver.org/~platonides/activeusers/activeusers.php to show active users per project and language. The requirements for appearing there are more than 500 edits (total) and at least one action (usually an edit) in the last month (since May 16, data is
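The two thresholds described in this message can be sketched roughly as follows. This is only an illustration, not the code behind activeusers.php: the row layout (wiki, user, total_edits, last_action), the function names, the sample accounts, and the 30-day window standing in for "the last month" are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def is_active(total_edits, last_action, now, min_edits=500, window_days=30):
    """True if the account has more than min_edits total edits and
    at least one action within the last window_days days."""
    recent = (now - last_action) <= timedelta(days=window_days)
    return total_edits > min_edits and recent


def active_users(rows, now):
    """rows: iterable of (wiki, user, total_edits, last_action) tuples.
    This layout is hypothetical, chosen only for the sketch."""
    return [(wiki, user) for wiki, user, edits, last in rows
            if is_active(edits, last, now)]


now = datetime(2011, 6, 16)
rows = [
    ("nlwiki", "ExampleA", 1200, datetime(2011, 6, 10)),  # passes both tests
    ("enwiki", "ExampleA", 40, datetime(2011, 6, 12)),    # too few edits
    ("dewiki", "ExampleB", 900, datetime(2010, 1, 5)),    # no recent action
]
print(active_users(rows, now))
```

Note that the edit threshold applies to the lifetime total, while the recency test needs only a single action.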

Re: [Wikitech-l] How can I get data to map our linguistic interconnectedness?

2011-06-16 Thread Steven Walling
On Thu, Jun 16, 2011 at 9:44 AM, Platonides platoni...@gmail.com wrote: So my conclusion is that people stay on their home wiki, and it is very unusual for someone to pass 500 edits *both* on their own wiki and on a foreign one. Agreed, I don't think this is a surprising result. If we can filter

Re: [Wikitech-l] How can I get data to map our linguistic interconnectedness?

2011-06-16 Thread Jelle Zijlstra
You might also get better results when you don't limit yourself to recent contributions. For example, I contributed heavily to the Dutch Wikipedia a few years ago, and now contribute heavily to the English. I don't appear in Platonides's list, because I hardly edit nl: at all any more. There may

Re: [Wikitech-l] How can I get data to map our linguistic interconnectedness?

2011-06-16 Thread Platonides
Jelle Zijlstra wrote: You might also get better results when you don't limit yourself to recent contributions. For example, I contributed heavily to the Dutch Wikipedia a few years ago, and now contribute heavily to the English. I don't appear in Platonides's list, because I hardly edit nl: at

Re: [Wikitech-l] How can I get data to map our linguistic interconnectedness?

2011-06-16 Thread M. Williamson
I would say broaden the span and lower the number of contribs required just a little (maybe 300?). 2011/6/16 Platonides platoni...@gmail.com: Jelle Zijlstra wrote: You might also get better results when you don't limit yourself to recent contributions. For example, I contributed heavily to

Re: [Wikitech-l] How can I get data to map our linguistic interconnectedness?

2011-06-16 Thread Thomas Morton
Or look for actives on one wiki.. and then cross check those names with all the other wikis for the same names with over, say, 300 edits (at any time). Tom On 16 June 2011 22:34, M. Williamson node...@gmail.com wrote: I would say broaden the span and lower the number of contribs required just
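A rough sketch of the cross-check suggested here, assuming the per-wiki edit totals are already available as a dictionary keyed by (wiki, user). The function name, the data layout, and the sample accounts are hypothetical, and matching purely on user name assumes the same name belongs to the same person on every wiki, which may not always hold.

```python
def cross_wiki_matches(active_on_home, edit_counts, home_wiki, min_edits=300):
    """For each user active on home_wiki, list the other wikis where the
    same user name has more than min_edits edits (at any time).

    active_on_home: set of user names (hypothetical input)
    edit_counts:    dict mapping (wiki, user) -> total edits (hypothetical input)
    """
    matches = {}
    for (wiki, user), edits in edit_counts.items():
        if wiki != home_wiki and user in active_on_home and edits > min_edits:
            matches.setdefault(user, []).append(wiki)
    for wikis in matches.values():
        wikis.sort()  # stable, readable output
    return matches


edit_counts = {
    ("enwiki", "ExampleA"): 5000,
    ("nlwiki", "ExampleA"): 2500,
    ("frwiki", "ExampleA"): 12,    # below the threshold
    ("nlwiki", "ExampleB"): 800,   # not active on the home wiki
}
print(cross_wiki_matches({"ExampleA"}, edit_counts, "enwiki"))
```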

Re: [Wikitech-l] How can I get data to map our linguistic interconnectedness?

2011-06-16 Thread Platonides
Thomas Morton wrote: Or look for actives on one wiki.. and then cross check those names with all the other wikis for the same names with over, say, 300 edits (at any time). Tom The edit count already looks at the full total; only one action is needed in the last month.

[Wikitech-l] How can I get data to map our linguistic interconnectedness?

2011-06-15 Thread Alec Conroy
The recent elections showed us that language issues and translation are something we have to take very seriously from now on. As a first step towards improving communication, it seems like we should get an idea of which users speak which languages. We could directly ask them to tell us, but upon

Re: [Wikitech-l] How can I get data to map our linguistic interconnectedness?

2011-06-15 Thread Aryeh Gregor
On Wed, Jun 15, 2011 at 8:46 AM, Alec Conroy alecmcon...@gmail.com wrote: We could directly ask them to tell us, but upon reflection, the information is already hidden in our database. A multilingual user is one that actively edits two projects of different languages. That doesn't follow.

Re: [Wikitech-l] How can I get data to map our linguistic interconnectedness?

2011-06-15 Thread Alec Conroy
Hi Aryeh, thanks for the fast reply. Yes, this will definitely underestimate the linguistic capabilities of some users and overestimate those of others; it's a rough measure at best. But is there another way to try to get at how easily two languages should be able to

Re: [Wikitech-l] How can I get data to map our linguistic interconnectedness?

2011-06-15 Thread Platonides
Alec Conroy wrote: Is there an easy way to run this: For each of the 86,000 'active users': Store a list for their edit counts on each project they've edited That's actually a fairly small dataset, and it would get us all the data we want. I've been a developer before, but never
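The per-user tally asked about here could look something like the sketch below. It assumes the raw data is available as (user, language, edits) rows; that layout, the function names, the min_edits cut-off, and the sample accounts are all invented for illustration. The second function counts how many users edit in each pair of languages, which is one possible way to approximate how connected two language communities are.

```python
from collections import Counter, defaultdict
from itertools import combinations


def edit_counts_by_user(rows):
    """Build {user: {language: total edits}} from (user, language, edits) rows.
    The row layout is hypothetical."""
    per_user = defaultdict(dict)
    for user, lang, edits in rows:
        per_user[user][lang] = per_user[user].get(lang, 0) + edits
    return per_user


def language_pairs(per_user, min_edits=1):
    """Count users who have at least min_edits edits in both languages of a pair."""
    pairs = Counter()
    for counts in per_user.values():
        langs = sorted(lang for lang, n in counts.items() if n >= min_edits)
        pairs.update(combinations(langs, 2))
    return pairs


rows = [
    ("ExampleA", "en", 5000),
    ("ExampleA", "nl", 2500),
    ("ExampleB", "en", 10),
    ("ExampleB", "de", 700),
    ("ExampleC", "en", 3),   # below the cut-off used below
]
pairs = language_pairs(edit_counts_by_user(rows), min_edits=5)
print(pairs)
```

As noted in the thread, editing in two languages is only a rough proxy for speaking both, so the pair counts should be read as an estimate.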

Re: [Wikitech-l] How can I get data to map our linguistic interconnectedness?

2011-06-15 Thread Niklas Laxström
On 15 June 2011 17:34, Alec Conroy alecmcon...@gmail.com wrote: The important point of doing this would be: 1) to identify those users with unique language skills and recruit them Recruit them to do what? 2) to identify projects and languages that are 'most disconnected' from the English hub,

Re: [Wikitech-l] How can I get data to map our linguistic interconnectedness?

2011-06-15 Thread Thomas Morton
There is a lot of cross-wiki collaboration that can be done (whilst supporting the idea of wiki independence) and should be encouraged. Foundation work, cross-wiki translations of material, etc. Alec is largely talking about the board elections though, which were Anglo-centric and could have

Re: [Wikitech-l] How can I get data to map our linguistic interconnectedness?

2011-06-15 Thread Alec Conroy
I think I can build you something if you give me appropriate values for the above definition. Cheers Excellent-- so striking while the iron is hot-- I see that [[Special:Statistics]] defines active as edited within the last 30 days. I'm open to however many users we can realistically get

Re: [Wikitech-l] How can I get data to map our linguistic interconnectedness?

2011-06-15 Thread Alec Conroy
On Wed, Jun 15, 2011 at 7:42 AM, Platonides platoni...@gmail.com wrote: Alec Conroy wrote: We could directly ask them to tell us, but upon reflection, the information is already hidden in our database. A multilingual user is one that actively edits two projects of different languages.

Re: [Wikitech-l] How can I get data to map our linguistic interconnectedness?

2011-06-15 Thread Alec Conroy
On Wed, Jun 15, 2011 at 8:08 AM, Niklas Laxström niklas.laxst...@gmail.com wrote: On 15 June 2011 17:34, Alec Conroy alecmcon...@gmail.com wrote: The important point of doing this would be: 1) to identify those users with unique language skills and recruit them Recruit them to do what?

Re: [Wikitech-l] How can I get data to map our linguistic interconnectedness?

2011-06-15 Thread Aryeh Gregor
On Wed, Jun 15, 2011 at 10:34 AM, Alec Conroy alecmcon...@gmail.com wrote: Is there an easy way to run this: For each of the 86,000 'active users': Store a list for their edit counts on each project they've edited That's actually a fairly small dataset, and it would get us all the data