Re: [Wiki-research-l] Sampling new editors in English Wikipedia

2019-03-12 Thread fn
Haifeng, while some suggest the dumps or noticeboards, my immediate thought was a database query, e.g., through Quarry. It just so happens that Jonathan T. Morgan has created a query there: https://quarry.wmflabs.org/query/310 SELECT user_id, user_name, user_registration, user_editcount

Re: [Wiki-research-l] Sampling new editors in English Wikipedia

2019-03-12 Thread Leila Zia
Let's do it.

Re: [Wiki-research-l] Sampling new editors in English Wikipedia

2019-03-12 Thread Pine W
Leila, can we discuss this off list? Thanks, Pine ( https://meta.wikimedia.org/wiki/User:Pine )

Re: [Wiki-research-l] Sampling new editors in English Wikipedia

2019-03-12 Thread Stuart A. Yeates
There are thousands and thousands of editors with multiple accounts. Those who have bothered to add a category are listed at https://en.wikipedia.org/wiki/Category:Wikipedians_with_alternative_accounts Many editors who engage in outreach are advised to create new accounts for themselves

Re: [Wiki-research-l] Sampling new editors in English Wikipedia

2019-03-12 Thread Isaac Johnson
Yes, thanks for the clarification, Stuart. I don't know of any statistics to suggest how widespread this is, but it might be worth checking, especially if you are focusing on editors with higher edit counts (who I suspect are more likely to have multiple accounts for licit reasons). On Tue, Mar

Re: [Wiki-research-l] Sampling new editors in English Wikipedia

2019-03-12 Thread Stuart A. Yeates
Note that this code deals with accounts, not editors, which is what Haifeng asked for. There are many reasons, both licit and illicit, for editors to have more than one account. I know I have more than ten for policy-compliant reasons. cheers stuart -- ...let us be heard from red core to black

Re: [Wiki-research-l] Sampling new editors in English Wikipedia

2019-03-12 Thread Leila Zia
On Tue, Mar 12, 2019 at 1:56 PM Pine W wrote:
> Hi Leila, I believe that I asked for more information regarding Haifeng's work.
You stated "However, if you're planning to send surveys or messages to them, sending them barnstars, or otherwise manipulating their on-wiki experience, that would

Re: [Wiki-research-l] Sampling new editors in English Wikipedia

2019-03-12 Thread Isaac Johnson
Hey Haifeng, If you decide to process the dumps, you should be able to easily repurpose some quick code that I wrote for a similar project: https://github.com/geohci/miscellaneous-wikimedia/tree/master/editor-turnover Notably, I'd suggest using the stub history dumps as they are much smaller
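A minimal sketch of the dump-based approach, for illustration only: it is not the code in the repository linked above. It assumes the stub history dump follows the usual MediaWiki XML export layout (`revision` elements containing `timestamp` and `contributor/username`); the function name `first_edit_per_account` and the sample XML are hypothetical. It streams revisions and records each account's earliest edit timestamp. As noted elsewhere in this thread, the result describes accounts, not editors.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}revision'."""
    return tag.rsplit("}", 1)[-1]


def first_edit_per_account(xml_stream):
    """Map each registered account name to its earliest edit timestamp.

    Edits by logged-out users (recorded with <ip> instead of <username>)
    are skipped. Timestamps are ISO 8601 strings in UTC, so plain string
    comparison orders them chronologically.
    """
    first = {}
    for _, elem in ET.iterparse(xml_stream, events=("end",)):
        if local_name(elem.tag) != "revision":
            continue
        timestamp = user = None
        for child in elem:
            name = local_name(child.tag)
            if name == "timestamp":
                timestamp = child.text
            elif name == "contributor":
                for sub in child:
                    if local_name(sub.tag) == "username":
                        user = sub.text
        if user and timestamp and (user not in first or timestamp < first[user]):
            first[user] = timestamp
        elem.clear()  # drop the revision's children to limit memory use
    return first


# Hypothetical miniature dump, for demonstration only.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Example</title>
    <revision>
      <id>1</id>
      <timestamp>2019-02-03T10:00:00Z</timestamp>
      <contributor><username>Alice</username><id>11</id></contributor>
    </revision>
    <revision>
      <id>2</id>
      <timestamp>2019-01-15T08:30:00Z</timestamp>
      <contributor><username>Alice</username><id>11</id></contributor>
    </revision>
    <revision>
      <id>3</id>
      <timestamp>2019-03-01T12:00:00Z</timestamp>
      <contributor><ip>192.0.2.1</ip></contributor>
    </revision>
  </page>
</mediawiki>"""

if __name__ == "__main__":
    print(first_edit_per_account(io.BytesIO(SAMPLE.encode("utf-8"))))
```

For a real dump, the same function could be given a file object opened with `gzip.open(path, "rb")` or `bz2.open(path, "rb")`, depending on how the file is compressed.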

Re: [Wiki-research-l] Sampling new editors in English Wikipedia

2019-03-12 Thread Pine W
Hi Haifeng, thanks for the information. I think that your idea of looking in the dumps makes sense. Am I understanding correctly that you would like advice regarding how to do that in the most efficient way? Hi Leila, I believe that I asked for more information regarding Haifeng's work. There has

Re: [Wiki-research-l] Sampling new editors in English Wikipedia

2019-03-12 Thread Haifeng Zhang
Pine and Stuart, I meant extracting a random sample of new editors (month by month) from the Wikipedia edit history. It is not about a survey of new editors, but thanks all the same for your suggestions. Thanks, Haifeng Zhang Postdoctoral Research Fellow Human-Computer Interaction Institute Carnegie

Re: [Wiki-research-l] Sampling new editors in English Wikipedia

2019-03-12 Thread Stuart A. Yeates
There are a number of new-editor-heavy noticeboards. I would suggest posting an invitation to your survey (or whatever) there. If you ask for editors' usernames, you can filter out those who don't meet your definition of 'new'. I'm thinking of places like:

Re: [Wiki-research-l] Sampling new editors in English Wikipedia

2019-03-12 Thread Leila Zia
Hi Pine, Haifeng has a simple question about how to sample editors other than via dumps. It would be great if someone who knows the answer could help them move forward. If you are interested in learning more about their research, instead of answering their question, my recommendation would be to

Re: [Wiki-research-l] Sampling new editors in English Wikipedia

2019-03-12 Thread Pine W
Hi, can you expand on what you mean by "sample"? If you're referring to analyzing users' edit histories then that should be fine. However, if you're planning to send surveys or messages to them, sending them barnstars, or otherwise manipulating their on-wiki experience, that would be problematic.

[Wiki-research-l] Sampling new editors in English Wikipedia

2019-03-12 Thread Haifeng Zhang
Hi folks, My work needs to randomly sample new editors in each month, e.g., 100 editors per month. Do any of you have good suggestions for how to do this efficiently? I could think of using the dump files, but I wonder whether there are other options. Thanks, Haifeng Zhang
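
To make the task concrete, here is a minimal sketch of the month-by-month sampling step, assuming a list of (account name, first-edit or registration timestamp) pairs has already been obtained from a dump or a database query. The function name `sample_new_editors_by_month`, its parameters, and the example records are hypothetical, and the timestamp format is assumed to be ISO 8601 in UTC.

```python
import random
from collections import defaultdict
from datetime import datetime


def sample_new_editors_by_month(first_edits, per_month=100, seed=0):
    """Draw up to `per_month` accounts uniformly at random for each month.

    first_edits: iterable of (user_name, timestamp) pairs, where the
    timestamp looks like '2019-03-12T15:04:00Z'.
    Returns a dict mapping 'YYYY-MM' to a list of sampled user names.
    """
    by_month = defaultdict(list)
    for user, timestamp in first_edits:
        parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
        by_month[parsed.strftime("%Y-%m")].append(user)

    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    samples = {}
    for month in sorted(by_month):
        users = sorted(by_month[month])  # stable order before sampling
        k = min(per_month, len(users))   # months with few newcomers keep all
        samples[month] = rng.sample(users, k)
    return samples


if __name__ == "__main__":
    # Made-up records, for demonstration only.
    records = [
        ("A", "2019-01-02T00:00:00Z"),
        ("B", "2019-01-15T12:30:00Z"),
        ("C", "2019-01-28T23:59:59Z"),
        ("D", "2019-02-01T08:00:00Z"),
    ]
    print(sample_new_editors_by_month(records, per_month=2))
```

Sampling without replacement inside each month keeps every new account in that month equally likely to be chosen; a month with fewer accounts than `per_month` simply returns all of them.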