Researchers and EE specialists, your thoughts would be appreciated on this. I
started the thread only on Wikimedia-l to keep the discussion consolidated in
one place.
http://lists.wikimedia.org/pipermail/wikimedia-l/2014-May/071811.html
Thanks,
Pine
hi Pine,
this is an excellent point, and I believe there are definitely too few
systematic studies on the topic, as well as targeted programs.
blatant promotion mode on
In my book, Common Knowledge? An Ethnography of Wikipedia, which
left the press last week, I have a whole chapter (Between
If your bot is only running automated reports in its own userspace then it
doesn't need a bot flag. But it probably won't be a very active bot, so it
may not be a problem for your stats.
On the English language wikipedia you are going to be fairly close if you
exclude all accounts which currently have
That would cover most of them, but it runs into the problem that you're only
including the unauthorised bots written poorly enough that we've caught the
operator ;). It seems like this would be a useful topic for some piece of
method-comparing research, if anyone is looking for paper ideas.
On 19 May
Brian Keegan, 18/05/2014 18:10:
Is there a way to retrieve a canonical list of bots on enwiki or elsewhere?
A Bots.csv list exists. https://meta.wikimedia.org/wiki/Wikistat_csv
In general: please edit
https://meta.wikimedia.org/wiki/Research:Identifying_bot_accounts
Nemo
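
For anyone who wants a quick starting point alongside Bots.csv, here is a
minimal sketch of pulling the currently bot-flagged accounts from the
MediaWiki API (list=allusers with augroup=bot). The function names, the
User-Agent string, and the choice of enwiki as the endpoint are assumptions
made for illustration. As noted elsewhere in this thread, this only sees
accounts that hold the flag right now, so it misses unflagged and formerly
flagged bots.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: enwiki endpoint; swap in another wiki's api.php as needed.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_bot_query(cont=None, limit=500):
    """Build the URL for one page of accounts in the 'bot' user group."""
    params = {
        "action": "query",
        "list": "allusers",
        "augroup": "bot",
        "aulimit": str(limit),
        "format": "json",
    }
    if cont:
        # Continuation parameters returned by the previous response.
        params.update(cont)
    return API_URL + "?" + urlencode(params)


def parse_bot_names(response):
    """Extract usernames from a decoded allusers response."""
    return [u["name"] for u in response.get("query", {}).get("allusers", [])]


def fetch_flagged_bots():
    """Page through the API until no 'continue' block is returned."""
    names, cont = [], None
    while True:
        req = Request(
            build_bot_query(cont),
            # Hypothetical User-Agent; use one that identifies your project.
            headers={"User-Agent": "bot-list-sketch/0.1 (research example)"},
        )
        with urlopen(req) as resp:
            data = json.load(resp)
        names.extend(parse_bot_names(data))
        cont = data.get("continue")
        if not cont:
            return names
```

Calling fetch_flagged_bots() needs network access. The two helper functions
can be checked offline against a saved response.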
Thanks for all the references and excellent advice so far!
I've looked into the Hale Anti-Bot Method™, but because I've sampled my
corpus on articles (based on category co-membership), the resulting groupby
users gives these semi-automated users more normal distributions since
their other
the Hale Anti-Bot Method™
That's a good one. =)
I'm a big fan of Scott's method
I second that. Again, great paper, Scott!
On Mon, May 19, 2014 at 5:31 PM, Aaron Halfaker aaron.halfa...@gmail.com wrote:
Another thought I had was that because many semi-automated tools such as
Twinkle and
Thanks all for the comments on my paper, and even more thanks to everyone
sharing these super helpful ideas on filtering bots: this is why I love the
Wikipedia research committee.
I think Oliver is definitely right that
this would be a useful topic for some piece of method-comparing research,